Commit Graph

2288 Commits

Author SHA1 Message Date
Jason Ekstrand 63101177f3 nir: Add another index to load_uniform to specify the range read
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-14 15:59:33 -07:00
Jason Ekstrand 769b5614f8 nir/opt_algebraic: Remove the encoding line
This is an unneeded diff between the vulkan and master branches
2016-04-14 10:35:40 -07:00
Jason Ekstrand c34be07230 spirv: Move to compiler/
While it does rely on NIR, it's not really part of the NIR core.  At the
moment, it still builds as part of libnir but that can be changed later if
desired.
2016-04-14 10:28:47 -07:00
Jason Ekstrand bfa3a38280 nir: Remove some pointless delta between vulkan and master 2016-04-14 10:24:33 -07:00
Jose Fonseca feb6732e80 nir: Use _snprintf on Windows.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-14 16:38:37 +01:00
Jose Fonseca ba0c0e3940 nir: Avoid structure initalization expressions.
Not supported by MSVC, and completely unnecessary -- inline functions
work just as well.

NIR_SRC_INIT/NIR_DEST_INIT could and probably should be replaced by the
inline functions.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-14 16:38:37 +01:00
Jose Fonseca 8f96524f13 nir: Remove unistd.h include.
It doesn't seem needed, and is not available on MSVC.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-14 16:38:31 +01:00
Jose Fonseca f8e2f1fba5 nir: Avoid empty {} struct initializer.
Not supported by MSVC and consistent through NIR.

[Emil Velikov: rebase]
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-04-14 16:33:52 +01:00
Jason Ekstrand 12f88ba32a Merge remote-tracking branch 'public/master' into vulkan 2016-04-13 20:25:39 -07:00
Jason Ekstrand b63a98b121 nir/dead_variables: Configurably work with any variable mode
The old version of the pass only worked on globals and locals and always
left inputs, outputs, uniforms, etc. alone.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-13 15:45:10 -07:00
Jason Ekstrand 4455bfa9a0 nir/algebraic: Add lowering for ldexp
The algorithm used is different from both the naive suggestion from the
GLSL spec and the one used in GLSL IR today.  Unfortunately, the GLSL IR
implementation that we have today doesn't handle denormals (for those that
care) or the case where the float source is +-inf.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-13 15:44:19 -07:00
Jason Ekstrand 745b3d295e nir: Add more modulus opcodes
These are all needed for SPIR-V

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-13 15:44:00 -07:00
Jason Ekstrand dd616cab01 nir/lower_io: Allow for a full bitmask of modes
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-13 12:44:10 -07:00
Jason Ekstrand 2caaf0ac5e nir/lower_indirect: nir_variable_mode is now a bitfield
Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-13 12:44:07 -07:00
Jason Ekstrand ffa0e12e15 nir: Convert nir_variable_mode to a bitfield
There are several passes where we need to specify some set of variable
modes that the pass needs top operate on.  This lets us easily do that.

Acked-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-04-13 12:40:12 -07:00
Jason Ekstrand 8f3b516f2e nir/clone: Copy bit size when cloning registers
Reported-by: Mark Janes <mark.a.janes@intel.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2016-04-12 16:41:58 -07:00
Ian Romanick 193a5cee6a nir: Fix typo in comment
Trivial.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-11 19:24:19 -07:00
Markus Wick 18c8b927e2 nir: Merge redudant integer clamping.
Dolphin uses them a lot. Range tracking would be better in the long term,
but this two lines works fine for now.

Signed-off-by: Markus Wick <markus@selfnet.de>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 18:48:50 -07:00
Kenneth Graunke 808d26c771 nir: Silence unused "options" warning in algebraic passes.
Some passes may not refer to options->..., at which point the compiler
will warn about an unused variable.  Just cast to void unconditionally
to shut it up.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-11 18:44:08 -07:00
Kenneth Graunke 5886cd79a0 nir: Do basic constant reassociation.
Many shaders contain expression trees of the form:

    const_1 * (value * const_2)

Reorganizing these to

    (const_1 * const_2) * value

will allow constant folding to combine the constants.  Sometimes, these
constants are 2 and 0.5, so we can remove a multiply altogether.  Other
times, it can create more immediate constants, which can actually hurt.

Finding a good balance here is tricky.  While much more could be done,
this simple patch seems to have a lot of positive benefit while having
a low downside.

shader-db results on Broadwell:

total instructions in shared programs: 8963768 -> 8961369 (-0.03%)
instructions in affected programs: 438318 -> 435919 (-0.55%)
helped: 1502
HURT: 245

total cycles in shared programs: 71527354 -> 71421516 (-0.15%)
cycles in affected programs: 11541788 -> 11435950 (-0.92%)
helped: 3445
HURT: 1224

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-11 18:43:55 -07:00
Jason Ekstrand a9e6213edd nir/lower_system_values: Add support for several computed values
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-11 13:53:03 -07:00
Emil Velikov 3d67780b80 compiler: remove {glsl,nir}/Makefile.sources
No longer used as of last commit.

v2: Rebase.

Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
2016-04-11 19:08:23 +01:00
Jason Ekstrand 3aa1a5ee88 nir/lower_system_values: Simplify the computation of LocalInvocationIndex 2016-04-10 23:43:38 -07:00
Connor Abbott a89c474157 nir: add a pass for lowering (un)pack_double_2x32
v2: Undo unintended change to the signature of
    nir_normalize_cubemap_coords (Iago).

v3: Move to compiler/nir (Iago)

v4: Remove Authors from copyright header (Michael Schellenberger)

v5 (Sam):
- Use nir_channel() and nir_ssa_for_alu_src() helpers (Jason)
- Inline lower_double_pack_instr() code into lower_double_pack_block()
  (Jason).
- Initialize nir_builder at lower_double_pack_impl() (Jason).

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Connor Abbott 663e6421df nir: add split versions of (un)pack_double_2x32
v2 (Sam):
- Use uint64 instead of float64 for sources and destinations. (Connor)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Connor Abbott b093808d26 nir: don't try to scalarize unpack_double_2x32
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Connor Abbott 9e31e0a21b nir: add support for (un)pack_double_2x32
v2 (Sam):
- Use uint64 instead of float64 for sources and destinations. (Connor)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Iago Toral Quiroga d5d6260329 nir: add i2d and u2d opcodes
v2:
- Assert supports_int and don't fallback to nir_fmov (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Iago Toral Quiroga b16d06252e nir: add d2i, d2u, d2b opcodes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Connor Abbott a4bce07dc6 nir: add support for d2f and f2d
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Iago Toral Quiroga fab5d4cd95 nir/glsl_to_nir: set bit_size on ssbo_load result
v2 (Sam):
- Add missing bit_size assignment when ssbo_load destination is a boolean.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Samuel Iglesias Gonsálvez a741378cb5 nir/glsl_to_nir: add bit-size info to add_instr()
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:28:01 +02:00
Connor Abbott 4b37c64f3b nir/split_var_copies: handle doubles
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Connor Abbott 106a1b5501 nir/instr_set: handle 64-bit bit-sizes
v2: Revert spurious change in nir_opt_cse.c (Iago)

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Connor Abbott f2ccb63be1 nir: handle doubles in nir_deref_get_const_initializer_load()
v2 (Sam):
- Use proper bitsize value when calling to nir_load_const_instr_create()
  (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Connor Abbott 41c2541fc7 nir/print: add support for printing doubles and bitsize
v2:
- Squash the printing doubles related patches into one patch (Sam).

v3:
- Print using PRIx64 format: long is 32-bit on some 32-bit platforms but long
long is basically always 64-bit (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Connor Abbott f5551f8a8b nir/glsl_to_nir: support doubles
v2:
- Don't set sized types to the destination of texture related opcodes.
  (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Iago Toral Quiroga 8e69782e3e nir/lower_load_const_to_scalar: support doubles and multiple bit sizes
v2 (Sam):
- Add assert to detect bitsizes differents than 32 and 64 (Jason).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:05 +02:00
Iago Toral Quiroga 12f628adcb nir/lower_to_source_mods: Handle different bit sizes
v2 (Sam):
- Use helper to get base type from nir_alu_type.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Samuel Iglesias Gonsálvez 3663a2397e nir: add bit_size info to nir_load_const_instr_create()
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Connor Abbott a5b17ae745 nir/lower_vec: adapt to different bit sizes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Samuel Iglesias Gonsálvez e3edaec739 nir: add bit_size info to nir_ssa_undef_instr_create()
v2:
- Make the users to give the right bit_sizes as arguments (Jason).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Connor Abbott 41a39e3384 nir/locals_to_regs: adapt to different bit sizes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Connor Abbott 40d1b671a9 nir/from_ssa: adapt to different bit sizes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:27:04 +02:00
Jason Ekstrand 7d58cfa366 nir: Add a pass for gathering various bits of shader info
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-10 20:43:47 -07:00
Jason Ekstrand b8f3909b73 nir/gather_info: Handle discard_if
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
2016-04-08 15:58:36 -07:00
Jason Ekstrand e26a978773 Merge remote-tracking branch 'public/master' into vulkan 2016-04-07 16:56:34 -07:00
Kenneth Graunke 3babb7b0a4 nir: Use PRIi64 and PRIu64 instead of %ld and %lu.
%ld and %lu aren't the right format specifiers for int64_t and uint64_t
on 32-bit (x86) systems.  They're %zu on Linux and %Iu on Windows.

Use the standard C99 macros in hopes that they work everywhere.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-04 14:38:48 -07:00
Jason Ekstrand eb93d6dec8 nir/search: Don't match inexact expressions with exact subexpressions
In the first pass of implementing exact handling, I made a mistake with
search-and-replace.  In particular, we only reallly handled exact/inexact
on the root of the tree.  Instead, we need to check every node in the tree
for an exact/inexact match.  As an example of this, consider the following
GLSL code

precise float a = b + c;
if (a < 0) {
   do_stuff();
}

In that case, only the add will be declared "exact" and an expression that
looks for "b + c < 0" will still match and replace it with "b < -c" which
may yield different results.  The solution is to simply bail if any of the
values are exact when matching an inexact expression.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2016-04-04 13:48:10 -07:00
Jason Ekstrand fe247bbe92 nir: Stop double-printing function arguments 2016-04-04 12:10:20 -07:00
Jason Ekstrand cc1320220f nir/gather_info: Add an assert for supported stages 2016-04-01 15:44:43 -07:00
Jason Ekstrand ebb0bcc11d nir: Move variable_get_io_mask back into gather_info
It used to be in nir_gather_info.c until I moved it out to nir.h so it
could be re-used with some linking code that never got merged.  We'll move
it back out if and when we have real code to share it with.
2016-04-01 15:39:48 -07:00
Jason Ekstrand 95106f6bfb Merge remote-tracking branch 'public/master' into vulkan 2016-04-01 15:16:21 -07:00
Jason Ekstrand de60e250f5 nir: Add an opcode for stomping a 32-bit value to 16-bit precision
This correlates directly to the SPIR-V opcode OpQuantizeToF16

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-01 13:52:28 -07:00
Ian Romanick 08ff5f4d1f nir: Simplify a bcsel to logical-or
Oddly, this did not affect the shader where I first noticed the pattern.
That particular shader doesn't get its if-statement converted to a bcsel
because there are two assignments in the else-statement.  This led to me
submitting https://bugs.freedesktop.org/show_bug.cgi?id=94747.

shader-db results:

Sandy Bridge
total instructions in shared programs: 8467384 -> 8467069 (-0.00%)
instructions in affected programs: 36594 -> 36279 (-0.86%)
helped: 46
HURT: 0

total cycles in shared programs: 117573448 -> 117568518 (-0.00%)
cycles in affected programs: 339114 -> 334184 (-1.45%)
helped: 46
HURT: 0

Ivy Bridge / Haswell / Broadwell / Skylake:
total instructions in shared programs: 7774258 -> 7773999 (-0.00%)
instructions in affected programs: 30874 -> 30615 (-0.84%)
helped: 46
HURT: 0

total cycles in shared programs: 65739190 -> 65734530 (-0.01%)
cycles in affected programs: 180380 -> 175720 (-2.58%)
helped: 45
HURT: 1

No change on G45 or Ironlake.

I also tried these expressions, but none of them affected any shaders in
shader-db:

   (('bcsel', a, 'a@bool', 'b@bool'), ('ior', a, b)),
   (('bcsel', a, 'b@bool', False),    ('iand', a, b)),
   (('bcsel', a, 'b@bool', 'a@bool'), ('iand', a, b)),

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-03-31 14:59:36 -07:00
Matt Turner 05ee6627d6 nir: Fix typo from commit 6702f1acde. 2016-03-30 19:18:35 -07:00
Matt Turner 6702f1acde nir: Propagate negates up multiplication chains.
total instructions in shared programs: 7112159 -> 7088092 (-0.34%)
instructions in affected programs: 1374915 -> 1350848 (-1.75%)
helped: 7392
HURT: 621

GAINED: 2
LOST:   2
2016-03-30 13:12:34 -07:00
Jason Ekstrand cf2257069c nir/spirv: Set a default number of invocations for geometry shaders
The SPIR-V spec says geometry shaders are supposed to have one invocation
by default.  The execution mode is only required if there are multiple
invocations.
2016-03-29 20:30:27 -07:00
Jason Ekstrand 35e2e96b30 nir: Add a helper for getting the current block from a cursor
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand be98c47528 nir/lower_out_to_temp: Add an "entrypoint" parameter
Previously, the pass assumed that the entrypoint would be whatever function
happened to have the name "main".  We really shouldn't trust in the
function names.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand 31a5bec93f nir/lower_out_to_temp: Steal the output's constant initializer
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand 38de85f9a5 nir: Add a helper for getting the unique function in a shader
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand 49be812be6 nir/sweep: Sweep function parameters
They are no longer in the list of local variables so we need to explicitly
sweep them.

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand 1be4c61c95 nir/builder: Add a helper for creating undefs
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand 6a2479d618 nir/builder: Add a helper for storing to variable derefs
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand 77e2ac1da7 nir/builder: Add a helper for building fdot instructions
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand da422663a6 nir: Add a variable_foreach_safe helper
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand 731870fbe3 nir/Makefile: Fix alphabetization
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-03-28 18:32:48 -07:00
Jason Ekstrand 433cf90650 nir/spirv: Remove the NoContraction hack
NIR now just handles this for us by not fusing if the multiply is marked as
exact.
2016-03-28 13:07:39 -07:00
Jason Ekstrand 035f66025b nir/search: Don't match inexact expressions with exact subexpressions
In the first pass of implementing exact handling, I made a mistake with
search-and-replace.  In particular, we only reallly handled exact/inexact
on the root of the tree.  Instead, we need to check every node in the tree
for an exact/inexact match.  As an example of this, consider the following
GLSL code

precise float a = b + c;
if (a < 0) {
   do_stuff();
}

In that case, only the add will be declared "exact" and an expression that
looks for "b + c < 0" will still match and replace it with "b < -c" which
may yield different results.  The solution is to simply bail if any of the
values are exact when matching an inexact expression.
2016-03-28 13:07:39 -07:00
Jason Ekstrand fbb9e1f008 spirv/alu: Add support for the NoContraction decoration 2016-03-25 21:35:41 -07:00
Jason Ekstrand 00fa795cd3 spirv/glsl: Add a helper for converting glsl opcodes into nir opcodes
This is similar to the way that regular ALU operations are handled.
2016-03-25 21:35:41 -07:00
Jason Ekstrand 98522c1853 nir/spirv: Get rid of the spirv2nir helper binary
This was useful once upon a time but now that we have a real Vulkan driver
to run our SPIR-V binaries through, there's really no point.
2016-03-25 21:35:41 -07:00
Jason Ekstrand 13bad493b4 nir/algebraic: Get rid of a redundant copy of fdiv lowering 2016-03-25 14:04:05 -07:00
Jason Ekstrand 08fe89864b nir/algebraic: Add better lowering of ldexp 2016-03-25 14:04:05 -07:00
Jason Ekstrand b75d770963 nir/builder: Simplify nir_ssa_undef a bit 2016-03-25 14:04:05 -07:00
Jason Ekstrand ab31951bef nir/spirv: Use the nir_ssa_undef helper from nir_builder 2016-03-25 14:04:05 -07:00
Jason Ekstrand d2eee52a65 nir/builder: Add a bit size field to nir_ssa_undef 2016-03-25 14:04:05 -07:00
Jason Ekstrand b50f7f0011 nir: Add a better comment for INTRINSIC_RANGE 2016-03-25 14:04:05 -07:00
Jason Ekstrand add8c837b5 nir/glsl: Stop carying a pointer to the nir_shader in the visitor 2016-03-25 14:04:05 -07:00
Jason Ekstrand 2c3f95d6aa Merge remote-tracking branch 'public/master' into vulkan 2016-03-24 17:30:14 -07:00
Jason Ekstrand 22b343a8ec nir: Add a pass to inline functions
This commit adds a new NIR pass that lowers all function calls away by
inlining the functions.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand debf23ec68 nir/builder: Add helpers for easily inserting copy_var intrinsics
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand 79dec93ead nir: Add return lowering pass
This commit adds a NIR pass for lowering away returns in functions.  If the
return is in a loop, it is lowered to a break.  If it is not in a loop,
it's lowered away by moving/deleting code as needed.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand 8d61d72524 nir: Add a cursor helper for getting a cursor after any phi nodes
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand 18b0166749 nir/builder: Add a helper for inserting jump instructions
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand 97b663481c nir/cf: Make extracting or re-inserting nothing a no-op
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand 7022a673cd nir: Add a function for comparing cursors
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand 124f229ece nir/cf: Handle relinking top-level blocks
This can happen if a function ends in a return instruction and you remove
the return.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand 364212f1ed nir: Add a pass to repair SSA form
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand ea98d415e4 nir/vars_to_ssa: Use the new nir_phi_builder helper
The efficiency should be approximately the same.  We do a little more work
per phi node because we have to sort the predecessors.  However, we no
longer have to walk the blocks a second time to pop things off the stack.
The bigger advantage, however, is that we can now re-use the phi placement
and per-block SSA value tracking in other passes.

As a side-benifit, the phi builder actually handles unreachable blocks
correctly.  The original vars_to_ssa code, because of the way it iterated
the blocks and added phi sources, didn't add sources corresponding to
predecessors of unreachable blocks.  The new strategy employed by the phi
builder creates a phi source for each predecessor and should correctly
handle unreachable blocks by setting those sources to SSA undefs.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand 42ddfc611f nir/dominance: Handle unreachable blocks
Previously, nir_dominance.c didn't properly handle unreachable blocks.
This can happen if, for instance, you have something like this:

loop {
   if (...) {
      break;
   } else {
      break;
   }
}

In this case, the block right after the if statement will be unreachable.
This commit makes two changes to handle this.  First, it removes an assert
and allows block->imm_dom to be null if the block is unreachable.  Second,
it properly skips unreachable blocks in calc_dom_frontier_cb.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-03-24 15:20:44 -07:00
Jason Ekstrand e4dc82cfcf nir: Add a phi node placement helper
Right now, we have phi placement code in two places and there are other
places where it would be nice to be able to do this analysis.  Instead of
repeating it all over the place, this commit adds a helper for placing all
of the needed phi nodes for a value.

v2: Add better documentation

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-24 15:20:44 -07:00
Rob Clark 0bea0e7141 nir: fix dangling ssadef->name ptrs
In many places, the convention is to pass an existing ssadef name ptr
when construction/initializing a new nir_ssa_def.  But that goes badly
(as noticed by garbage in nir_print output) when the original string
gets freed.

Just use ralloc_strdup() instead, and add ralloc_free() in the two
places that would care (not that the strings wouldn't eventually get
freed anyways).

Also fixup the nir_search code which was directly setting ssadef->name
to use the parent instruction as memctx.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-24 08:30:04 -04:00
Jason Ekstrand a984e44abd nir/glsl: Propagate invariant into NIR alu ops
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:07 -07:00
Jason Ekstrand 91d6272c2b nir/alu_to_scalar: Propagate the "exact" bit
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:06 -07:00
Jason Ekstrand 5f39e3e165 nir/cse: Properly handle nir_ssa_def.exact
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:06 -07:00
Jason Ekstrand 0dbda153aa nir/algebraic: Flag inexact optimizations
Many of our optimizations, while great for cutting shaders down to size,
aren't really precision-safe.  This commit tries to flag all of the
inexact floating-point optimizations so they don't get run on values that
are flagged "exact".  It's a bit conservative and maybe flags some safe
optimizations as unsafe but that's better than missing one.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:02 -07:00
Jason Ekstrand ed3a029e80 nir/algebraic: Fix fmin detection to match the spec
The previous transformation got the arguments to fmin backwards.  When NaNs
are involved, the GLSL min/max aren't commutative so it matters.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:28:00 -07:00
Jason Ekstrand 89545b1314 nir/algebraic: Get rid of an invlid fxor optimization
The fxor opcode is required to return 1.0f or 0.0f but the input variable
may not be 1.0f or 0.0f.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:27:58 -07:00
Jason Ekstrand 3a7cb6534c nir/algebraic: Allow for flagging operations as being inexact
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:27:55 -07:00
Jason Ekstrand a6f25fa7d7 nir/search: Propagate exactness into newly created expressions
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:27:52 -07:00
Jason Ekstrand ded3133d47 nir/builder: Add a flag for setting exact
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:26:34 -07:00
Jason Ekstrand 4ff89377d9 nir: Add an "exact" bit to nir_alu_instr
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2016-03-23 16:26:34 -07:00
Jason Ekstrand f849f53990 nir/clone: Export nir_variable_clone
Reviewed-by: Rob Clark <robclark@gmail.com>
2016-03-23 15:26:11 -07:00
Jason Ekstrand 5fe8959912 nir/clone: Expose nir_constant_clone
Reviewed-by: Rob Clark <robclark@gmail.com>
2016-03-23 15:26:08 -07:00
Jason Ekstrand c4c373f156 nir: Fix whitespace
Reviewed-by: Rob Clark <robclark@gmail.com>
2016-03-23 15:25:53 -07:00
Ian Romanick d7a25a9def nir: Don't abs slt and friends
No shader-db changes, but this is symmetric with the previous commit.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:48:02 -07:00
Ian Romanick 2bb006af68 nir: Don't abs the result of b2f or b2i
In the results below, 2 SIMD16 shaders in Trine are lost.

G4X
total instructions in shared programs: 4012279 -> 4011108 (-0.03%)
instructions in affected programs: 116776 -> 115605 (-1.00%)
helped: 339
HURT: 0

total cycles in shared programs: 84315862 -> 84313584 (-0.00%)
cycles in affected programs: 1767232 -> 1764954 (-0.13%)
helped: 274
HURT: 81

Ironlake
total instructions in shared programs: 6399073 -> 6396998 (-0.03%)
instructions in affected programs: 218050 -> 215975 (-0.95%)
helped: 600
HURT: 0

total cycles in shared programs: 128892088 -> 128888810 (-0.00%)
cycles in affected programs: 2867452 -> 2864174 (-0.11%)
helped: 422
HURT: 137

Sandy Bridge
total instructions in shared programs: 8462174 -> 8460759 (-0.02%)
instructions in affected programs: 178529 -> 177114 (-0.79%)
helped: 596
HURT: 0

total cycles in shared programs: 117542276 -> 117534098 (-0.01%)
cycles in affected programs: 1239166 -> 1230988 (-0.66%)
helped: 369
HURT: 150

Ivy Bridge
total instructions in shared programs: 7775131 -> 7773410 (-0.02%)
instructions in affected programs: 162903 -> 161182 (-1.06%)
helped: 590
HURT: 0

total cycles in shared programs: 65759882 -> 65747268 (-0.02%)
cycles in affected programs: 1004354 -> 991740 (-1.26%)
helped: 467
HURT: 141

Haswell
total instructions in shared programs: 7107786 -> 7106327 (-0.02%)
instructions in affected programs: 140954 -> 139495 (-1.04%)
helped: 590
HURT: 0

total cycles in shared programs: 64668028 -> 64655322 (-0.02%)
cycles in affected programs: 967080 -> 954374 (-1.31%)
helped: 452
HURT: 149

LOST:   2
GAINED: 0

Broadwell
total instructions in shared programs: 8980029 -> 8978287 (-0.02%)
instructions in affected programs: 197232 -> 195490 (-0.88%)
helped: 715
HURT: 0

total cycles in shared programs: 70070448 -> 70055970 (-0.02%)
cycles in affected programs: 975724 -> 961246 (-1.48%)
helped: 471
HURT: 111

LOST:   2
GAINED: 0

Skylake
total instructions in shared programs: 9115178 -> 9113436 (-0.02%)
instructions in affected programs: 203012 -> 201270 (-0.86%)
helped: 715
HURT: 0

total cycles in shared programs: 68848660 -> 68834004 (-0.02%)
cycles in affected programs: 993888 -> 979232 (-1.47%)
helped: 473
HURT: 116

LOST:   2
GAINED: 0

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:48:02 -07:00
Ian Romanick 348e5a71d8 nir: Simplify 0 < fabs(a)
Sandy Bridge / Ivy Bridge / Haswell
total instructions in shared programs: 8462180 -> 8462174 (-0.00%)
instructions in affected programs: 564 -> 558 (-1.06%)
helped: 6
HURT: 0

total cycles in shared programs: 117542462 -> 117542276 (-0.00%)
cycles in affected programs: 9768 -> 9582 (-1.90%)
helped: 12
HURT: 0

Broadwell / Skylake
total instructions in shared programs: 8980833 -> 8980826 (-0.00%)
instructions in affected programs: 626 -> 619 (-1.12%)
helped: 7
HURT: 0

total cycles in shared programs: 70077900 -> 70077714 (-0.00%)
cycles in affected programs: 9378 -> 9192 (-1.98%)
helped: 12
HURT: 0

G45 and Ironlake showed no change.

v2: Modify the comments to look more like a proof.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:47:56 -07:00
Ian Romanick 564a8b8a26 nir: Simplify 0 >= b2f(a)
This also prevented some regressions with other patches in my local
tree.

Broadwell / Skylake
total instructions in shared programs: 8980835 -> 8980833 (-0.00%)
instructions in affected programs: 45 -> 43 (-4.44%)
helped: 1
HURT: 0

total cycles in shared programs: 70077904 -> 70077900 (-0.00%)
cycles in affected programs: 122 -> 118 (-3.28%)
helped: 1
HURT: 0

No changes on earlier platforms.

v2: Modify the comments to look more like a proof.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:44:57 -07:00
Ian Romanick bf0d60aa11 nir: Simplify i2b with negated or abs operand
This enables removing ssa_201 and ssa_202 in sequences like:

                 vec1 ssa_200 = flt ssa_199, ssa_194
                 vec1 ssa_201 = b2i ssa_200
                 vec1 ssa_202 = i2b -ssa_201

shader-db results:

Sandy Bridge
total instructions in shared programs: 8462257 -> 8462180 (-0.00%)
instructions in affected programs: 3846 -> 3769 (-2.00%)
helped: 35
HURT: 0

total cycles in shared programs: 117542934 -> 117542462 (-0.00%)
cycles in affected programs: 20072 -> 19600 (-2.35%)
helped: 20
HURT: 1

Ivy Bridge
total instructions in shared programs: 7775252 -> 7775137 (-0.00%)
instructions in affected programs: 3645 -> 3530 (-3.16%)
helped: 35
HURT: 0

total cycles in shared programs: 65760522 -> 65760068 (-0.00%)
cycles in affected programs: 21082 -> 20628 (-2.15%)
helped: 25
HURT: 2

Haswell
total instructions in shared programs: 7108666 -> 7108589 (-0.00%)
instructions in affected programs: 3253 -> 3176 (-2.37%)
helped: 35
HURT: 0

total cycles in shared programs: 64675726 -> 64675272 (-0.00%)
cycles in affected programs: 21034 -> 20580 (-2.16%)
helped: 26
HURT: 1

Broadwell / Skylake
total instructions in shared programs: 8980912 -> 8980835 (-0.00%)
instructions in affected programs: 3223 -> 3146 (-2.39%)
helped: 35
HURT: 0

total cycles in shared programs: 70077926 -> 70077904 (-0.00%)
cycles in affected programs: 21886 -> 21864 (-0.10%)
helped: 21
HURT: 6

G45 and Ironlake showed no change.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:43:28 -07:00
Ian Romanick a4079f1cb2 nir: Lower flrp with Boolean interpolator to bcsel
On Intel platforms that don't set lower_flrp, using bcsel instead of
flrp seems to be a small amount worse.  On those platforms, the use of
flrp, bcsel, and multiply of b2f is still an active area of research.
In review, Matt suggested this is because bcsel turns into CMP+SEL, and
because of the flag register we can't schedule instructions well.

shader-db results:

G4X / Ironlake
total instructions in shared programs: 4016538 -> 4012279 (-0.11%)
instructions in affected programs: 161556 -> 157297 (-2.64%)
helped: 1077
HURT: 1

total cycles in shared programs: 84328296 -> 84315862 (-0.01%)
cycles in affected programs: 4174570 -> 4162136 (-0.30%)
helped: 926
HURT: 53

Unsurprisingly, no changes on later platforms.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-03-22 14:42:42 -07:00
Connor Abbott 58fe7837b8 nir: propagate bitsize information in nir_search
When we replace an expresion we have to compute bitsize information for the
replacement. We do this in two passes to validate that bitsize information
is consistent and correct: first we propagate bitsize from child nodes to
parent, then we do it the other way around, starting from the original's
instruction destination bitsize.

v2 (Iago):
- Always use nir_type_bool32 instead of nir_type_bool when generating
  algebraic optimizations. Before we used nir_type_bool32 with constants
  and nir_type_bool with variables.
- Fix bool comparisons in nir_search.c to account for bitsized types.

v3 (Sam):
- Unpack the double constant value as unsigned long long (8 bytes) in
nir_algrebraic.py.

v4 (Sam):
- Use helpers to get type size and base type from nir_alu_type.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:54:45 +01:00
Connor Abbott 3124ce699b nir: add a bit_size parameter to nir_ssa_dest_init
v2: Squash multiple commits addressing the new parameter in different
    files so we don't break the build (Iago)

v3: Fix tgsi (Samuel)

v4: Fix nir_clone.c (Samuel)

v5: Fix vc4 and freedreno (Iago)

v6 (Sam)
- Fix build errors in nir_lower_indirect_derefs
- Use helper to get type size from nir_alu_type.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsalvez <siglesias@igalia.com>
Tested-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:54:45 +01:00
Iago Toral Quiroga 084b24f558 nir: rename nir_const_value fields to include bitsize information
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-03-17 11:16:33 +01:00
Connor Abbott 9076c4e289 nir: update opcode definitions for different bit sizes
Some opcodes need explicit bitsizes, and sometimes we need to use the
double version when constant folding.

v2: fix output type for u2f (Iago)

v3: do not change vecN opcodes to be float. The next commit will add
    infrastructure to enable 64-bit integer constant folding so this is isn't
    really necessary. Also, that created problems with source modifiers in
    some cases (Iago)

v4 (Jason):
  - do not change bcsel to work in terms of floats
  - leave ldexp generic

Squashed changes to handle different bit sizes when constant
folding since otherwise we would break the build.

v2:
- Use the bit-size information from the opcode information if defined (Iago)
- Use helpers to get type size and base type of nir_alu_type enum (Sam)
- Do not fallback to sized types to guess bit-size information. (Jason)

Squashed changes in i965 and gallium/nir drivers to support sized types.
These functions should only see sized types, but we can't make that change
until we make sure that nir uses the sized versions in all the relevant places.
A later commit will address this.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Connor Abbott 6700d7e423 nir: add nir_{src,dest}_bit_size() helpers
v2: use a ternary (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Jason Ekstrand e172dbe5d2 nir: Add a bit_size to nir_register and nir_ssa_def
This really hacky commit adds a bit size to registers and SSA values.  It
also adds rules in the validator to validate that they do the right things.

It's still an open question as to whether or not we want a bit_size in
nir_alu_instr or if we just want to let it inherit from the destination.
I'm inclined to just let it inherit from the destination.  A similar
question needs to be asked about intrinsics.

v2 (Connor):
  - Relax validation: comparisons have explicit destination sizes
    and implicit source sizes.

v3 (Sam):
- Use helpers to get size and base types of nir_alu_type enum.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Jason Ekstrand 78f1919429 nir: Add explicitly sized types
v2: Fix size/type mask to properly handle 8-bit types.

v3: Add helpers to get the bitsize and base type of a
nir_alu_type enum.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Jordan Justen 3fd308a357 Merge remote-tracking branch 'origin/master' into vulkan 2016-03-17 01:44:07 -07:00
Jordan Justen b1e7cdfdcf nir: Lower shared var atomics during nir_lower_io
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Jordan Justen e3cbb9d37c nir: Add support for lowering load/stores of shared variables
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Jordan Justen 683c359c54 nir: Add atomic operations on variables
This allows us to first generate atomic operations for shared
variables using these opcodes, and then later we can lower those to
the shared atomics intrinsics with nir_lower_io.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Jordan Justen 3c807607df nir: Add compute shader shared variable storage class
Previously we were receiving shared variable accesses via a lowered
intrinsic function from glsl. This change allows us to send in
variables instead. For example, when converting from SPIR-V.

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Jordan Justen 26f8262698 nir/print: Add space after shader_storage var mode
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-03-17 01:23:40 -07:00
Jason Ekstrand 7f6a0cb29c Merge remote-tracking branch 'public/master' into vulkan 2016-03-15 14:09:50 -07:00
Jason Ekstrand 98d58e7320 nir/clone: Add support for cloning a single function_impl
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand 036b209484 nir/validate: Better function validation
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand f86f3c90aa nir/print: Better function argument printing
Since we aren't going to put the function parameters or the return variable
in the list of locals, it won't get a proper declaration.  This changes
nir_print to print the type along with each parameter or return variable.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand 13969565f9 nir/print: Factor variable name lookup into a helper
Otherwise, we have a problem when we go to print functions with arguments
because their names get added to the hash table during declaration which
happens after we print the prototype.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand e4bebe8a02 nir: Create function parameters in function_impl_create
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand 066d3c115e nir: Add a helper for creating a "bare" nir_function_impl
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand 2ef4754a20 nir: Add a new "param" variable mode for parameters and return variables
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand 41ae553fda nir/glsl: Remove dead function parameter handling code
NIR has never been used on IR where we haven't already done function
inlining so this code has been dead from the beginning.  Let's just get rid
of it for now.  We can always put it back in if we decide to use NIR for
function inlining at some point in the future.

Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2016-03-12 15:48:36 -08:00
Jason Ekstrand 14b18aba89 nir: Add a pass for lower indirect variable dereferences
This new pass lowers load/store_var intrinsics that act on indirect derefs
to if-ladder of direct load/store_var intrinsics.  The if-ladders perform a
simple binary search on the indirect.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2016-03-08 10:41:54 -08:00
Matt Turner 905ff86198 nir: Recognize open-coded extract_u16.
No shader-db changes, but does recognize some extract_u16 which enables
the next patch to optimize some code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-04 11:52:34 -08:00
Matt Turner 76289fbfa8 nir: Recognize open-coded extract_u8.
Two shaders that appear in Unigine benchmarks (Heaven and Valley) unpack
three bytes from an integer and convert each into a float:

   float((val >> 16u) & 0xffu)
   float((val >>  8u) & 0xffu)
   float((val >>  0u) & 0xffu)

Instead of shifting, masking, and type converting like this:

   shr(8)          g15<1>UD        g25<8,8,1>UD    0x00000010UD
   and(8)          g16<1>UD        g15<8,8,1>UD    0x000000ffUD
   mov(8)          g17<1>F         g16<8,8,1>UD

   shr(8)          g18<1>UD        g25<8,8,1>UD    0x00000008UD
   and(8)          g19<1>UD        g18<8,8,1>UD    0x000000ffUD
   mov(8)          g20<1>F         g19<8,8,1>UD

   and(8)          g21<1>UD        g25<8,8,1>UD    0x000000ffUD
   mov(8)          g22<1>F         g21<8,8,1>UD

i965 can simply extract a byte and convert to float in a single
instruction:

   mov(8)          g17<1>F         g25.2<32,8,4>UB
   mov(8)          g20<1>F         g25.1<32,8,4>UB
   mov(8)          g22<1>F         g25.0<32,8,4>UB

This patch implements the first step: recognizing byte extraction. A
later patch will optimize out the conversion to float.

   instructions in affected programs: 28568 -> 27450 (-3.91%)
   helped: 7

   cycles in affected programs: 210076 -> 203144 (-3.30%)
   helped: 7

This patch decreases the number of instructions in the two Unigine
programs by:

 #1721: 4520 -> 4374 instructions (-3.23%)
 #1706: 3752 -> 3582 instructions (-4.53%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-04 11:52:34 -08:00
Kristian Høgsberg Kristensen b00b42d99b nir/spirv: Use the new bare sampler type 2016-02-28 11:24:05 -08:00
Jason Ekstrand c9564fd598 nir/spirv: Allow but warn for a few capabilities
Unfortunately, glslang gives us cull/clip distance and GS streams even if
the shader doesn't use it whenever a shader is declared as version 450.
This is a glslang bug, but we can easily enough ignore it for now.
2016-02-23 22:07:25 -08:00
Jason Ekstrand 040355b688 nir/spirv: Add more capabilities 2016-02-23 21:01:00 -08:00
Jason Ekstrand f49ba0f7d8 nir/spirv: Add support for multisampled textures 2016-02-21 22:02:38 -08:00
Jason Ekstrand 79c0781f44 nir/gather_info: Count textures and images 2016-02-18 11:42:36 -08:00
Jason Ekstrand 581e4468f9 nir/spirv: Add some more capabilities 2016-02-17 18:04:39 -08:00
Jason Ekstrand 979732fafc nir: Add a helper for getting the one function from a shader 2016-02-17 18:04:39 -08:00
Jason Ekstrand 8c05b44bbb nir: Add a nir_foreach_variable_safe helper 2016-02-17 18:04:39 -08:00
Kristian Høgsberg Kristensen b8da261dc7 spirv: Fix SpvOpFwidth, SpvOpFwidthFine and SpvOpFwidthCoarse
"Result is the same as computing the sum of the absolute values of
    OpDPdx and OpDPdy on P."

We were doing sum of absolute values of OpDPdx of P and OpDPdx of NULL.
2016-02-17 15:28:52 -08:00
Jason Ekstrand 88042b9f10 nir: Get rid of the C++ NIR_SRC/DEST_INIT macros
These were originally added to reduce compiler warnings but aren't really
needed.  Getting rid of them reduces the diff between the Vulkan branch and
master, so we might as well.
2016-02-12 21:35:02 -08:00
Jason Ekstrand 3c8dc1afd1 nir/spirv/glsl: Clean up the row-skipping swizzle logic a bit 2016-02-12 10:40:39 -08:00
Jason Ekstrand 4016619931 nir/spirv: Allow the clip distance capability. 2016-02-11 15:14:46 -08:00
Jason Ekstrand f710f3ca37 Merge remote-tracking branch 'mesa-public/master' into vulkan
This also reverts commit 1d65abfa58 because
now NIR handles texture offsets in a much more sane way.
2016-02-10 17:12:11 -08:00
Jason Ekstrand 8750299a42 nir: Remove the const_offset from nir_tex_instr
When NIR was originally drafted, there was no easy way to determine if
something was constant or not.  The result was that we had lots of
special-casing for constant values such as this.  Now that load_const
instructions are SSA-only, it's really easy to find constants and this
isn't really needed anymore.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Rob Clark <robclark@gmail.com>
2016-02-10 16:33:50 -08:00
Jason Ekstrand 70dff4a55e nir/lower_vec_to_movs: Better report channels handled by insert_mov
This fixes two issues.  First, we had a use-after-free in the case where
the instruction got deleted and we tried to return mov->dest.write_mask.
Second, in the case where we are doing a self-mov of a register, we delete
those channels that are moved to themselves from the write-mask.  This
means that those channels aren't reported as being handled even though they
are.  We now stash off the write-mask before remove unneeded channels so
that they still get reported as handled.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94073
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "11.0 11.1" <mesa-stable@lists.freedesktop.org>
2016-02-10 16:33:14 -08:00
Jason Ekstrand 9be5a4bc29 nir/spirv: Fix handling of OpGroupMemberDecorate
We were pulling the member index from the wrong dword
2016-02-10 15:36:42 -08:00
Jason Ekstrand ac04c6de2c nir/spirv: Assert that struct member ids are in-bounds 2016-02-10 15:36:41 -08:00
Mark Janes 8179834030 nir/spirv: fix build_mat_subdet stack smasher
The sub-determinate implementation pattern fixed by
6a7e2904e0 has a second instance in the
same file.

With the previous algorithm, when row and j are both 3, the index
overruns the array.  This only impacts the stack on 32 bit builds.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-02-10 14:43:03 -08:00
Jason Ekstrand 09b3e30dc6 anv: Fix up spirv for new texture/sampler split stuff 2016-02-09 16:48:36 -08:00
Jason Ekstrand b14f4c1fd3 Merge remote-tracking branch 'mesa-public/master' into vulkan
This pulls in the separate texture/sampler stuff from upstream
2016-02-09 16:47:37 -08:00
Jason Ekstrand e01dd59b73 vtn: Use const_index helpers 2016-02-09 16:32:38 -08:00
Jason Ekstrand 768bd7f272 Merge commit '8b0fb1c152fe191768953aa8c77b89034a377f83' into vulkan
This pulls in Rob Clark's const_index changes for NIR
2016-02-09 15:30:39 -08:00
Jason Ekstrand 5ec456375e nir: Separate texture from sampler in nir_tex_instr
This commit adds the capability to NIR to support separate textures and
samplers.  As it currently stands, glsl_to_nir only sets the texture deref
and leaves the sampler deref alone as it did before and nir_lower_samplers
assumes this.  Backends can still assume that they are combined and only
look at only at the texture index.  Or, if they wish, they can assume that
they are separate because nir_lower_samplers, tgsi_to_nir, and prog_to_nir
all set both texture and sampler index whenever a sampler is required (the
two indices are the same in this case).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-09 15:00:17 -08:00
Jason Ekstrand ee85014b90 nir/tex_instr: Rename sampler to texture
We're about to separate the two concepts.  When we do, the sampler will
become optional.  Doing a rename first makes the separation a bit more
safe because drivers that depend on GLSL or TGSI behaviour will be fine to
just use the texture index all the time.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-09 15:00:17 -08:00
Jason Ekstrand 3f42184994 nir: Add some braces around loops and ifs 2016-02-09 15:00:17 -08:00
Rob Clark ced8d3e773 nir: use const_index helpers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-02-09 17:30:33 -05:00
Rob Clark b6cf98bc82 gtn: use const_index helpers
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-02-09 17:30:33 -05:00
Rob Clark 1df3ecc1b8 nir: const_index helpers
Direct access to intr->const_index[n], where different slots have
different meanings, is somewhat confusing.

Instead, let's put some extra info in nir_intrinsic_infos[] about which
slots map to what, and add some get/set helpers.  The helpers validate
that the field being accessed (base/writemask/etc) is applicable for the
intrinsic opc, for some extra safety.  And nir_print can use this to
dump out decoded const_index fields.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-02-09 17:30:33 -05:00
Jason Ekstrand 1d65abfa58 nir/spirv: Better handle constant offsets in texture lookups 2016-02-09 10:29:05 -08:00
Jason Ekstrand 209820739b nir/spirv: Set the vtn_mode and interface type for sampler parameters 2016-02-09 10:29:05 -08:00
Jason Ekstrand de6c9c5f2e nir/inline_functions: Don't shadown variables when it isn't needed
Previously, in order to get things working, we just always shadowed
variables.  Now, we rewrite derefs whenever it's safe to do so and only
shadow if we have an in or out variable that we write or read to
respectively.
2016-02-09 10:29:05 -08:00
Jason Ekstrand b6c00bfb03 nir: Rework function parameters 2016-02-09 10:29:05 -08:00
Timothy Arceri 1aae5e8ced nir: remove unused nir_variable fields
These are used in GLSL IR to removed unused varyings and match
transform feedback variables. There is no need to use these in NIR.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-02-09 22:49:06 +11:00
Matt Turner 371c4b3c48 nir: Recognize open-coded bitfield_reverse.
Helps 11 shaders in UnrealEngine4 demos.

I seriously hope they would have given us bitfieldReverse() if we
exposed GL 4.0 (but we do expose ARB_gpu_shader5, so why not use that
anyway?).

instructions in affected programs: 4875 -> 4633 (-4.96%)
cycles in affected programs: 270516 -> 244516 (-9.61%)

I suspect there's a *lot* of room to improve nir_search/opt_algebraic's
handling of this. We'd actually like to match, e.g., step2 by matching
step1 once and then doing a pointer comparison for the second instance
of step1, but unfortunately we generate an enormous tuple for instead.

The .text size increases by 6.5% and the .data by 17.5%.

   text     data  bss    dec    hex  filename
  22957    45224    0  68181  10a55  nir_libnir_la-nir_opt_algebraic.o
  24461    53160    0  77621  12f35  nir_libnir_la-nir_opt_algebraic.o

I'd be happy to remove this if Unreal4 uses bitfieldReverse() if it is
in a GL 4.0 context once we expose GL 4.0.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-02-08 21:20:58 -08:00
Matt Turner 2d0d9755da nir: Handle large unsigned values in opt_algebraic.
The next patch adds an algebraic rule that uses the constant 0xff00ff00.

Without this change, the build fails with

   return hex(struct.unpack('I', struct.pack('i', self.value))[0])
   struct.error: 'i' format requires -2147483648 <= number <= 2147483647

The hex() function handles integers of any size, and assigning a
negative value to an unsigned does what we want in C. The pack/unpack is
unnecessary (and as we see, buggy).

Reviewed-by: Dylan Baker <baker.dylan.c@gmail.com>
2016-02-08 20:38:17 -08:00
Matt Turner 7be8d07732 nir: Do opt_algebraic in reverse order.
Walking the SSA definitions in order means that we consider the smallest
algebraic optimizations before larger optimizations. So if a smaller
rule is part of a larger rule, the smaller one will happen first,
preventing the larger one from happening.

instructions in affected programs: 32721 -> 32611 (-0.34%)
helped: 106

In programs whose nir_optimize loop count changes (129 of them):

   before:  1164 optimization loops
   after:   1071 optimization loops

Of the 129 affected, 16 programs' optimization loop counts increased.

Prevents regressions and annoyances in the next commits.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-02-08 20:38:17 -08:00
Matt Turner a8f0960816 nir: Recognize product of open-coded pow()s.
Prevents regressions in the next commit.

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-02-08 20:38:17 -08:00
Matt Turner 9f02e3ab03 nir: Add opt_algebraic rules for xor with zero.
instructions in affected programs: 668 -> 664 (-0.60%)
helped: 4

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2016-02-08 20:38:17 -08:00
Francisco Jerez cec6fe2ad8 vtn: Clean up acos implementation.
Parameterize build_asin() on the fit coefficients so the
implementation can be shared while still using different polynomials
for asin and acos.  Also switch back to implementing acos in terms of
asin -- The improvement obtained from cancelling out the pi/2 terms
was negligible compared to the approximation error.
2016-02-08 15:23:43 -08:00
Francisco Jerez f50a651726 nir/spirv: Create integer types of correct signedness.
vtn_handle_type() creates a signed type regardless of the value of the
signedness flag, which usually doesn't make much of a difference
except when the type is used as base sampled type of an image type,
what will cause the base type of the NIR image variable to be
inconsistent with its format and cause an assertion failure in the
back-end (most likely only reproducible on Gen7), and may change the
semantics of the image intrinsic subtly (e.g. UMIN may become IMIN).
2016-02-08 15:23:35 -08:00
Jason Ekstrand 9401516113 Merge remote-tracking branch 'mesa-public/master' into vulkan 2016-02-05 15:21:11 -08:00
Jason Ekstrand 741744f691 Merge commit mesa-public/master into vulkan
This pulls in the patches that move all of the compiler stuff around
2016-02-05 15:03:44 -08:00
Matt Turner 955d052058 nir: Add lowering support for unpacking opcodes.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-01 10:43:57 -08:00
Matt Turner 9b8786eba9 nir: Add lowering support for packing opcodes.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-01 10:43:57 -08:00
Matt Turner 68f8c5730b nir: Add opcodes to extract bytes or words.
The uint versions zero extend while the int versions sign extend.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-01 10:43:57 -08:00
Matt Turner 8709dc0713 glsl: Remove 2x16 half-precision pack/unpack opcodes.
i965/fs was the only consumer, and we're now doing the lowering in NIR.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-01 10:43:57 -08:00
Matt Turner 9ce901058f nir: Add lowering of nir_op_unpack_half_2x16.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-01 10:43:57 -08:00
Matt Turner 140a886c41 nir: Make argument order of unop_convert match binop_convert.
Strangely the return and parameter types were reversed.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-01 10:43:57 -08:00
Emil Velikov eb63640c1d glsl: move to compiler/
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-26 16:08:33 +00:00
Emil Velikov a39a8fbbaa nir: move to compiler/
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-26 16:08:30 +00:00