608 Commits

Author SHA1 Message Date
Rob Clark ae7aa8dbaf nir: fix (hopefully) windows build
Fixes: 53aa109b ("nir: add pass to lower atomic counters to SSBO")
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-08 13:41:16 -04:00
Jose Fonseca dab6a2dfd9 nir: Fix missing snprintf symbol on Windows.
Copy nir_print.c's snprintf definition for now, to unbreak Windows
builds.

We can and should clean up all snprintf definitions in a follow-up
change, but I'd rather not leave the Windows build broken any further.

Trivial.
2017-05-07 19:23:07 +01:00
Rob Clark 53aa109ba2 nir: add pass to lower atomic counters to SSBO
This is equivalent to what mesa/st does in glsl_to_tgsi.  For most hw
there isn't a particularly good reason to treat these differently.

Signed-off-by: Rob Clark <robdclark@gmail.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
2017-05-04 13:48:06 -04:00
Johnson Lin a6fb943f3e nir/lower_tex: Fix minor error in YUV color conversion matrix
The matrix used for YCbCr to RGB is listed in:

    https://en.wikipedia.org/wiki/YCbCr

There was an error in converting the offsets from integers to unorm
values: 0.0625=16/256 should be 16.0/255, and 0.5=128.0/256 should be
128.0/255.  With this fix, the CSC result is bit-aligned with Wikipedia's
conversion result and FFmpeg's result.
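The size of the offset error can be checked numerically. The following is a minimal sketch, assuming 8-bit unorm encoding where an integer code n maps to n/255; the helper name `unorm8` is made up for illustration and is not part of Mesa.

```python
def unorm8(code):
    """Map an 8-bit integer code (0..255) to its unorm float value."""
    return code / 255.0

# Offsets used by the YCbCr -> RGB conversion: 16 for luma, 128 for chroma.
old_y_offset, old_c_offset = 0.0625, 0.5            # 16/256 and 128/256
new_y_offset, new_c_offset = unorm8(16), unorm8(128)

# Error of the old constants, expressed in 8-bit code units.
y_error = (new_y_offset - old_y_offset) * 255.0
c_error = (new_c_offset - old_c_offset) * 255.0
print(new_y_offset, new_c_offset, y_error, c_error)
```

Under this assumption the old chroma offset was off by half a code value, which is enough to change the rounded 8-bit result.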

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100854
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-05-03 23:44:59 -07:00
Jason Ekstrand bb41d9a1d3 compiler: Add a system value and varying for ViewIndex
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-05-03 11:25:46 -07:00
Eric Anholt fba6559a1e nir: Pick just the channels we want for bitmap and drawpixels lowering.
NIR now validates that SSA references use the same number of channels as
are in the SSA value.

v2: Reword commit message, since the commit didn't land before the
    validation change did.

Fixes: 370d68babc ("nir/validate: Validate that bit sizes and components always match")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> (v1)
Cc: <mesa-stable@lists.freedesktop.org>
2017-05-02 10:24:40 -07:00
Timothy Arceri 7a7ee40c2d nir/i965: add before ffma algebraic opts
This shuffles constants down in the reverse of what the previous
patch does and applies some simplifications that may be made
possible from doing so.

Shader-db results BDW:

total instructions in shared programs: 12980814 -> 12977822 (-0.02%)
instructions in affected programs: 281889 -> 278897 (-1.06%)
helped: 1231
HURT: 128

total cycles in shared programs: 246562852 -> 246567288 (0.00%)
cycles in affected programs: 11271524 -> 11275960 (0.04%)
helped: 1630
HURT: 1378
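
The percentages in these shader-db summaries are plain relative changes. As a small sketch (the function name `percent_change` is illustrative, not a shader-db API), the instruction figures above can be reproduced like this:

```python
def percent_change(before, after):
    """Relative change from before to after, in percent."""
    return (after - before) / float(before) * 100.0

# Instruction counts quoted in the shader-db results above.
total = percent_change(12980814, 12977822)
affected = percent_change(281889, 278897)
print("%.2f%% total, %.2f%% affected" % (total, affected))
```

The same 2992 saved instructions look tiny against all shared programs but are about one percent of the affected programs.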

V2: mark float opts as inexact

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Timothy Arceri fb2269fed1 nir: shuffle constants to the top
V2: mark float opts as inexact

If one of the inputs to a mul/add is the result of another
mul/add there is a chance that we can reuse the result of that
mul/add in other calls if we do the multiplication in the right
order.

Also by attempting to move all constants to the top we increase
the chance of constant folding.

For example it is a fairly common pattern for shaders to do something
similar to this:

  const float a = 0.5;
  in vec4 b;
  in float c;

  ...

  b.x = b.x * c;
  b.y = b.y * c;

  ...

  b.x = b.x * a + a;
  b.y = b.y * a + a;

So by simply detecting that the constant a is part of the multiplication
in the ffma and swapping it with the previous fmul that updates b, we end up
with:

  ...

  c = a * c;

  ...

  b.x = b.x * c + a;
  b.y = b.y * c + a;
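
The rewrite above can be checked with a small numeric sketch. The values below are arbitrary and chosen to be exactly representable, so both orders agree bit for bit; in general, reassociating float multiplies can change rounding, which is presumably why the opts are marked inexact.

```python
import math

a = 0.5            # compile-time constant
b = [1.5, 2.0]     # stands in for b.x and b.y
c = 3.0            # input shared by both channels

# Original order: one fmul per channel, then one ffma per channel.
before = [(bx * c) * a + a for bx in b]

# Shuffled order: fold the constant into c once, then one ffma per channel.
ac = a * c
after = [bx * ac + a for bx in b]

print(before, after)
```

With two channels the original form needs two fmuls and two ffmas, while the shuffled form needs one fmul and two ffmas, and `a * c` becomes a candidate for constant folding when c is also constant.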

Shader-db results BDW:

total instructions in shared programs: 13011050 -> 12967888 (-0.33%)
instructions in affected programs: 4118366 -> 4075204 (-1.05%)
helped: 17739
HURT: 1343

total cycles in shared programs: 246717952 -> 246410716 (-0.12%)
cycles in affected programs: 166870802 -> 166563566 (-0.18%)
helped: 18493
HURT: 7965

total spills in shared programs: 14937 -> 14560 (-2.52%)
spills in affected programs: 9331 -> 8954 (-4.04%)
helped: 284
HURT: 33

total fills in shared programs: 20211 -> 19671 (-2.67%)
fills in affected programs: 12586 -> 12046 (-4.29%)
helped: 286
HURT: 33

LOST:   39
GAINED: 33

Some of the hurt will go away when we shuffle things back down to the
bottom in the following patch. It's also noteworthy that almost all of the
spill changes, both hurt and helped, are in Deus Ex.

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Timothy Arceri 83f7fdf83a nir: add flt comparison simplification
Didn't turn out as useful as I'd hoped, but it will help a lot more on
i965 by reducing regressions when we drop brw_do_channel_expressions()
and brw_do_vector_splitting().

I'm not sure how much sense 'is_not_used_by_conditional' makes on
platforms other than i965 but since this is a new opt it at least
won't do any harm.

shader-db BDW:

total instructions in shared programs: 13029581 -> 13029415 (-0.00%)
instructions in affected programs: 15268 -> 15102 (-1.09%)
helped: 86
HURT: 0

total cycles in shared programs: 247038346 -> 247036198 (-0.00%)
cycles in affected programs: 692634 -> 690486 (-0.31%)
helped: 183
HURT: 27

Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-04-24 12:08:14 +10:00
Jason Ekstrand 4cf079f7f2 nir: Add GLSL_TYPE_[U]INT64 to some switch statements
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-04-16 20:14:42 -07:00
Boyan Ding ff29f488d4 nir: Destination component count of shader_clock intrinsic is 2
This fixes the following error when using ARB_shader_clock on i965:
	vec1 32 ssa_0 = intrinsic shader_clock () () ()
	intrinsic store_var (ssa_0) (clock_retval) (3) /* wrmask=xy */
error: src->ssa->num_components == num_components (nir/nir_validate.c:204)

Signed-off-by: Boyan Ding <boyan.j.ding@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2017-04-14 14:54:06 -07:00
Rob Clark 9fc3e7137a nir/print: add compute shader info
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
2017-04-14 12:46:12 -04:00
Jason Ekstrand fbcf92a278 nir: Add support for 8 and 16-bit types
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-03-30 11:34:45 -07:00
Jason Ekstrand 28e41506a6 nir/constant_expressions: Don't switch on bit size when not needed
For opcodes such as the nir_op_pack_64_2x32 for which all sources and
destinations have explicit sizes, the bit_size parameter to the evaluate
function is pointless and *should* do nothing.  Previously, we were
always switching on the bit_size and asserting if it isn't one of the
sizes in the list.  This generates way more code than needed and is a
bit cruel because it doesn't let us have a bit_size of zero on an ALU op
which shouldn't need a bit_size.

Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-03-30 11:34:45 -07:00
Jason Ekstrand b69b44d222 nir/constant_expressions: Pull the guts out into a helper block
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-03-30 11:34:45 -07:00
Iago Toral Quiroga 023ea3772d nir/lower_wpos_center: support adding sample position to fragment coordinate
According to section 14.6 of the Vulkan specification:

   "When sample shading is enabled, the x and y components of FragCoord
    reflect the location of the sample corresponding to the shader
    invocation."

So add a boolean parameter to the lowering pass to select this behavior
when we need it.
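
A rough sketch of the two behaviors, written as plain Python rather than NIR. The function and parameter names are hypothetical, and the assumption that the non-sample-shading path offsets by the pixel center (0.5, 0.5) comes from the pass name, not from this commit message.

```python
def frag_coord_xy(pixel_x, pixel_y, for_sample_shading=False,
                  sample_pos=(0.5, 0.5)):
    """Return the x/y of the fragment coordinate for one invocation.

    pixel_x, pixel_y: integer pixel origin.
    sample_pos: sample location inside the pixel, in [0, 1) per axis.
    """
    if for_sample_shading:
        offset = sample_pos      # location of the sample being shaded
    else:
        offset = (0.5, 0.5)      # pixel center (assumed default)
    return (pixel_x + offset[0], pixel_y + offset[1])

print(frag_coord_xy(10, 20))
print(frag_coord_xy(10, 20, for_sample_shading=True, sample_pos=(0.25, 0.75)))
```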

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-24 08:11:53 +01:00
Matt Turner ef71af7356 nir: Return progress from nir_convert_from_ssa().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner abc8a702d0 nir: Return progress from nir_lower_io().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner a934b00222 nir: Return progress from nir_lower_regs_to_ssa().
And from nir_lower_regs_to_ssa_impl() as well.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner b0e72defc2 nir: Return progress from nir_lower_samplers().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner 01548f9f01 nir: Return progress from nir_lower_atomics().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner 0bd615d961 nir: Return progress from nir_lower_clamp_color_outputs().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner 9dbf91f5c0 nir: Return progress from nir_lower_clip_fs().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner 4e4927cd95 nir: Return progress from nir_lower_clip_vs().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:44 -07:00
Matt Turner 6077cc75aa nir: Return progress from nir_move_vec_src_uses_to_dest().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner a539e05d00 nir: Return progress from nir_lower_to_source_mods().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner 5a7e4ae23d nir: Return progress from nir_lower_clip_cull_distance_arrays().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner 19345fc160 nir: Return progress from nir_lower_var_copies().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner b831b8d2e1 nir: Return progress from nir_lower_load_const_to_scalar().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner adb157ddfd nir: Return progress from nir_lower_64bit_pack().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner 0012a6144a nir: Return progress from nir_lower_doubles().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner c597f87739 nir: Return progress from nir_lower_vars_to_ssa().
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner 7d41bf8d7b nir: Fix syntax.
et is not an abbreviation.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner 70c0455974 nir: Fix misspellings.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Matt Turner d6e2bdfed3 nir: Stop using apostrophes to pluralize.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-03-23 14:34:43 -07:00
Emil Velikov e3de145fa2 nir: consistently use ifndef guards over pragma once
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-03-22 16:55:22 +00:00
Vinson Lee 1fa432741c nir: Add positional argument specifiers.
Fix build with Python < 2.7.

  File "src/compiler/nir/nir_builder_opcodes_h.py", line 46, in <module>
    from nir_opcodes import opcodes
  File "src/compiler/nir/nir_opcodes.py", line 178, in <module>
    unop_convert("{}2{}{}".format(src_t[0], dst_t[0], bit_size),
ValueError: zero length field name in format
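
The error comes from auto-numbered `{}` fields, which `str.format` only accepts from Python 2.7 (and 3.1) onward. A minimal sketch of the difference follows; the values are illustrative, and the exact replacement string used in the fix is an assumption.

```python
src_t, dst_t, bit_size = "float", "int", 32   # illustrative values only

# Auto-numbered fields: raise ValueError on Python 2.6.
auto_numbered = "{}2{}{}".format(src_t[0], dst_t[0], bit_size)

# Explicit positional specifiers: accepted by older interpreters too.
positional = "{0}2{1}{2}".format(src_t[0], dst_t[0], bit_size)

print(auto_numbered, positional)
```

Both calls build the same string on a modern interpreter; only the second form is portable to Python 2.6.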

Fixes: 762a6333f2 ("nir: Rework conversion opcodes")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2017-03-21 13:38:00 -07:00
Jason Ekstrand 9d559ba39d nir/constant_expressions: Refactor helper functions
Apart from avoiding some unneeded size cases, this shouldn't have any
actual functional impact.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 762a6333f2 nir: Rework conversion opcodes
The NIR story on conversion opcodes is a mess.  We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random.  This commit re-organizes things and makes them all
consistent:

 - All non-bool conversion opcodes now have the explicit size in the
   destination and are named <src_type>2<dst_type><size>.

 - Integer <-> integer conversion opcodes now only come in i2i and u2u
   forms (i2u and u2i have been removed) since the only difference
   between the different integer conversions is whether or not they
   sign-extend when up-converting.

 - Boolean conversion opcodes all have the explicit size on the bool and
   are named <src_type>2<dst_type>.

Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako.  This will make adding
int8, int16, and float16 versions much easier when the time comes.
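
The naming rule for non-bool conversions can be sketched as below. This is not the generator code from nir_opcodes.py; the function name is hypothetical, and the choice to let the source signedness pick between i2i and u2u is an assumption based on the sign-extension remark above.

```python
def conversion_op_name(src_type, dst_type, dst_bit_size):
    """Build a name of the form <src_type>2<dst_type><size> (non-bool only)."""
    src, dst = src_type[0], dst_type[0]
    # Integer <-> integer conversions exist only as i2i and u2u; a signed
    # source sign-extends when up-converting, an unsigned one zero-extends.
    if src in ("i", "u") and dst in ("i", "u"):
        dst = src
    return "{0}2{1}{2}".format(src, dst, dst_bit_size)

for args in [("float", "int", 32), ("int", "uint", 64), ("uint", "int", 32)]:
    print(args, "->", conversion_op_name(*args))
```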

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 6eb051e36f nir: Rewrite nir_type_conversion_op
The original version was very convoluted and tried way too hard to not
just have the nested switch statement that it needs.  Let's just write
the obvious code and then we know it's correct.  This fixes a bunch of
missing cases particularly with int64.

Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 9084b1db30 nir: Add a get_nir_type_for_glsl_base_type helper
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand a136884139 nir/validate: Rework ALU bit-size rule validation
The original bit-size validation wasn't capable of properly dealing with
instructions with variable bit sizes.  An attempt was made to handle it
by looking at source and destinations but, because the validation was
done in validate_alu_(src|dest), it didn't really have the needed
information.  The new validation code is much more straightforward and
should be more correct.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 370d68babc nir/validate: Validate that bit sizes and components always match
We've always required bit sizes to match but the rules for number of
components have been a bit loose.  You've never been allowed to source
from something with fewer components than you consume, but more has
always been fine.  This changes the validator to require that they match
exactly.  The fact that they don't always match has been a source of
confusion in NIR for quite some time and it's time we got rid of it.
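
The change in the rule can be summarized with a toy check. This is only a sketch of the two conditions described above, with hypothetical function names, not the code in nir_validate.c.

```python
def old_rule_ok(consumed_components, ssa_components):
    """Loose rule: the SSA value may have more components than are read."""
    return ssa_components >= consumed_components

def new_rule_ok(consumed_components, ssa_components):
    """Strict rule: the source must match the SSA value exactly."""
    return ssa_components == consumed_components

# A two-component read from a vec4 passed before and is rejected now.
print(old_rule_ok(2, 4), new_rule_ok(2, 4))
```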

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand e9a45a3d5d nir: Make image_size a variable-width intrinsic
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:40 -07:00
Jason Ekstrand 0bf0365393 nir/lower_tex: Use tex_instr_dest_size for txs destinations
Using coord_components of the source texture is correct for everything
except cube maps where it's off by one.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand 3c312be7b3 nir/copy_prop: Respect the source's number of components
In the near future we are going to require that the num_components in a
src dereference match the num_components of the SSA value being
dereferenced.  To do that, we need copy_prop to not remove our MOVs from
a larger SSA value into an instruction that uses fewer channels.

Because we suddenly have to know how many components each source has,
this makes the pass a bit more complicated.  Fortunately, copy
propagation is the only pass that cares about how many components
are read by any given source, so it's fairly contained.

Shader-db results on Sky Lake:

   total instructions in shared programs: 13318947 -> 13320265 (0.01%)
   instructions in affected programs: 260633 -> 261951 (0.51%)
   helped: 324
   HURT: 1027

Looking through the hurt programs, about a dozen are hurt by 3
instructions and the rest are all hurt by 2 instructions.  From a
spot-check of the shaders, the story is always the same:  They get a
vec4 from somewhere (frequently an input) and use the first two or three
components as a texture coordinate.  Because of the vector component
mismatch, we have a mov or, more likely, a vecN sitting between the
texture instruction and the input.  This means that the back-end inserts
a bunch of MOVs and split_virtual_grfs() goes to town.  Because the
texture coordinate is also used by some other calculation, register
coalesce can't combine them back together and we end up with an extra 2
MOV instructions in our shader.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-03-14 07:36:20 -07:00
Jason Ekstrand 60d1aac28a nir/intrinsics: Make load_barycentric_input take a 2-component coor
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>
2017-03-14 07:36:20 -07:00
Emil Velikov e4c7911150 nir: remove shebang from python scripts
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Jason Ekstrand bc456749bd nir/int64: Properly handle imod/irem
The previous implementation was fine for GLSL which doesn't really have
a signed modulus/remainder.  They just leave the behavior undefined
whenever either source is negative.  However, in SPIR-V, there is a
defined behavior for negative arguments.  This commit beefs up the pass
so that it handles both correctly.  Tested using a hacked up version of
the Vulkan CTS test to get 64-bit support.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-03 13:59:27 -08:00
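
For reference, the two signed behaviors differ only when the operands have opposite signs. The sketch below assumes the SPIR-V-style definitions (remainder takes the sign of the dividend, modulus takes the sign of the divisor); it illustrates the arithmetic only and is not the lowering code from the pass.

```python
def irem(a, b):
    """Remainder whose sign follows the dividend (truncated division)."""
    q = abs(a) // abs(b)
    if (a < 0) != (b < 0):
        q = -q                # truncate toward zero
    return a - b * q

def imod(a, b):
    """Modulus whose sign follows the divisor (floored division)."""
    return a - b * (a // b)   # Python's // already floors

for a, b in [(7, 2), (-7, 2), (7, -2), (-7, -2)]:
    print(a, b, irem(a, b), imod(a, b))
```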
Jason Ekstrand 9745bef308 nir/builder: Add an int64 immediate helper
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-03-03 13:59:24 -08:00