Commit Graph

148 Commits

Author SHA1 Message Date
Karol Herbst e5899c1e88 nir: rename nir_op_fne to nir_op_fneu
It was always fneu but naming it fne causes confusion from time to time. So
lets rename it. Later we also want to add other unordered and fne, this is
a smaller preparation for that.

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6377>
2020-08-21 17:26:21 +00:00
Rhys Perry 27ec38d746 nir: fix potential left shift of a negative value
Fixes UBSan error:
src/compiler/nir/nir_constant_expressions.c:36573:32: runtime error: left shift of negative value -1

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6206>
2020-08-20 10:52:19 +00:00
Jesse Natalie af59e4c400 nir: Add fisfinite op
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6355>
2020-08-17 15:34:08 -07:00
Jesse Natalie 9ebbed6ddc nir: Add fisnormal op
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6355>
2020-08-17 15:34:00 -07:00
Jesse Natalie 456edf0b30 nir: Support 8 and 16 component vectors for reduceable intrinsics
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6030>
2020-07-23 18:23:20 -07:00
Rhys Perry 9a389322c4 nir: slight correction to cube_face_coord constant folding
ACO does the division with a rcp and then a multiplication.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5547>
2020-06-22 10:28:40 +00:00
Marek Olšák f798513f91 nir: add i2imp and u2ump opcodes for conversions to mediump
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5002>
2020-06-02 20:01:18 +00:00
Alyssa Rosenzweig fcbc022787 nir: Add un/pack_32_4x8 opcodes
Complement the existing un/pack_32_2x16 opcodes. These are useful for
8-bit format packing. On Midgard, they are equivalent to just a 32-bit
move, but other GPUs could lower to other packs if needed.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5107>
2020-05-25 20:03:52 +00:00
Alyssa Rosenzweig c2b0f3c17d nir: Add fclamp_pos opcode
Corresponds to the .pos modifier on all Mali GPUs (lima and panfrost).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5102>
2020-05-19 20:21:27 +00:00
Alyssa Rosenzweig 0aedce417a nir: Add fsat_signed opcode
Exists on later Mali. Equivalent to clamp(x, -1.0, 1.0)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5102>
2020-05-19 20:21:27 +00:00
Rhys Perry abc4a82857 nir: make fsat return 0.0 with NaN instead of passing it through
This is how lower_fsat and ACO implements fsat and is a more useful
definition since it can be exactly created from fmin(fmax(a, 0.0), 1.0).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3716>
2020-05-07 10:39:19 +00:00
Gert Wollny 49ce749d0e nir: Add umad24 and umul24 opcodes
So far only the singed versions are defined.

v2: Make umad24 and umul24 non-driver specific (Eric Anholt)

v3: Take care of nir_builder and automatic lowering of the
    opcodes if they are not supported by the backend.

Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4610>
2020-04-23 18:23:04 +00:00
Jason Ekstrand 7c43b8ce1b nir: Delete the fnoise opcodes
As of the previous commit, they are never used.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4624>
2020-04-21 06:16:13 +00:00
Rob Clark bf64648864 nir: fix definition of imadsh_mix16 for vectors
Fixes: c27b3758fa ("nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4423>
2020-04-04 00:07:10 +00:00
Jason Ekstrand b2db84153a nir: Add b2b opcodes
These exist to convert between different types of boolean values.  In
particular, we want to use these for uniform and shared memory
operations where we need to convert to a reasonably sized boolean but we
don't care what its format is so we don't want to make the back-end
insert an actual i2b/b2i.  In the case of uniforms, Mesa can tweak the
format of the uniform boolean to whatever the driver wants.  In the case
of shared, every value in a shared variable comes from the shader so
it's already in the right boolean format.

The new boolean conversion opcodes get replaced with mov in
lower_bool_to_int/float32 so the back-end will hopefully never see them.
However, while we're in the middle of optimizing our NIR, they let us
have sensible load_uniform/ubo intrinsics and also have the bit size
conversion.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
2020-03-30 15:46:19 +00:00
Albert Astals Cid d988061172 cube_face_index: Use fabsf instead of fabs since we know it's floats
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3933>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3933>
2020-02-26 21:47:01 +00:00
Albert Astals Cid 6db7467b59 cube_face_coord: Use fabsf instead of fabs since we know it's floats
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3933>
2020-02-26 21:47:01 +00:00
Neil Roberts 125f867d3d nir/opcodes: Add nir_op_f2fmp
This opcode is the same as the f2f16 opcode except that it comes with
a promise that it is safe to optimise it out if the result is
immediately converted back to a 32-bit float again. Normally this
would be a lossy conversion and so it would be visible to the
application, but if the conversion is generated as part of the mediump
lowering process then this removal doesn’t matter. The opcode is
eventually replaced with a regular f2f16 in the late optimisations so
the backends don’t need to handle it.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3822>
2020-02-24 17:24:13 +00:00
Ian Romanick 1d97d186fb nir: Mark fmin and fmax as commutative and associative
Per the resolution of Khronos GLSL issue 80
(https://github.com/KhronosGroup/GLSL/issues/80).  Spec updates have not
landed yet, but I'll get to it soon. :)

The extra hurt shaders on Gen8+ are a handful of shaders that see things like

    bcsel(fmin(b - a, a - c) >= 0, x, y)

converted to

   bcsel(a >= b && c >= a, x, y)

The former can be generated as a CSEL instruction.  If either b - a or a
- c is used elsewhere in the shader, this saves an instruction.

All Haswell+ platforms had similar results. (Ice Lake shown)
total instructions in shared programs: 14550188 -> 14550048 (<.01%)
instructions in affected programs: 12168 -> 12028 (-1.15%)
helped: 30
HURT: 3
helped stats (abs) min: 1 max: 17 x̄: 4.77 x̃: 2
helped stats (rel) min: 0.05% max: 3.85% x̄: 1.77% x̃: 1.80%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.50% max: 0.50% x̄: 0.50% x̃: 0.50%
95% mean confidence interval for instructions value: -6.15 -2.33
95% mean confidence interval for instructions %-change: -2.00% -1.12%
Instructions are helped.

total cycles in shared programs: 203770286 -> 203771464 (<.01%)
cycles in affected programs: 688466 -> 689644 (0.17%)
helped: 172
HURT: 220
helped stats (abs) min: 1 max: 286 x̄: 12.15 x̃: 6
helped stats (rel) min: 0.03% max: 5.97% x̄: 0.70% x̃: 0.35%
HURT stats (abs)   min: 1 max: 578 x̄: 14.85 x̃: 6
HURT stats (rel)   min: 0.03% max: 32.36% x̄: 1.21% x̃: 0.52%
95% mean confidence interval for cycles value: -0.74 6.75
95% mean confidence interval for cycles %-change: 0.15% 0.59%
Inconclusive result (value mean confidence interval includes 0).

total fills in shared programs: 4525 -> 4523 (-0.04%)
fills in affected programs: 48 -> 46 (-4.17%)
helped: 1
HURT: 0

Ivy Bridge
total instructions in shared programs: 11858995 -> 11858898 (<.01%)
instructions in affected programs: 10822 -> 10725 (-0.90%)
helped: 25
HURT: 13
helped stats (abs) min: 1 max: 17 x̄: 5.32 x̃: 2
helped stats (rel) min: 0.40% max: 5.00% x̄: 2.16% x̃: 1.85%
HURT stats (abs)   min: 1 max: 15 x̄: 2.77 x̃: 2
HURT stats (rel)   min: 0.47% max: 2.90% x̄: 1.83% x̃: 2.15%
95% mean confidence interval for instructions value: -4.66 -0.45
95% mean confidence interval for instructions %-change: -1.54% -0.05%
Instructions are helped.

total cycles in shared programs: 177947023 -> 177946880 (<.01%)
cycles in affected programs: 822075 -> 821932 (-0.02%)
helped: 157
HURT: 175
helped stats (abs) min: 1 max: 164 x̄: 13.17 x̃: 4
helped stats (rel) min: 0.03% max: 6.72% x̄: 0.64% x̃: 0.17%
HURT stats (abs)   min: 1 max: 308 x̄: 11.00 x̃: 4
HURT stats (rel)   min: 0.03% max: 9.76% x̄: 0.70% x̃: 0.18%
95% mean confidence interval for cycles value: -3.86 3.00
95% mean confidence interval for cycles %-change: -0.09% 0.22%
Inconclusive result (value mean confidence interval includes 0).

total spills in shared programs: 4185 -> 4188 (0.07%)
spills in affected programs: 146 -> 149 (2.05%)
helped: 0
HURT: 1

total fills in shared programs: 5248 -> 5249 (0.02%)
fills in affected programs: 347 -> 348 (0.29%)
helped: 0
HURT: 1

Sandy Bridge
total instructions in shared programs: 10680224 -> 10680144 (<.01%)
instructions in affected programs: 4702 -> 4622 (-1.70%)
helped: 15
HURT: 3
helped stats (abs) min: 1 max: 17 x̄: 5.53 x̃: 5
helped stats (rel) min: 0.39% max: 4.76% x̄: 2.17% x̃: 1.67%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 0.52% max: 0.52% x̄: 0.52% x̃: 0.52%
95% mean confidence interval for instructions value: -7.24 -1.65
95% mean confidence interval for instructions %-change: -2.55% -0.89%
Instructions are helped.

total cycles in shared programs: 152988780 -> 152985691 (<.01%)
cycles in affected programs: 1072850 -> 1069761 (-0.29%)
helped: 168
HURT: 145
helped stats (abs) min: 1 max: 592 x̄: 33.90 x̃: 12
helped stats (rel) min: 0.02% max: 10.73% x̄: 0.90% x̃: 0.31%
HURT stats (abs)   min: 1 max: 259 x̄: 17.98 x̃: 6
HURT stats (rel)   min: 0.02% max: 8.17% x̄: 0.77% x̃: 0.19%
95% mean confidence interval for cycles value: -17.95 -1.79
95% mean confidence interval for cycles %-change: -0.34% 0.08%
Inconclusive result (%-change mean confidence interval includes 0).

Iron Lake and GM45 had similar results. (Iron Lake shown)
total instructions in shared programs: 8107033 -> 8107025 (<.01%)
instructions in affected programs: 696 -> 688 (-1.15%)
helped: 5
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.60 x̃: 2
helped stats (rel) min: 0.34% max: 7.14% x̄: 3.47% x̃: 4.65%
95% mean confidence interval for instructions value: -2.28 -0.92
95% mean confidence interval for instructions %-change: -7.22% 0.28%
Inconclusive result (%-change mean confidence interval includes 0).

total cycles in shared programs: 188348526 -> 188348404 (<.01%)
cycles in affected programs: 33618 -> 33496 (-0.36%)
helped: 23
HURT: 0
helped stats (abs) min: 2 max: 12 x̄: 5.30 x̃: 6
helped stats (rel) min: 0.05% max: 1.83% x̄: 0.47% x̃: 0.51%
95% mean confidence interval for cycles value: -6.70 -3.91
95% mean confidence interval for cycles %-change: -0.64% -0.30%
Cycles are helped.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/1359>
2020-02-10 18:37:36 -08:00
Ian Romanick 21f0d020fe nir: Add new instructions for INTEL_shader_integer_functions2
uctz isn't added because it will implemented in the GLSL path and the
SPIR-V path using other pre-existing instructions.

v2: Avoid signed integer overflow for uabs_isub(0, INT_MIN).  Noticed by
Caio.

v3: Alternate fix for signed integer overflow for abs_sub(0, INT_MIN).
I tried the previous methon in a small test program with -ftrapv, and it
failed.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/767>
2020-01-23 00:18:57 +00:00
Rob Clark a8ec4082a4 nir+vtn: vec8+vec16 support
This introduces new vec8 and vec16 instructions (which are the only
instructions taking more than 4 sources), in order to construct 8 and 16
component vectors.

In order to avoid fixing up the non-autogenerated nir_build_alu() sites
and making them pass 16 src args for the benefit of the two instructions
that take more than 4 srcs (ie vec8 and vec16), nir_build_alu() is has
nir_build_alu_tail() split out and re-used by nir_build_alu2() (which is
used for the > 4 src args case).

v2 (Karol Herbst):
  use nir_build_alu2 for vec8 and vec16
  use python's array multiplication syntax
  add nir_op_vec helper
  simplify nir_vec
  nir_build_alu_tail -> nir_builder_alu_instr_finish_and_insert
  use nir_build_alu for opcodes with <= 4 sources
v3 (Karol Herbst):
  fix nir_serialize
v4 (Dave Airlie):
  fix serialization of glsl_type
  handle vec8/16 in lowering of bools
v5 (Karol Herbst):
  fix load store vectorizer

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-12-21 11:00:17 +00:00
Neil Roberts 634eb9c04b nir: Add a 8-bit bool type
Adds nir_type_bool8 as well as 8-bit versions of all the bool
opcodes.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts 0f5640c577 nir: Add a 16-bit bool type
Adds nir_type_bool16 as well as 16-bit versions of all the bool
opcodes.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts 2ec97e78a9 nir/opcodes: Add a helper function to generate reduce opcodes
Adds binop_reduce_all_sizes which generates both 1-bit and 32-bit
versions of the reduce operation. This reduces the code duplication a
bit and will make it easier to later add 16-bit versions as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Neil Roberts 9a96afb97e nir/opcodes: Add a helper function to generate the comparison binops
Adds binop_compare_all_sizes which generates both 1-bit and 32-bit
versions of the comparison operation. This reduces the code
duplication a bit and will make it easier to later add 16-bit versions
as well.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-20 14:09:43 +01:00
Rob Clark 6320e37d4b nir: add amul instruction
Used for address/offset calculation (ie. array derefs), where we can
potentially use less than 32b for the multiply of array idx by element
size.  For backends that support `imul24`, this gives a lowering pass
an easy way to find multiplies that potentially can be converted to
`imul24`.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Rob Clark 0568761f8e nir: Add a new ALU nir_op_imul24
Some hardware can do 24b multiply in a single instruction, but not 32b.
However in most cases 24b is sufficient for address/offset calculation.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Eduardo Lima Mitev 32e5fbf47c nir: Add a new ALU nir_op_imad24_ir3
ir3 compiler has a signed integer multiply-add instruction (MAD_S24)
that is used for different offset calculations in the backend.
Since we intend to move some of these calculations to NIR, we need
a new ALU op that can directly represent it.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2019-10-18 15:08:54 -07:00
Andres Gomez d9760f8935 nir/opcodes: Clear variable names confusion
Having Python and C variables sharing name in the same block of code
makes its understanding a bit confusing. Make it explicit that the
Python bit_size variable refers to the destination bit size.

Suggested-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-09-18 23:59:07 +03:00
Samuel Iglesias Gonsálvez dba4d0a319 nir: fix fmin/fmax support for doubles
Until now, it was using the floating point version of fmin/fmax,
instead of the double version.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez 1e0e3ed15a nir: fix denorms in unpack_half_1x16()
According to VK_KHR_shader_float_controls:

"Denormalized values obtained via unpacking an integer into a vector
 of values with smaller bit width and interpreting those values as
 floating-point numbers must: be flushed to zero, unless the entry
 point is declared with the code:DenormPreserve execution mode."

v2:
- Add nir_op_unpack_half_2x16_flush_to_zero opcode (Connor).

v3:
- Adapt to use the new NIR lowering framework (Andres).

v4:
- Updated to renamed shader info member and enum values (Andres).

v5:
- Simplify flags logic operations (Caio).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v2]
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez ef681cf971 nir/opcodes: make sure f2f16_rtz and f2f16_rtne behavior is not overriden by the float controls execution mode
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez 7580707345 nir: mind rounding mode on fadd, fsub, fmul and fma opcodes
According to Vulkan spec, the new execution modes affect only
correctly rounded SPIR-V instructions, which includes fadd, fsub and
fmul.

v2:
- Fix fmul, fsub and fadd round-to-zero definitions, they should use
  auxiliary functions to calculate the proper value because Mesa uses
  round-to-nearest-even rounding mode by default (Connor).

v3:
- Do an actual fused multiply-add at ffma (Connor).

v4:
- Simplify fadd and fmul for bit sizes < 64 (Connor).
- Do not use double ffma for 32 bits float (Connor).

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com> [v3]
2019-09-17 23:39:18 +03:00
Samuel Iglesias Gonsálvez 0ac07c7ca7 nir: add support for round to zero rounding mode to nir_op_f2f32
f2f16's rounding modes are already handled and f2f64 don't need it
as there is not a floating point type with higher bit size than 64 for
now.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-09-17 23:39:18 +03:00
Erico Nunes 4a407df682 nir/algebraic: add new fsum ops and fdot lowering
The Mali400 pp doesn't implement fdot but has fsum3 and fsum4, which can
be used to optimize fdot lowering. fsum2 is not implemented and can be
further lowered to an add with the vector components.
Currently lima ppir handles this lowering internally, however this
happens in a very late stage and requires a big chunk of code compared
to a nir_opt_algebraic lowering.
By having fsum in nir, we can reduce ppir complexity and enable the
lowered ops to be part of other nir optimizations in the optimization
loop.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-07-31 21:35:58 +02:00
Sagar Ghuge 81d342e2a1 nir: Add urol and uror opcodes
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-01 10:14:22 -07:00
Jonathan Marek a70ff70158 nir: remove fnot/fxor/fand/for opcodes
There doesn't seem to be any reason to keep these opcodes around:
* fnot/fxor are not used at all.
* fand/for are only used in lower_alu_to_scalar, but easily replaced

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-06-26 15:26:10 -04:00
Daniel Schürmann a8b0b6e52b nir: introduce lowering of bitfield_insert to bfm and a new opcode bitfield_select.
bitfield_select is defined as:
bitfield_select(mask, base, insert) = (mask & base) | (~mask & insert)
matching the behavior of AMD's BFI instruction.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Daniel Schürmann 165b7f3a44 nir: define behavior of nir_op_bfm and nir_op_u/ibfe according to SM5 spec.
That is: the five least significant bits provide the values of
'bits' and 'offset' which is the case for all hardware currently
supported by NIR and using the bfm/bfe instructions.
This patch also changes the lowering of bitfield_insert/extract
using shifts to not use bfm and removes the flag 'lower_bfm'.

Tested-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2019-06-24 18:42:20 +02:00
Eduardo Lima Mitev c27b3758fa nir/opcodes: Add new 'umul_low' and 'imadsh_mix16' opcodes
'umul_low' is the low 32-bits of unsigned integer multiply. It maps
directly to ir3's MULL_U.

'imadsh_mix16' is multiply add with shift and mix, an ir3 specific
instruction that maps directly to ir3's IMADSH_M16.

Both are necessary for the lowering of integer multiplication on
Freedreno, which will be introduced later in this series.

Reviewed-by: Eric Anholt <eric@anholt.net>
2019-06-07 08:45:05 +02:00
Jason Ekstrand f2dc0f2872 nir: Drop imov/fmov in favor of one mov instruction
The difference between imov and fmov has been a constant source of
confusion in NIR for years.  No one really knows why we have two or when
to use one vs. the other.  The real reason is that they do different
things in the presence of source and destination modifiers.  However,
without modifiers (which many back-ends don't have), they are identical.
Now that we've reworked nir_lower_to_source_mods to leave one abs/neg
instruction in place rather than replacing them with imov or fmov
instructions, we don't need two different instructions at all anymore.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Acked-by: Rob Clark <robdclark@chromium.org>
2019-05-24 08:38:11 -05:00
Ian Romanick 7b4ff6a1af nir: Mark ffma as 2src_commutative
This doesn't make any real difference now, but future work (not in this
series) will add a LOT of ffma patterns.  Having to duplicate all of
them for ffma(a, b, c) and ffma(b, a, c) is just terrible.

No shader-db changes on any Intel platform.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-05-14 11:25:02 -07:00
Ian Romanick ede45bf9cf nir: Rename commutative to 2src_commutative
The meaning of the new name is that the first two sources are
commutative.  Since this is only currently applied to two-source
operations, there is no change.

A future change will mark ffma as 2src_commutative.

It is also possible that future work will add 3src_commutative for
opcodes like fmin3.

v2: s/commutative_2src/2src_commutative/g.  I had originally considered
this, but I discarded it because I did't want to deal with identifiers
that (should) start with 2.  Jason suggested it in review, so we decided
that _2src_commutative would be used in nir_opcodes.py.  Also add some
comments documenting what 2src_commutative means.  Also suggested by
Jason.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-05-14 11:25:02 -07:00
Ian Romanick 85e6865ff6 nir: Saturating integer arithmetic is not associative
In 8-bits,

    iadd_sat(iadd_sat(0x7f, 0x7f), -1) =
    iadd_sat(0x7f, -1) =
    0x7e

but,

    iadd_sat(0x7f, iadd_sat(0x7f, -1)) =
    iadd_sat(0x7f, 0x7e) =
    0x7f

Fixes: 272e927d0e ("nir/spirv: initial handling of OpenCL.std extension opcodes")
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-05-01 09:07:47 -07:00
Kristian H. Kristensen 41593f3c37 nir_opcodes.py: Saturate to expression that doesn't overflow
Compiler warns about overflow when assigning UINT64_MAX to something
smaller than a uin64_t:

src/compiler/nir/nir_constant_expressions.c:16909:50: warning: implicit conversion from 'unsigned long long' to 'uint1_t' (aka 'unsigned char') changes value from 18446744073709551615 to 255 [-Wconstant-conversion]
            uint1_t dst = (src0 + src1) < src0 ? UINT64_MAX : (src0 + src1);
                    ~~~                          ^~~~~~~~~~

Shift UINT64_MAX down to the appropriate maximum value for the type
being assigned to.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-04-19 16:17:37 +00:00
Rhys Perry 8671cfe2a2 nir,ac/nir: fix cube_face_coord
Seems it was missing the "/ ma + 0.5" and the order was swapped.

Fixes: a1a2a8dfda ('nir: add AMD_gcn_shader extended instructions')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-15 17:22:47 +01:00
Samuel Pitoiset 6ae5797243 nir: use generic float types for frexp_exp and frexp_sig
Only the exponent needs to be 32-bit signed integer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-22 19:41:44 +01:00
Iago Toral Quiroga ca2b5e9069 compiler/nir: add an is_conversion field to nir_op_info
This is set to True only for numeric conversion opcodes.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-03-06 17:24:57 +00:00
Karol Herbst 272e927d0e nir/spirv: initial handling of OpenCL.std extension opcodes
Not complete, mostly just adding things as I encounter them in CTS. But
not getting far enough yet to hit most of the OpenCL.std instructions.

Anyway, this is better than nothing and covers the most common builtins.

v2: add hadd proof from Jason
    move some of the lowering into opt_algebraic and create new nir opcodes
    simplify nextafter lowering
    fix normalize lowering for inf
    rework upsample to use nir_pack_bits
    add missing files to build systems
v3: split lines of iadd/sub_sat expressions

Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-05 22:28:29 +01:00
Sagar Ghuge e551040c60 nir/glsl: Add another way of doing lower_imul64 for gen8+
On Gen 8 and 9, "mul" instruction supports 64 bit destination type. We
can reduce our 64x64 int multiplication from 4 instructions to 3.

Also instead of emitting two mul instructions, we can emit single mul
instuction and extract low/high 32 bits from 64 bit result for
[i/u]mulExtended

v2: 1) Allow lower_mul_high64 to use new opcode (Jason Ekstrand)
    2) Add lower_mul_2x32_64 flag (Matt Turner)
    3) Remove associative property as bit size is different (Connor
       Abbott)

v3: Fix indentation and variable naming convention (Jason Ekstrand)

Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-03-04 15:50:25 -08:00
Daniel Schürmann 0525bdc225 nir: Define shifts according to SM5 specification.
SPIR-V shifts are undefined for values >= bitsize, but SM5 shifts
are defined to only use the least significant bits.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-02-25 12:59:43 -06:00
Jason Ekstrand 191a1dce92 nir: Add 1-bit Boolean opcodes
We also have to add support for 1-bit integers while we're here so we
get 1-bit variants of iand, ior, and inot.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Jason Ekstrand 80e8dfe9de nir: Rename Boolean-related opcodes to include 32 in the name
This is a squash of a bunch of individual changes:

    nir/builder: Generate 32-bit bool opcodes transparently

    nir/algebraic: Remap Boolean opcodes to the 32-bit variant

    Use 32-bit opcodes in the NIR producers and optimizations

        Generated with a little hand-editing and the following sed commands:

        sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
        sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
        sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
        sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
        sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
        sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c

     Use 32-bit opcodes in the NIR back-ends

        Generated with a little hand-editing and the following sed commands:

        sed -i 's/nir_op_ball_fequal/nir_op_b32all_fequal/g' **/*.c
        sed -i 's/nir_op_bany_fnequal/nir_op_b32any_fnequal/g' **/*.c
        sed -i 's/nir_op_ball_iequal/nir_op_b32all_iequal/g' **/*.c
        sed -i 's/nir_op_bany_inequal/nir_op_b32any_inequal/g' **/*.c
        sed -i 's/nir_op_\([fiu]lt\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ge\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]ne\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fiu]eq\)/nir_op_\132/g' **/*.c
        sed -i 's/nir_op_\([fi]\)ne32g/nir_op_\1neg/g' **/*.c
        sed -i 's/nir_op_bcsel/nir_op_b32csel/g' **/*.c

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-16 21:03:02 +00:00
Ian Romanick 090e282407 nir: Add a saturated unsigned integer add opcode
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-12-13 17:49:48 +00:00
Jason Ekstrand 9525971e2b nir: Allow [iu]mul_high on non-32-bit types
Reviewed-by: Ian Romanick ian.d.romanick@intel.com
2018-12-13 17:49:48 +00:00
Jason Ekstrand dca6cd9ce6 nir: Make boolean conversions sized just like the others
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one if 8, 16, 32, or 64.  This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:03:07 -06:00
Jason Ekstrand 85f0ea9d8f nir/opcodes: Rename tbool to tbool32
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:49 -06:00
Jason Ekstrand 03571a7a6c nir/opcodes: Pull in the type helpers from constant_expressions
While we're at it, we rework them a bit to all use regular expressions
and assert more.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2018-12-05 15:02:06 -06:00
Jason Ekstrand 80c424148b nir/opcodes: Make unpack_half_2x16_split_* variable-width
There is nothing inherent about these opcodes that requires them to only
take scalars.  It's very convenient if we let them take vectors as well.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2018-08-29 14:04:02 -05:00
Karol Herbst 7f95564a22 nir: rename f2f16_undef to f2f16
we need rounding modes on other conversions involving floats and it is easier
to rename f2f16_undef than renaming all the other ones.

v2: rebased on master

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Rob Clark <robdclark@gmail.com>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
2018-07-24 20:40:05 +02:00
Mathieu Bridon 9ebd8372b9 python: Use range() instead of xrange()
Python 2 has a range() function which returns a list, and an xrange()
one which returns an iterator.

Python 3 lost the function returning a list, and renamed the function
returning an iterator as range().

As a result, using range() makes the scripts compatible with both Python
versions 2 and 3.

Signed-off-by: Mathieu Bridon <bochecha@daitauha.fr>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-07-24 11:07:04 -07:00
Iago Toral Quiroga c9653cc14c nir: add opcodes for 16-bit packing and unpacking
Noitice that we don't need 'split' versions of the 64-bit to / from
16-bit opcodes which we require during pack lowering to implement these
operations. This is because these operations can be expressed as a
collection of 32-bit from / to 16-bit and 64-bit to / from 32-bit
operations, so we don't need new opcodes specifically for them.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2018-05-03 11:40:26 +02:00
Daniel Schürmann 4d802df3aa nir: subgroups instructions for 64bit ballot sizes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-04-14 00:52:22 +02:00
Dave Airlie 3e830a1af2 nir: add support for min/max/median of 3 srcs
These are needed for SPV_AMD_shader_trinary_minmax,
the AMD HW supports these.

Co-authored-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-03-29 01:28:58 +02:00
Timothy Arceri cca2141745 nir: add frexp_exp and frexp_sig opcodes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-22 12:42:34 +11:00
Daniel Schürmann a1a2a8dfda nir: add AMD_gcn_shader extended instructions
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-03-07 23:09:58 +01:00
Timothy Arceri 0c1f37cc2d nir: fix interger divide by zero crash during constant folding
From the GLSL 4.60 spec Section 5.9 (Expressions):

   "Dividing by zero does not cause an exception but does result in
    an unspecified value."

Fixes: 89285e4d47 "nir: add new constant folding infrastructure"

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105271
2018-02-28 15:55:39 +11:00
James Legg 947470d10b nir/opcodes: Fix constant-folding of bitfield_insert
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104119
CC: <mesa-stable@lists.freedesktop.org>
CC: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2017-12-07 08:59:36 +00:00
Jose Maria Casanova Crespo 1f440d00d2 nir: Handle fp16 rounding modes at nir_type_conversion_op
nir_type_conversion enables new operations to handle rounding modes to
convert to fp16 values. Two new opcodes are enabled nir_op_f2f16_rtne
and nir_op_f2f16_rtz.

The undefined behaviour doesn't has any effect and uses the original
nir_op_f2f16 operation.

v2: Indentation fixed (Jason Ekstrand)

v3: Use explicit case for undefined rounding and assert if
    rounding mode is used for non 16-bit float conversions
    (Jason Ekstrand)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-06 08:57:18 +01:00
Jason Ekstrand a0947921eb nir/opcodes: Fix constant-folding of ufind_msb
We didn't fold correctly in the case of 0x1 because we never let the
loop counter hit 0.  Switching it to bit >= 0 solves this problem.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2017-10-12 22:39:29 -07:00
Matt Turner 50e4099edf nir: Remove series of unnecessary conversions
Clang warns:

warning: absolute value function 'fabsf' given an argument of type
'const float64_t' (aka 'const double') but has parameter of type 'float'
which may cause truncation of value [-Wabsolute-value]

            float64_t dst = bit_size == 64 ? fabs(src0) : fabsf(src0);

The type of the ternary expression will be the common type of fabs() and
fabsf(): double. So fabsf(src0) will be implicitly converted to double.
We may as well just convert src0 to double before a call to fabs() and
remove the needless complexity, à la

            float64_t dst = fabs(src0);

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-08-29 15:20:57 -07:00
Juan A. Suarez Romero 4195a9450b nir: sge operation is defined for floating-point types
According to GLSL.std.450 spec, the operand for step() function must be
a floating-point. It does not restrict the value to 32-bit floats.

Reviewed by: Elie Tournier <elie.tournier@collabora.com>
2017-06-27 12:01:11 +02:00
Jason Ekstrand fbcf92a278 nir: Add support for 8 and 16-bit types
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>
2017-03-30 11:34:45 -07:00
Vinson Lee 1fa432741c nir: Add positional argument specifiers.
Fix build with Python < 2.7.

  File "src/compiler/nir/nir_builder_opcodes_h.py", line 46, in <module>
    from nir_opcodes import opcodes
  File "src/compiler/nir/nir_opcodes.py", line 178, in <module>
    unop_convert("{}2{}{}".format(src_t[0], dst_t[0], bit_size),
ValueError: zero length field name in format

Fixes: 762a6333f2 ("nir: Rework conversion opcodes")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2017-03-21 13:38:00 -07:00
Jason Ekstrand 762a6333f2 nir: Rework conversion opcodes
The NIR story on conversion opcodes is a mess.  We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random.  This commit re-organizes things and makes them all
consistent:

 - All non-bool conversion opcodes now have the explicit size in the
   destination and are named <src_type>2<dst_type><size>.

 - Integer <-> integer conversion opcodes now only come in i2i and u2u
   forms (i2u and u2i have been removed) since the only difference
   between the different integer conversions is whether or not they
   sign-extend when up-converting.

 - Boolean conversion opcodes all have the explicit size on the bool and
   are named <src_type>2<dst_type>.

Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako.  This will make adding
int8, int16, and float16 versions much easier when the time comes.

Reviewed-by: Eric Anholt <eric@anholt.net>
2017-03-14 07:36:40 -07:00
Emil Velikov e4c7911150 nir: remove shebang from python scripts
Analogous to earlier commit(s).

Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-03-10 14:12:47 +00:00
Jason Ekstrand 161d3e81be nir: Combine the int and double [un]pack opcodes
NIR is a typeless IR and the two opcodes, when considered bitwise, do
exactly the same thing.  There's no reason to have two versions.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2017-02-16 17:28:03 -08:00
Samuel Iglesias Gonsálvez 824e1bb078 nir: add opcode to perform int64 to bool conversions
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2017-02-09 10:18:34 +01:00
Ian Romanick fda33e09d8 nir: Shift count for shift opcodes is always 32-bits
Previously both sources were unsized.  This caused problems when the
thing being shifted was 64-bit but the shift count was 32-bit.  The
expectation in NIR is that all unsized sources (and destination) will
ultimately have the same size.

The changes in nir_opt_algebraic.py are to prevent errors like:

 Failed to parse transformation:
03:12:25   (('extract_i8', 'a', 'b'), ('ishr', ('ishl', 'a', ('imul', ('isub', 3, 'b'), 8)), 24), 'options->lower_extract_byte')
03:12:25 Traceback (most recent call last):
03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 610, in __init__
03:12:25     xform = SearchAndReplace(xform)
03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 495, in __init__
03:12:25     BitSizeValidator(varset).validate(self.search, self.replace)
03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 311, in validate
03:12:25     validate_dst_class = self._validate_bit_class_up(replace)
03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 414, in _validate_bit_class_up
03:12:25     src_class = self._validate_bit_class_up(val.sources[i])
03:12:25   File "/home/jenkins/workspace/Leeroy_2/repos/mesa/src/compiler/nir/nir_algebraic.py", line 420, in _validate_bit_class_up
03:12:25     assert src_class == src_type_bits
03:12:25 AssertionError

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Suggested-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Cc: Jason Ekstrand <jason@jlekstrand.net>
2017-01-20 15:41:23 -08:00
Ian Romanick 3460d05a71 nir: Add 64-bit integer support for conversions and bitcasts
v2 (idr): "cut them down later" => Remove ir_unop_b2u64 and
ir_unop_u642b.  Handle these with extra i2u or u2i casts just like
uint(bool) and bool(uint) conversion is done.

v3 (idr): Make the "from" type in a cast unsized.  This reduces the
number of required cast operations at the expensive slightly more
complex code.  However, this will be a dramatic improvement when other
sized integer types are added.  Suggested by Connor.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2017-01-20 15:41:23 -08:00
Ilia Mirkin 8c8874eafb nir: fix definition of pack_uvec2_to_uint
Found by inspection. Untested beyond compilation. This also matches the
logic used in nir_lower_alu_to_scalar.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Cc: mesa-stable@lists.freedesktop.org
2016-09-06 22:45:44 -04:00
Iago Toral Quiroga 38b719d624 nir: handle double-precision in fsign, fsat, fnot and frcp
I think these are not strictly necessary since the floats in them
should be automatically promoted to doubles when operated with
double sources, but it makes things more explicit at least.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-23 08:54:37 +02:00
Iago Toral Quiroga 3f73039ade nir: handle double-precision in fabs, frsq and fsqrt
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-05-23 08:54:28 +02:00
Rob Clark 8b24f7b440 nir: fix comment typo about f2d/d2f
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-15 17:25:47 -04:00
Jason Ekstrand f0af5b87ec nir/opcodes: Make ldexp take an explicitly 32-bit int
There is no sense in having the double version of ldexp take a 64-bit
integer.  Instead, let's just take a 32-bit int all the time.  This also
matches what GLSL does where both variants of ldexp take a regular integer
for the exponent argument.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-28 21:36:52 -07:00
Jason Ekstrand bee40dd730 nir/opcodes: Simplify the expressions for [un]pack_double
The new expressions are more explicit in terms of where the bits go so it's
a little easier to tell what's going on.  This is the way GLSL specifies
things so it's a bit easier to verify too.  It also has the benifit that
the new expressions easily vectorize so we can constant-fold vector forms
of the _split versions correctly.

Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
2016-04-28 21:36:52 -07:00
Jason Ekstrand 745b3d295e nir: Add more modulus opcodes
These are all needed for SPIR-V

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2016-04-13 15:44:00 -07:00
Connor Abbott 663e6421df nir: add split versions of (un)pack_double_2x32
v2 (Sam):
- Use uint64 instead of float64 for sources and destinations. (Connor)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Connor Abbott 9e31e0a21b nir: add support for (un)pack_double_2x32
v2 (Sam):
- Use uint64 instead of float64 for sources and destinations. (Connor)

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Iago Toral Quiroga d5d6260329 nir: add i2d and u2d opcodes
v2:
- Assert supports_int and don't fallback to nir_fmov (Jason)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Iago Toral Quiroga b16d06252e nir: add d2i, d2u, d2b opcodes
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Connor Abbott a4bce07dc6 nir: add support for d2f and f2d
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-11 08:29:27 +02:00
Jason Ekstrand de60e250f5 nir: Add an opcode for stomping a 32-bit value to 16-bit precision
This correlates directly to the SPIR-V opcode OpQuantizeToF16

Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-01 13:52:28 -07:00
Connor Abbott 9076c4e289 nir: update opcode definitions for different bit sizes
Some opcodes need explicit bitsizes, and sometimes we need to use the
double version when constant folding.

v2: fix output type for u2f (Iago)

v3: do not change vecN opcodes to be float. The next commit will add
    infrastructure to enable 64-bit integer constant folding so this is isn't
    really necessary. Also, that created problems with source modifiers in
    some cases (Iago)

v4 (Jason):
  - do not change bcsel to work in terms of floats
  - leave ldexp generic

Squashed changes to handle different bit sizes when constant
folding since otherwise we would break the build.

v2:
- Use the bit-size information from the opcode information if defined (Iago)
- Use helpers to get type size and base type of nir_alu_type enum (Sam)
- Do not fallback to sized types to guess bit-size information. (Jason)

Squashed changes in i965 and gallium/nir drivers to support sized types.
These functions should only see sized types, but we can't make that change
until we make sure that nir uses the sized versions in all the relevant places.
A later commit will address this.

Signed-off-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-03-17 11:16:33 +01:00
Matt Turner 9b8786eba9 nir: Add lowering support for packing opcodes.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-01 10:43:57 -08:00
Matt Turner 68f8c5730b nir: Add opcodes to extract bytes or words.
The uint versions zero extend while the int versions sign extend.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-01 10:43:57 -08:00
Matt Turner 140a886c41 nir: Make argument order of unop_convert match binop_convert.
Strangely the return and parameter types were reversed.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2016-02-01 10:43:57 -08:00
Emil Velikov a39a8fbbaa nir: move to compiler/
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Matt Turner <mattst88@gmail.com>
Acked-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-26 16:08:30 +00:00
Renamed from src/glsl/nir/nir_opcodes.py (Browse further)