KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Rhys Perry	bb0415b697	nir: allow 16-bit fsin_amd/fcos_amd Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587>	2022-07-07 22:18:08 +00:00
Rhys Perry	69d21a3dee	nir: rename fsin_r600/fcos_r600 to fsin_amd/fcos_amd GCN has better range, but constant folding is the same. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10587>	2022-07-07 22:18:08 +00:00
Iago Toral Quiroga	84a0dca9df	nir: fix documentation for uadd_carry and usub_borry opcodes These opcodes where fixed to return an integer instead of a boolean value some time ago but the documentation for them was not updated and still talked about a boolean result. Fixes: `b0d4ee520` ('nir/opcodes: Fix up uadd_carry and usub_borrow') Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17372>	2022-07-07 09:16:24 +00:00
Ian Romanick	fd1f2d3b5a	nir: Add and use algebraic property "is selection" There are several places that should have supported the various sized versions of bcsel and the various nir_op_[fi]csel_* opcodes. Rather than enumerate the whole list, add a property. v2: Make the comment for NIR_OP_IS_SELECTION more descriptive. Suggested by Jason. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>	2022-06-22 19:26:59 +00:00
Ian Romanick	ccd18ec4f3	nir: i32csel opcodes should compare with integer zero Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Noticed-by: Georg Lehmann <dadschoorse@gmail.com> Fixes: `0f5b3c37c5` ("nir: Add opcodes for fused comp + csel and optimizations") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>	2022-06-22 19:26:59 +00:00
Jason Ekstrand	4b67d70d22	nir: Fix constant folding for non-32-bit ifind_msb and clz Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16348>	2022-05-10 03:37:44 +00:00
Jason Ekstrand	5c9e4d400a	nir/opcodes: fisfinite32 should return bool32 Otherwise constant-folding will fold it to 0/1 instead of 0/~0. Fixes: `330e28155f` ("nir: add 32-bit bool of fisfinite") Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15984>	2022-04-16 02:46:12 +00:00
Samuel Pitoiset	6532307555	nir: introduce nir_pack_{sint,uint}_2x16 instructions These instructions have AMD hardware equivalent and they will be used to lower fragment shader outputs in NIR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>	2022-03-04 08:06:56 +00:00
Emma Anholt	b1f349dff4	nir: Allow the _replicates opcodes to have num_components != 4. This required relaxing a core NIR assertion which I don't think is doing any important validation. The shader-db effects here are small, but they're important for avoiding a regression when we start doing per-component DCE in opt_shrink_vectors (https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468) softpipe shader-db: total instructions in shared programs: 2859777 -> 2859454 (-0.01%) instructions in affected programs: 18881 -> 18558 (-1.71%) total temps in shared programs: 293994 -> 293914 (-0.03%) temps in affected programs: 418 -> 338 (-19.14%) i915g: total instructions in shared programs: 407562 -> 407544 (<.01%) instructions in affected programs: 570 -> 552 (-3.16%) r300: total instructions in shared programs: 1414450 -> 1414459 (<.01%) instructions in affected programs: 44494 -> 44503 (0.02%) total vinst in shared programs: 473782 -> 473727 (-0.01%) vinst in affected programs: 1102 -> 1047 (-4.99%) total sinst in shared programs: 231224 -> 231216 (<.01%) sinst in affected programs: 432 -> 424 (-1.85%) total temps in shared programs: 197605 -> 197607 (<.01%) temps in affected programs: 103 -> 105 (1.94%) crocus hsw: total instructions in shared programs: 8158185 -> 8158134 (<.01%) instructions in affected programs: 10927 -> 10876 (-0.47%) Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15178>	2022-02-25 12:31:48 -08:00
Ian Romanick	38800b385c	nir: All set-on-comparison opcodes can take all float types Extend `4195a9450b` so that the next poor fool doesn't come along and say, "sge does the right thing for 16-bit sources, but slt gives a NIR validation failure. What the deuce?" NOTE: This commit is necessary to prevent regressions in GLSLstd450Step tests of 16-bit sources at "spriv: Produce correct result for GLSLstd450Step with NaN". Fixes: `4195a9450b` ("nir: sge operation is defined for floating-point types") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>	2022-02-10 18:15:39 +00:00
Rhys Perry	7f05ea3793	nir: add nir_op_fmulz and nir_op_ffmaz Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13436>	2022-01-20 22:54:42 +00:00
Samuel Pitoiset	011ea32585	nir: fix constant expression of ibitfield_extract This fixes dEQP-VK.graphicsfuzz.cov-condition-bitfield-extract-integer. For example, nir_ibitfield_extract(3, 1, 2) should return 1. Cc: 21.3 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13791>	2021-11-16 17:32:21 +00:00
Alyssa Rosenzweig	3e8f540753	nir: Add Mali-specific derivative opcodes Add derivative opcodes fddx_must_abs_mali/fddy_must_abs_mali satisfying: fabs(fdd_must_abs_mali(v)) = fabs(fdd(v)) The sign of their result is undefined. On Bifrost and Valhall, these unsigned derivatives can be implemented more efficiently than the correctly-signed counterparts, since the sign fixup requires extra ALU instructions. On backends where this is the case, it is useful to optimize fabs(fdd(v)) to fabs(fdd_must_abs_mali(v)). This pattern comes up with the GLSL builtin `fwidth`. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12332>	2021-10-06 00:40:57 +00:00
Rhys Perry	41ecef7855	nir: add sdot_2x16 and udot_2x16 opcodes Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12617>	2021-09-03 13:21:27 +00:00
Timur Kristóf	33630090a2	nir: Add comment to explain the sad_u8x4 opcode. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12649>	2021-09-01 08:42:03 +00:00
Filip Gawin	46f3582c6f	nir: fix ifind_msb_rev by using appropriate type As you can see comparion "x < 0" doesn't make sense if x is unsigned. Fixes: `a5747f8a` ("nir: add opcodes for *find_msb_rev and lowering ") Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12548>	2021-08-26 18:35:31 +00:00
Ian Romanick	6c18a3b497	nir/opcodes: Add integer dot-product opcodes Six opcodes are added: sdot_4x8_iadd, udot_4x8_uadd, sudot_4x8_iadd, sdot_4x8_iadd_sat, udot_4x8_uadd_sate, and sudot_4x8_iadd_sat. These represent the combinations of integer dot-product and add that operate on packed source vectors. That is, the four 8-bit values for each vector is stored in a single 32-bit integer. Some hardware may prefer to operate on unpacked byte vectors. When such hardware comes to Mesa, we'll have to figure out how to name things. v2: Add nir_op_iudp4a and nir_op_iudp4a_sat instructions. These opcodes are not 2-source commutative. v3: Rename all opcodes to be more like some existing 4x8 opcodes. Suggested by Timur. Change type of packed vector sources to uint32, change types of constant folding variables to have explicit size, and delete some extra casts. All suggested by Jason. v4: Fix typo previously noticed by Alyssa but missed in v2. v5: Add has_sudot_4x8 flag. Requested by Rhys. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>	2021-08-24 19:58:57 +00:00
Ian Romanick	f0a8a9816a	nir: intel/compiler: Add and use nir_op_pack_32_4x8_split A lot of CTS tests write a u8vec4 or an i8vec4 to an SSBO. This results in a lot of shifts and MOVs. When that pattern can be recognized, the individual 8-bit components can be packed much more efficiently. v2: Rebase on `b4369de27f` ("nir/lower_packing: use shader_instructions_pass") Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>	2021-08-18 22:03:37 +00:00
Vinson Lee	8d679f4f4e	nir: Initialize evaluate_cube_face_index_amd dst.x. Fix defect reported by Coverity Scan. Uninitialized scalar variable (UNINIT) uninit_use: Using uninitialized value dst.x. Fixes: `a1a2a8dfda` ("nir: add AMD_gcn_shader extended instructions") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12290>	2021-08-12 23:13:52 -07:00
Ian Romanick	3ba66ebbc8	nir/opcodes: Use u_intN_(min\|max) uadd_sat was updated using sed, so I didn't even notice the surrounding opcodes. Oops. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12297>	2021-08-10 22:16:13 +00:00
Rhys Perry	e008eb1224	nir: fix signed overflow for iadd constant folding Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12039>	2021-08-09 11:00:39 +00:00
Dave Airlie	330e28155f	nir: add 32-bit bool of fisfinite Add the bool lowering as well. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12207>	2021-08-06 12:06:21 +10:00
Ian Romanick	72259a870f	util: Add and use functions to calculate min and max int for a size Many places need to know the maximum or minimum possible value for a given size integer... so everyone just open-codes their favorite version. There is some potential to hit either undefined or implementation-defined behavior, so having one version that Just Works seems beneficial. v2: Fix copy-and-pasted bug (INT64_MAX instead of INT64_MIN) in u_intmin. Noticed by CI. Lol. Rename functions `s/u_(uint\|int)(min\|max)/u_\1N_\2/g`. Suggested by Jason. Add some unit tests that would have caught the copy-and-paste bug before wasting CI time. Change the implementation of u_intN_min to use the same pattern as stdint.h. This avoids the integer division. Noticed by Jason. v3: Add changes to convert_clear_color (src/gallium/drivers/iris/iris_clear.c). Suggested by Nanley. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12177>	2021-08-03 12:55:02 -07:00
Sagar Ghuge	e8dff256c0	nir: Add new opcode for ternary addition v2: - Make it 2src commutative (Connor Abbott) Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>	2021-07-16 15:59:55 +00:00
Thomas H.P. Andersen	ffea622604	nir/ifind_msb_rev: fix input check ifind_msb_rev was introduced in `a5747f8ab3`. ifind_msb_rev guards against src0 being both 0 or -1 at the same time. That is always true. This patch changes it to check for those values individually. Spotted from a compile warning. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Fixes: `a5747f8ab3` (\"nir: add opcodes for *find_msb_rev and lowering\") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11630>	2021-07-04 12:17:58 +00:00
Alyssa Rosenzweig	3da23a9c7e	nir: Fix constant folding for irhadd/urhadd This should be a subtract, not an add. The comment's proof is correct, but the (wrong) expression we actually use isn't what it's in the comment! Correct the discrepancy. The lowering in nir_opt_algebraic was correctly typed. Fixes: `272e927d0e` ("nir/spirv: initial handling of OpenCL.std extension opcodes") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11671>	2021-07-02 00:21:22 +00:00
Jason Ekstrand	f00b5a30f5	nir: Require vectorized ALU ops to be all-or-nothing Long ago, the semantics of bcsel were such that it took a single boolean value and selected between whole vectors. These days, it takes a vector boolean with the assumption that if you want the old behavior you can just use a .xxxx swizzle. There currently are no opcodes which use a output_size of 0 but have a scalar or fixed-vector input. Let's disallow it for now to force us to think through the semantics again if this ever comes up as something someone actually wants. Reviewed-by: Connor Abbott <cwabbott0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11438>	2021-06-21 16:46:59 +00:00
Jason Ekstrand	2e08bae9b3	nir,vc4: Suffix a bunch of unorm 4x8 opcodes _vc4 Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463>	2021-06-21 09:04:08 -05:00
Jason Ekstrand	0afbfee8da	nir,panfrost: Suffix fsat_signed and fclamp_pos with _mali Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463>	2021-06-21 09:03:34 -05:00
Jason Ekstrand	f0f713960b	nir,amd: Suffix nir_op_cube_face_coord/index with _amd Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11463>	2021-06-21 09:03:34 -05:00
Timur Kristóf	c92dab8e2b	nir: Add nir_op_sad_u8x4 which corresponds to AMD's v_sad_u8. NIR currently doesn't have any intrinsics for a horizontal packed add, so this one is modeled after AMD's v_sad_u8. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Rhys Perry	1cbcfb8b38	nir, nir/algebraic: add byte/word insertion instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Jesse Natalie	d7ca0319d7	nir: Add relaxed 24bit opcodes These are equivalent to the 32bit opcodes if there are no more efficient 24bit opcodes available, but inputs are guaranteed to already be 24bit, so the 24bit opcodes can be used instead if they exist and are efficient. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10549>	2021-05-05 22:06:42 +00:00
Alyssa Rosenzweig	a976101da5	nir/opcodes: Reword confusing comment Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10578>	2021-05-03 12:51:47 +00:00
Alyssa Rosenzweig	0ea67e57e5	nir: Add fsin_agx opcode Used to split up the fsin/fcos lowering for AGX between NIR and the backend, to permit algebraic optimizations without polluting NIR with too many hardware details. The backend NIR lowering produces an fmul/ffma of the input so we can optimize code like sin(2*x). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10582>	2021-05-02 17:41:09 -04:00
Jesse Natalie	3c8bcdc863	nir: Add a new opcode for [un]packing doubles HLSL doesn't support bitcasting a 64bit integer to a double. DXIL doesn't have generic pack/unpack instructions, so we lower those to integer bitwise ops. As a result, NIR generic double pack/unpack would require our backend to emit a bitcast to get a double, but we want to match HLSL semantics and emit MakeDouble/SplitDouble. Adding a dedicated opcode for double pack/unpack allows us to add a pass to emit that instead, which lets our backend emit the right instruction to pack and unpack doubles. Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10063>	2021-04-09 01:54:33 +00:00
Gert Wollny	318701b803	nir: Add r600 specific sin and cos variants r600 expect the input values to be normalited by divinding by 2 *PI, so add an opcode to be able to lower this in nir. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9452>	2021-03-22 15:19:46 +01:00
Gert Wollny	0f5b3c37c5	nir: Add opcodes for fused comp + csel and optimizations Some backends, like r600 support a fused version of int and float compare against zero and and csel. Adding these opcodes here makes it possible to optimize this in nir. v2: Add rules for float compare + csel Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9452>	2021-03-22 15:19:46 +01:00
Gert Wollny	a5747f8ab3	nir: add opcodes for find_msb_rev and lowering Some hardware supports a version of find_msb where the bits are counted starting at the high bit, and this needs some lowering to obtain the value that is expected by find_msb Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9452>	2021-03-22 15:19:46 +01:00
Gert Wollny	e5db9c3dd4	nir: Add r600 specific CUBE opcode to evaluate cube texture coords and face The opcode evaluates tha unnormalized coordinates, the length of the major axis, and the cube face. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9200>	2021-02-26 09:51:37 +01:00
Rhys Perry	95819663b7	nir: allow 5 component vectors These will be useful for sparse texture instructions and image load intrinsics. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7774>	2021-01-06 20:36:38 +00:00
Ian Romanick	71961c73a9	nir: Correctly constant fold fsign(NaN) and fsign(-0) GLSL and SPIR-V GLSL.std.450 don't have any requirements for fsign(NaN), and both only require that FSign(-0.0) == 0.0. OpenCL, on the other hand, requires sign(-0.0) be exactly -0.0. It also requires that sign(NaN) be exactly 0.0. In practice, this change is difficult to test. Our GLSL frontend already constant folds sign(NaN) to 0.0 before even getting to NIR. As far as I can tell, glslang does the same. I don't have a good way to run an OpenCL SPIR-V test. Maybe SPIR-V GLSL.std.450 assembly? No shader-db or fossil-db changes on any Intel platform. Acked-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Ian Romanick	363efc2823	nir: Make some notes about fsign versus NaN This commit only documents the current behavior, even if that behavior is not the behavior preferred by the relevant specs. In SPIR-V, there are two flavors of the sign instruction, and each lives in an extended instruction set. The GLSL.std.450 FSign instruction is defined as: Result is 1.0 if x > 0, 0.0 if x = 0, or -1.0 if x < 0. This also matches the GLSL 4.60 definition. However, the OpenCL.ExtendedInstructionSet.100 sign instruction is defined as: Returns 1.0 if x > 0, -0.0 if x = -0.0, +0.0 if x = +0.0, or -1.0 if x < 0. Returns 0.0 if x is a NaN. There are two differences. Each treats -0.0 differently, and each also treats NaN differently. Specifically, GLSL.std.450 FSign does not define any specific behavior for NaN. There has been some discussion in Khronos about the NaN behavior of GLSL.std.450 FSign. As part of that discussion, I did some research into how we treat NaN for nir_op_fsign, and this commit just captures some of those notes. v2: Document the expected behavior of nir_op_fsign more thoroughly. Suggested by Rhys. Note that the current implementation of constant folding does not produce the expected result for NaN. Suggested by Caio. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> [v1] Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6358>	2021-01-05 02:07:09 +00:00
Rhys Perry	24a18b1a4b	nir: scalarize fdot in reverse This will create code that is easier to combine into MADs/FMA when the last component is 1.0. nir_opt_algebraic_late has an optimization to do something similar but it only works for inexact code, if the multiplication-by-1 optimization is done before it and if the backend enables fuse_ffma. fossil-db (Navi): Totals from 85583 (74.64% of 114665) affected shaders: SGPRs: 4556060 -> 4558596 (+0.06%); split: -0.07%, +0.12% VGPRs: 3315060 -> 3312984 (-0.06%); split: -0.23%, +0.17% SpillSGPRs: 13552 -> 13553 (+0.01%) CodeSize: 184962756 -> 184431388 (-0.29%); split: -0.32%, +0.03% MaxWaves: 1208693 -> 1209361 (+0.06%); split: +0.17%, -0.11% Instrs: 35678819 -> 35361617 (-0.89%); split: -0.91%, +0.02% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5631>	2020-11-03 14:56:00 +00:00
Ian Romanick	67956689bb	nir: Rename replicated-result dot-product instructions All these instructions replicate the result of a N-component dot-product to a vec4. Naming them fdot_replicatedN gives the impression that are some sort of abstract dot-product that replicates the result to a vecN. They also deviate from fdph_replicated... which nobody would reasonably consider naming fdot_replicatedh. Naming these opcodes fdotN_replicated more closely matches what they are, and it matches the pattern of fdph_replicated. I believe that the only reason these opcodes were named this way was because it simplified the implementation of the binop_reduce function in nir_opcodes.py. I made some fairly simple changes to that function, and I think the end result is ok. The bulk of the changes come from the sed rename: sed --in-place -e 's/fdot_replicated$[234]$/fdot\1_replicated/g' \ $(grep -r 'fdot_replicated[234]' src/) v2: Use a named parameter to binop_reduce instead of using isinstance(name, str). Suggested by Jason. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5725>	2020-10-22 18:00:19 +00:00
Tony Wasserka	6a9dc75cc2	nir: Fix undefined behavior due to signed integer multiplication overflows Notably this happened when applying constant folding on the intermediate computations generated from nir_lower_idiv. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6728>	2020-10-07 19:50:01 +00:00
Marek Olšák	cdd498bbe8	nir: add new mediump opcodes f2[ui]mp, i2fmp, u2fmp Algebraic optimizations will select them. Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	385b4dbc39	nir: enforce 32-bit src type requirement for f2fmp and i2imp Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Marek Olšák	3d3df8dbff	nir: remove redundant opcode u2ump Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>	2020-09-10 23:35:13 +00:00
Daniel Schürmann	a79dad950b	nir,amd: remove trinary_minmax opcodes These consist of the variations nir_op_{i\|u\|f}{min\|max\|med}3 which are either lowered in the backend (LLVM) anyway or can be recombined by the backend (ACO). Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6421>	2020-08-24 20:56:11 +00:00

1 2 3

148 Commits