I can't imagine any reasonable optimization which could break this, but
since it's lowered from an integer instructions, we shouldn't do anything
which could change the result.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10081>
Roundtrip to a larger float and divide there. The extra details for
mod/rem are handled directly in integer space to simplify verification
of rounding details. The one issue is that the mantissa might be
rounded down which will cause issues; adding 1 unconditionally (proposed
by Jonathan Marek) fixes this. The lowerings here were tested
exhaustively on all pairs of 16-bit integers.
v2: Update idiv lowering per Rhys Perry's comment.
v3: Rewrite lowerings.
v4: Remove useless ftrunc, fix 8-bit issue, simplify code.
v5: Remove useless ffloor
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Tested-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Tested-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8339>
Helps deduplicate some code between the two lowering paths. In
particular, it ports the missing 32-bit? check to the precise pass. This
does not change anything immediately: drivers depending on this to lower
16-bit did not work before due to type mismatches and will not work now
since it'll refuse to lower. But that means sub-32-bit idiv can be
lowered more efficiently in an algebraic pass.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8339>
This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses()
with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites
all the users as needed.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
This change fixes off-by-one results in corner cases such as
0xffffffff / 0x11111111. For details refer to LLVM bug 46212.
Fixes: 8b98d0954e ('nir/lower_idiv: add new llvm-based path')
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6281>
v2: make variable names snake_case
v2: minor cleanups in emit_udiv()
v2: fix Panfrost build failure
v3: use an enum instead of a boolean flag in nir_lower_idiv()'s signature
v4: remove nir_op_urcp
v5: drop nv50 path
v5: rebase
v6: add back nv50 path
v6: add comment for nir_lower_idiv_path enum
v7: rename _nv50/_llvm to _fast/_precise
v8: fix etnaviv build failure
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
We don't have calculate final quotient in order to calculate unsigned
modulo result. Once we are done with error correction we have partial
result which can be used to find out modulo operation result
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tested on Gen > 9.
v2: 1) Fix lowering
2) Keep a consistent i/u order (Matt Turner)
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The previous code was creating a boolean by doing an arithmetic right-
shift by 31 which produces a boolean which is true if the argument is
negative. This is the same as the expression r < 0 which is much
simpler and doesn't depend on NIR's representation of booleans.
Reviewed-by: Eric Anholt <eric@anholt.net>
Instead of a single i2b and b2i, we now have i2b32 and b2iN where N is
one if 8, 16, 32, or 64. This leads to having a few more opcodes but
now everything is consistent and booleans aren't a weird special case
anymore.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
The NIR story on conversion opcodes is a mess. We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random. This commit re-organizes things and makes them all
consistent:
- All non-bool conversion opcodes now have the explicit size in the
destination and are named <src_type>2<dst_type><size>.
- Integer <-> integer conversion opcodes now only come in i2i and u2u
forms (i2u and u2i have been removed) since the only difference
between the different integer conversions is whether or not they
sign-extend when up-converting.
- Boolean conversion opcodes all have the explicit size on the bool and
are named <src_type>2<dst_type>.
Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako. This will make adding
int8, int16, and float16 versions much easier when the time comes.
Reviewed-by: Eric Anholt <eric@anholt.net>
With algebraic-opt support for lowering div to shift, the driver would
like to be able to run this pass *after* the main opt-loop, and then
conditionally re-run the opt-loop if this pass actually lowered some-
thing.
Signed-off-by: Rob Clark <robclark@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_function(\([^,]*\),\s*\([^,]*\))/nir_foreach_function(\2, \1)/
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This matches the "foreach x in container" pattern found in many other
programming languages. Generated by the following regular expression:
s/nir_foreach_instr(\([^,]*\),\s*\([^,]*\))/nir_foreach_instr(\2, \1)/
and similar expressions for nir_foreach_instr_safe etc.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>