Commit Graph

31 Commits

Author SHA1 Message Date
Rob Clark 4c15c53d91 freedreno/ir3: change opt passes
There are more useful nir passes added since initial conversion to nir.
But ir3 was never updated to use them.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark ec8bc54ad2 freedreno/ir3: use peephole select pass
Agressively lowering all if/else to selects in some extreme cases
results in much higher register pressure.  Using peephole select instead
with a modest threshold speeds up alu2 4x!

16 seems like a good limit, low enough to help alu2 but not too low that
it penalizes everything else.  With a bit better scheduling of the
instruction that moves a value into a predicate register, we might be
able to lower this limit a bit more in the future, but since we need 6
cycles from the move to predicate register to predicated branch, that
puts some sort of lower bound on how far we can lower this threshold.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Rob Clark 942341bcd0 freedreno/ir3: don't lower fsat
Instead, if possible fold (sat) flag into src, otherwise use:

  (sat)max.f rD, rS, rS

Signed-off-by: Rob Clark <robdclark@gmail.com>
2018-02-10 14:54:58 -05:00
Timothy Arceri 9a2e085680 nir: add lower_all_io_to_temps flag
This will be used for freedreno and vc4 which require all inputs
and outputs to be copied to temps.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-31 09:14:08 +11:00
Rob Clark 0536737983 freedreno/a5xx: use vertex_id_zero_base
Cmdstream traces from blob make it clear that the blob driver dev's
*think* a5xx has a real (non-zero-based) vtxid.  But reality claims
differently.

Fixes ./bin/gl-3.2-basevertex-vertexid and probably others.

This means draw-indirect is going to need some gymnastics to copy
base-vertex into uniform.  (a4xx probably needs that too.)

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-12-19 15:00:18 -05:00
Ilia Mirkin 86f12e9377 freedreno/ir3: add a pass to lower tg4 to txl, enable gather on a4xx
Unfortunately Adreno A4xx hardware returns incorrect results with the
GATHER4 opcodes. As a result, we have to lower to 4 individual texture
calls (txl since we have to force lod to 0). We achieve this using
offsets, including on cube maps which normally never have offsets.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Rob Clark <robdclark@gmail.com>
2017-11-25 16:56:59 -05:00
Rob Clark 9edfc369c0 freedreno/ir3: image support
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Rob Clark 33f5f63b8f freedreno/ir3: add SSBO get_buffer_size() support
Somehow I overlooked this when adding initial SSBO support.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-11-12 12:28:59 -05:00
Jason Ekstrand 59fb59ad54 nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Rob Clark 1059dc9165 freedreno/ir3: add missing nir_opt_copy_prop_vars() pass
Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark c712a637b9 freedreno/ir3: need different compiler options for a5xx
vertex_id_zero_based differs..

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-23 12:26:35 -04:00
Rob Clark fd6ed7b562 freedreno/ir3: resync instr-a3xx.h/disasm-a3xx.c
Sync to the same files from freedreno.git to correct decoding of ldgb/
stgb instructions.

Signed-off-by: Rob Clark <robdclark@gmail.com>
2017-05-04 13:48:06 -04:00
Jason Ekstrand fb181196de nir: Rename convert_to_ssa lower_regs_to_ssa
This matches the naming of nir_lower_vars_to_ssa, the other to-SSA pass.
2016-12-29 16:02:44 -08:00
Kenneth Graunke 2d8a3fa7ea nir: Report progress from nir_lower_phis_to_scalar.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-09-14 12:01:51 -07:00
Kenneth Graunke 32630e211e nir: Report progress from nir_lower_alu_to_scalar.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2016-09-14 12:01:49 -07:00
Rob Clark 1535519e51 freedreno/ir3: do idiv lowering after main opt loop
Give algebraic-opt pass a chance to catch udiv by const power-of-two,
before running lower-idiv pass.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-06-03 16:05:03 -04:00
Rob Clark 3a1bbd6a0a freedreno/ir3: need to lower fmod too
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-20 11:13:50 -04:00
Rob Clark f8840f471d freedreno/ir3: lower fdiv
Not sure how we didn't hit this already, but since we want fdiv
converted into mul + rcp, we should set this.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-15 17:25:48 -04:00
Rob Clark 784086f3c1 freedreno/ir3: add support for NIR as preferred IR
For now under debug flag, since only suitable for debugging/testing.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-05-15 17:25:47 -04:00
Jason Ekstrand 1b72c31e1f nir/algebraic: Separate ffma lowering from fusing
The i965 driver has its own pass for fusing mul+add combinations that's
much smarter than what nir_opt_algebraic can do so we don't want to get the
nir_opt_algebraic one just because we didn't set lower_ffma.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-11 11:44:35 -07:00
Samuel Iglesias Gonsálvez d00a239b28 freedreno/ir3: lower lrp when operating with double operands
Lower lrp when operating with double operands because float version of
lrp is also lowered.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-05-10 11:25:01 +02:00
Rob Clark e04db879f8 freedreno/ir3: handle color clamp variant ourselves
Now that there is a pass to do this in NIR, lets just use that and
manage the variants ourself, rather than letting state-tracker do it.
This way, mesa/st will precompile shaders without requiring
ST_DEBUG=precompile (which requires a debug build).

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-30 14:56:19 -04:00
Samuel Iglesias Gonsálvez 443600d51e nir: rename lower_flrp to lower_flrp32
A later patch will add lower_flrp64 option to NIR.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2016-04-28 12:01:40 +02:00
Rob Clark 4610e5ef28 freedreno/ir3: fix sin/cos
We seem to need range reduction to get sane results.  Fixes glmark2
jellyfish bench, and a whole bunch of
dEQP-GLES3.functional.shaders.builtin_functions.precision.{sin,cos,tan}.*

v2: squashed in android build fixes from Rob Herring

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-25 16:16:21 -04:00
Rob Clark a148300b13 Revert "freedreno/a4xx: lower srgb in shader for astc textures"
Better workaround in the following patch.

This reverts commit 899bd63ace.
2016-04-24 13:40:57 -04:00
Rob Clark 899bd63ace freedreno/a4xx: lower srgb in shader for astc textures
This *seems* like a hw bug, and maybe only applies to certain a4xx
variants/revisions.  But setting the SRGB bit in sampler view state
(texconst0) causes invalid alpha for ASTC textures.  Work around this
by doing the srgb->linear conversion in the shader instead.

This fixes 392 dEQP tests: dEQP-GLES3.functional.texture.*astc*srgb*

(The remaining fails seem to be a bug w/ ASTC + linear filtering, also
possibly a420.0 specific.)

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-04-19 17:14:04 -04:00
Jason Ekstrand b63a98b121 nir/dead_variables: Configurably work with any variable mode
The old version of the pass only worked on globals and locals and always
left inputs, outputs, uniforms, etc. alone.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2016-04-13 15:45:10 -07:00
Jason Ekstrand a9e6213edd nir/lower_system_values: Add support for several computed values
Reviewed-by: Rob Clark <robdclark@gmail.com>
2016-04-11 13:53:03 -07:00
Rob Clark e73ac84b93 freedreno/ir3: lower extract_byte/word
The following commits broke things by starting to feed us unhandled
extract_u16/extract_u8 opcodes:

commit 905ff86198
Author:     Matt Turner <mattst88@gmail.com>
AuthorDate: Wed Feb 3 14:28:31 2016 -0800
Commit:     Matt Turner <mattst88@gmail.com>
CommitDate: Fri Mar 4 11:52:34 2016 -0800

    nir: Recognize open-coded extract_u16.

commit 76289fbfa8
Author:     Matt Turner <mattst88@gmail.com>
AuthorDate: Thu Jan 21 09:09:48 2016 -0800
Commit:     Matt Turner <mattst88@gmail.com>
CommitDate: Fri Mar 4 11:52:34 2016 -0800

    nir: Recognize open-coded extract_u8.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-03-13 14:10:57 -04:00
Rob Clark 3684e899ea freedreno/ir3: use NIR_PASS helper macros
Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-01-03 09:11:27 -05:00
Rob Clark 74135f804a freedreno/ir3: refactor NIR IR handling
Immediately convert into NIR and do an initial key-agnostic lowering/
optimization pass.  This should let us share most of the per-variant
transformations between each variant, and hopefully minimize the draw-
time variant creation part of the compilation process.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2016-01-03 09:11:27 -05:00