Commit Graph

206 Commits

Author SHA1 Message Date
Caio Oliveira 858424bd2e intel/compiler: Use gl_shader_stage_uses_workgroup() helpers
Instead of checking for MESA_SHADER_COMPUTE (and KERNEL).  Where
appropriate, also use gl_shader_stage_is_compute().

This allows most of the workgroup-related lowering to be applied to
Task and Mesh shaders.  These will be added later and "inherit" from
cs_prog_data structure.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13629>
2021-11-03 11:09:48 -07:00
Jason Ekstrand 7b21def9c2 intel/fs: Add support for atomic_fadd
Rework:
- Enable float32 atomic add with LSC (Sagar)
- disassemble new opcode (Caio)

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12566>
2021-09-09 23:34:33 +00:00
Ian Romanick 5ce3bfcdf3 intel/compiler: Lower 8-bit ops to 16-bit in NIR on all platforms
This fixes the Crucible func.shader.shift.int8_t test on Gen8 and Gen9.
See https://gitlab.freedesktop.org/mesa/crucible/-/merge_requests/76.

With the previous optimizations in place, this change seems to improve
the quality of the generated code.  Comparing a couple Vulkan CTS tests
on Skylake had the following results.

dEQP-VK.spirv_assembly.type.vec3.i8.bitwise_xor_frag:
SIMD8 shader: 36 instructions. 1 loops. 3822 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 27 instructions. 1 loops. 2742 cycles. 0:0 spills:fills, 5 sends

dEQP-VK.spirv_assembly.type.vec3.i8.max_frag:
SIMD8 shader: 39 instructions. 1 loops. 3922 cycles. 0:0 spills:fills, 5 sends
SIMD8 shader: 37 instructions. 1 loops. 3682 cycles. 0:0 spills:fills, 5 sends

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Ian Romanick f9665040f1 intel/compiler: Document and assert some aspects of 8-bit integer lowering
In the vec4 compiler, 8-bit types should never exist.

In the scalar compiler, 8-bit types should only ever be able to exist on
Gfx ver 8 and 9.

Some instructions are handled in non-obvious ways.

Hopefully this will save the next person some time.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9025>
2021-08-18 22:03:37 +00:00
Timothy Arceri a9ed4538ab nir: add indirect loop unrolling to compiler options
This is where it should be rather than having to pass it into the
optimisation pass every time.

It also allows us to call the loop analysis pass without having to
duplicate these options which we will do later in this series.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12064>
2021-08-03 10:54:50 +00:00
Jason Ekstrand 74ec2b12be nir/lower_tex: Rework invalid implicit LOD lowering
Only fragment and some compute shaders support implicit derivatives.
They're totally meaningless without helper invocations and some
understanding of the dispatch pattern.  We've got code to lower
nir_texop_tex in these shader stages to use an explicit derivative of 0
but it was pretty badly broken:

 1. It only handled nir_texop_tex, not nir_texop_txb or nir_texop_lod.

 2. It didn't take min_lod into account

 3. It was conflated with adding a missing LOD parameter to opcodes
    which expect one such as nir_texop_txf.  While not really a bug,
    this does make it way harder to reason about the code.

 4. Unless you set a flag (which most drivers don't), it left the
    opcode nir_texop_tex instead of nir_texop_txl which it should have
    been.

This reworks it to go through roughly the same path as other LOD
lowering only with a constant lod of 0 instead of calling out to
nir_texop_lod.  We also get rid of the lower_tex_without_implicit_lod
flag because most drivers set it and those that don't are probably
subtly broken.  If someone really wants to get nir_texop_tex in their
vertex shaders, they can write a new patch to add the flag back in.

Fixes: e382890e25 "nir: set default lod to texture opcodes that..."
Fixes: d5ac5d6e83 "nir: Add option to lower tex to txl when..."
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11775>
2021-07-23 15:53:57 +00:00
Timothy Arceri 9f3fde6f36 intel/compiler: Use GCM in nir_optimize
There is still some work to do before we can enable GVN.

In these shader-db results, Skylake and older platforms used i965 while
newer platforms used Iris.  I believe this accounts for the difference
in "sends."  The shaders helped for sends are all Synmark shaders.

On Sandybridge, the shaders helped for sends were the same ones hurt for
spills and fills.  These are also all Synmark shaders.

Tiger Lake and Ice Lake had similar results. (Ice Lake shown)
total instructions in shared programs: 19868594 -> 19868452 (<.01%)
instructions in affected programs: 6607 -> 6465 (-2.15%)
helped: 12
HURT: 1
helped stats (abs) min: 12 max: 12 x̄: 12.00 x̃: 12
helped stats (rel) min: 1.94% max: 2.62% x̄: 2.38% x̃: 2.58%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 0.45% max: 0.45% x̄: 0.45% x̃: 0.45%
95% mean confidence interval for instructions value: -13.27 -8.58
95% mean confidence interval for instructions %-change: -2.67% -1.65%
Instructions are helped.

total cycles in shared programs: 962404540 -> 962008224 (-0.04%)
cycles in affected programs: 961274 -> 564958 (-41.23%)
helped: 23
HURT: 1
helped stats (abs) min: 10 max: 32536 x̄: 17438.96 x̃: 23658
helped stats (rel) min: 0.02% max: 80.04% x̄: 42.05% x̃: 51.58%
HURT stats (abs)   min: 4780 max: 4780 x̄: 4780.00 x̃: 4780
HURT stats (rel)   min: 3.26% max: 3.26% x̄: 3.26% x̃: 3.26%
95% mean confidence interval for cycles value: -22989.90 -10036.43
95% mean confidence interval for cycles %-change: -55.01% -25.32%
Cycles are helped.

Skylake and Broadwell had simliar results. (Skylake shown)
total instructions in shared programs: 17996652 -> 17996154 (<.01%)
instructions in affected programs: 96622 -> 96124 (-0.52%)
helped: 85
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 5.86 x̃: 5
helped stats (rel) min: 0.39% max: 2.65% x̄: 0.68% x̃: 0.39%
95% mean confidence interval for instructions value: -6.42 -5.30
95% mean confidence interval for instructions %-change: -0.84% -0.52%
Instructions are helped.

total cycles in shared programs: 939899189 -> 939289732 (-0.06%)
cycles in affected programs: 3719430 -> 3109973 (-16.39%)
helped: 60
HURT: 39
helped stats (abs) min: 18 max: 32444 x̄: 10437.30 x̃: 6940
helped stats (rel) min: 0.08% max: 80.40% x̄: 23.99% x̃: 12.07%
HURT stats (abs)   min: 10 max: 4970 x̄: 430.28 x̃: 323
HURT stats (rel)   min: 0.05% max: 3.41% x̄: 1.55% x̃: 1.60%
95% mean confidence interval for cycles value: -8095.51 -4216.75
95% mean confidence interval for cycles %-change: -18.65% -9.21%
Cycles are helped.

total sends in shared programs: 1026997 -> 1026927 (<.01%)
sends in affected programs: 6090 -> 6020 (-1.15%)
helped: 70
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.15% max: 1.15% x̄: 1.15% x̃: 1.15%
95% mean confidence interval for sends value: -1.00 -1.00
95% mean confidence interval for sends %-change: -1.15% -1.15%
Sends are helped.

Haswell and Ivy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 16040891 -> 16040252 (<.01%)
instructions in affected programs: 109132 -> 108493 (-0.59%)
helped: 87
HURT: 0
helped stats (abs) min: 1 max: 12 x̄: 7.34 x̃: 7
helped stats (rel) min: 0.05% max: 2.61% x̄: 0.75% x̃: 0.51%
95% mean confidence interval for instructions value: -7.84 -6.85
95% mean confidence interval for instructions %-change: -0.90% -0.61%
Instructions are helped.

total cycles in shared programs: 968579567 -> 967867117 (-0.07%)
cycles in affected programs: 30688439 -> 29975989 (-2.32%)
helped: 241
HURT: 62
helped stats (abs) min: 4 max: 31929 x̄: 3901.22 x̃: 2282
helped stats (rel) min: 0.04% max: 79.63% x̄: 12.70% x̃: 4.44%
HURT stats (abs)   min: 4 max: 8230 x̄: 3673.27 x̃: 637
HURT stats (rel)   min: 0.01% max: 63.87% x̄: 24.54% x̃: 3.56%
95% mean confidence interval for cycles value: -3100.23 -1602.41
95% mean confidence interval for cycles %-change: -7.94% -2.22%
Cycles are helped.

total sends in shared programs: 935025 -> 934955 (<.01%)
sends in affected programs: 6090 -> 6020 (-1.15%)
helped: 70
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.15% max: 1.15% x̄: 1.15% x̃: 1.15%
95% mean confidence interval for sends value: -1.00 -1.00
95% mean confidence interval for sends %-change: -1.15% -1.15%
Sends are helped.

LOST:   1
GAINED: 0

Sandy Bridge
total instructions in shared programs: 11785330 -> 11786504 (<.01%)
instructions in affected programs: 53462 -> 54636 (2.20%)
helped: 16
HURT: 36
helped stats (abs) min: 1 max: 17 x̄: 10.06 x̃: 9
helped stats (rel) min: 0.47% max: 3.29% x̄: 2.03% x̃: 1.90%
HURT stats (abs)   min: 5 max: 38 x̄: 37.08 x̃: 38
HURT stats (rel)   min: 1.77% max: 2.98% x̄: 2.94% x̃: 2.98%
95% mean confidence interval for instructions value: 16.26 28.90
95% mean confidence interval for instructions %-change: 0.75% 2.08%
Instructions are HURT.

total cycles in shared programs: 498009911 -> 497378300 (-0.13%)
cycles in affected programs: 6848277 -> 6216666 (-9.22%)
helped: 108
HURT: 28
helped stats (abs) min: 4 max: 25394 x̄: 6037.42 x̃: 766
helped stats (rel) min: 0.02% max: 60.58% x̄: 11.60% x̃: 4.83%
HURT stats (abs)   min: 96 max: 6834 x̄: 729.64 x̃: 742
HURT stats (rel)   min: 0.17% max: 16.23% x̄: 1.57% x̃: 1.55%
95% mean confidence interval for cycles value: -5907.99 -3380.40
95% mean confidence interval for cycles %-change: -11.67% -6.11%
Cycles are helped.

total spills in shared programs: 2316 -> 2526 (9.07%)
spills in affected programs: 280 -> 490 (75.00%)
helped: 0
HURT: 35

total fills in shared programs: 1540 -> 1750 (13.64%)
fills in affected programs: 280 -> 490 (75.00%)
helped: 0
HURT: 35

total sends in shared programs: 642985 -> 642950 (<.01%)
sends in affected programs: 3045 -> 3010 (-1.15%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.15% max: 1.15% x̄: 1.15% x̃: 1.15%
95% mean confidence interval for sends value: -1.00 -1.00
95% mean confidence interval for sends %-change: -1.15% -1.15%
Sends are helped.

LOST:   1
GAINED: 0

Iron Lake and GM45 had similar results. (Iron Lake shown)
total cycles in shared programs: 239442382 -> 239429400 (<.01%)
cycles in affected programs: 20816 -> 7834 (-62.37%)
helped: 2
HURT: 0

In Fossil-db, all of the shaders hurt for spill and fills are compute
shaders from Shadow of the Tomb Raider.  Two shaders were helped for
sends, and these are also from SotTR.

All of the shaders helped for loops were from Geekbench5.  These all
went from 3 loops to 2.

Tiger Lake
Instructions in all programs: 160852396 -> 160855303 (+0.0%)
SENDs in all programs: 6878559 -> 6878559 (+0.0%)
Loops in all programs: 38350 -> 38305 (-0.1%)
Cycles in all programs: 7369162339 -> 7344236445 (-0.3%)
Spills in all programs: 193762 -> 193876 (+0.1%)
Fills in all programs: 306417 -> 306600 (+0.1%)

Ice Lake
Instructions in all programs: 144592523 -> 144593946 (+0.0%)
SENDs in all programs: 6930697 -> 6930697 (+0.0%)
Loops in all programs: 38344 -> 38299 (-0.1%)
Cycles in all programs: 8732456458 -> 8707823383 (-0.3%)
Spills in all programs: 216692 -> 216806 (+0.1%)
Fills in all programs: 334089 -> 334272 (+0.1%)

Skylake
Instructions in all programs: 135618746 -> 135619971 (+0.0%)
SENDs in all programs: 6896728 -> 6896724 (-0.0%)
Loops in all programs: 38343 -> 38298 (-0.1%)
Cycles in all programs: 8391957144 -> 8368935657 (-0.3%)
Spills in all programs: 194741 -> 194879 (+0.1%)
Fills in all programs: 301048 -> 301255 (+0.1%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/597>
2021-07-21 14:24:00 +00:00
Timothy Arceri c742a99fb6 intel/compiler: call nir_opt_dead_cf() after we have finished all opts
This will avoid a regression with the following patch.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/597>
2021-07-21 14:24:00 +00:00
Connor Abbott e4e79de2a4 nir/subgroups: Support > 1 ballot components
Qualcomm has a mode with a subgroup size of 128, so just emitting larger
integer operations and then lowering them later isn't an option. This
makes the pass able to handle the lowering itself, so that we don't have
to go down to 64-thread wavefronts when ballots are used.

(The GLSL and legacy SPIR-V extensions only support a maximum of 64
threads, but I guess we'll cross that bridge when we come to it...)

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>
2021-07-08 16:02:41 +00:00
Yevhenii Kolesnikov 974c58b317 intel: fix leaking memory on shader creation
ralloc_adopt takes care of all the shader's children, but shader itsel ends up
orphaned and never gets free'd.

Fixes: ef5bce9253 ("intel: Drop the last uses of a mem_ctx in nir_builder_init_simple_shader().")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4951

Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11651>
2021-06-30 19:34:56 +03:00
Dave Airlie 8da92b5c0a intel/compiler: add flag to indicate edge flags vertex input is last
965 and the mesa st disagree on how vertex elements are ordered when
edgeflags are involved. 965 wants them in gl_vert_attrib order,
but gallium supplies the edgeflag as the last vertex element regardless.

This adds a flag which is enabled for gen4/5 to denote that the
edgeflag is at the end. When we reap 965 later we can resolve this
better.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11146>
2021-06-14 06:05:18 +10:00
Connor Abbott a40714abf7 nir/lower_phis_to_scalar: Add "lower_all" option
We don't want to have to deal with vector phis in freedreno, because
vectors are always split/unsplit around vectorized instructions anyways,
and the stated reason for not scalarising them (it hurting coalescing)
won't apply to us because we won't be using nir_from_ssa. Add this
option so that we don't have to do the equivalent thing while
translating from NIR.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10809>
2021-05-17 09:59:45 +00:00
Marcin Ślusarz 3340d5ee02 intel: simplify is_haswell checks, part 1
Generated with:

files=`git grep is_haswell | cut -d: -f1 | sort | uniq`
for file in $files; do
        cat $file | \
                sed "s/devinfo->ver <= 7 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" | \
                sed "s/devinfo->ver >= 8 || devinfo->is_haswell/devinfo->verx10 >= 75/g" | \
                sed "s/devinfo->is_haswell || devinfo->ver >= 8/devinfo->verx10 >= 75/g" | \
                sed "s/devinfo.is_haswell || devinfo.ver >= 8/devinfo.verx10 >= 75/g" | \
                sed "s/devinfo->ver > 7 || devinfo->is_haswell/devinfo->verx10 >= 75/g" | \
                sed "s/devinfo->ver == 7 && !devinfo->is_haswell/devinfo->verx10 == 70/g" | \
                sed "s/devinfo.ver == 7 && !devinfo.is_haswell/devinfo.verx10 == 70/g" | \
                sed "s/devinfo->ver < 8 && !devinfo->is_haswell/devinfo->verx10 <= 70/g" | \
                sed "s/device->info.ver == 7 && !device->info.is_haswell/device->info.verx10 == 70/g" \
                > tmpXXX
        mv tmpXXX $file
done

Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10810>
2021-05-17 09:46:45 +00:00
Jason Ekstrand 05a37e2422 intel/nir: Set lower txs with non-zero LOD
There's a recently discovered HW bug affecting hardware at least as far
back as Skylake where, if the LOD is out-of-bounds for any SIMD lane,
then garbage may be returned in all SIMD lanes.  The easy solution is to
set lower_txs_lod so that we always have a constant LOD of 0 which we
know a priori is always in-bounds.  Fortunately, not many shaders
actually use textureSize() with LOD.

Shader-db results on Ice Lake:

    total instructions in shared programs: 19948537 -> 19948564 (<.01%)
    instructions in affected programs: 3859 -> 3886 (0.70%)
    helped: 0
    HURT: 7

One of the shaders is in Civilization: Beyond Earth, and the rest are
all in Civilization VI.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10538>
2021-05-04 00:02:43 +00:00
Anuj Phogat 4c535cbf99 intel: Fix alignment and line wrapping due to gen_device renaming
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Anuj Phogat 61e8636557 intel: Rename gen_device prefix to intel_device
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Anuj Phogat 926d343acf intel: Rename files with gen_debug prefix
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
find $SEARCH_PATH -type f -name "*gen_debug.*[cph]" -exec sh -c 'f="{}"; mv -- "$f" "${f/gen_debug/intel_debug}"' \;
grep -E "gen_debug" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_debug\./intel_debug\./g"
grep -E "GEN_DEBUG" -rIl $SEARCH_PATH | xargs sed -ie "s/GEN_DEBUG_H/INTEL_DEBUG_H/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>
2021-04-20 20:06:33 +00:00
Jordan Justen 635ed58e52 intel/compiler: Lower txd for 3D samplers on XeHP.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Francisco Jerez 262b647b25 intel/compiler: Lower integer division on XeHP.
It has been removed from the hardware.

[jordan.l.justen@intel.com: Move to brw_postprocess_nir]

v2: Switch to nir_lower_idiv_precise (Rhys).
v3: Fix for interface changes of nir_lower_idiv.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10000>
2021-04-16 08:27:35 +00:00
Iván Briano 8328989130 intel, anv: propagate robustness setting to nir_opt_load_store_vectorize
Closes #4309
Fixes dEQP-VK-robustness.robustness2.*.readonly.*

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10147>
2021-04-13 13:30:09 -07:00
Lionel Landwerlin 33b2daab1a intel/compiler: lower bit sizes in NIR postprocessing
It appears that between preprocess & postprocess some descriptor
lowering introduces 8bit types in the shader, so run the lower bit
size again to make sure we don't have any unsupported types in our
shader.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: e06144a818 ("anv: Use 64bit_global_32bit_offset for SSBOs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4478
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9705>
2021-04-06 23:21:30 +03:00
Anuj Phogat 1d296484b4 intel: Rename Genx keyword to Gfxx
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "Gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/Gen\([[:digit:]]\+\)/Gfx\1/g"

Exclude changes in src/intel/perf/oa-*.xml:
find src/intel/perf -type f \( -name "*.xml" \) | xargs sed -ie "s/Gfx/Gen/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Anuj Phogat b75f095bc7 intel: Rename genx keyword to gfxx in source files
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "gen[[:digit:]]+" -rIl $SEARCH_PATH | xargs sed -ie "s/gen\([[:digit:]]\+\)/gfx\1/g"

Exclude pack.h and xml changes in this patch:
grep -E "gfx[[:digit:]]+_pack\.h" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+_pack\.h\)/gen\1/g"
grep -E "gfx[[:digit:]]+\.xml" -rIl $SEARCH_PATH | xargs sed -ie "s/gfx\([[:digit:]]\+\.xml\)/gen\1/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Anuj Phogat abe9a71a09 intel: Rename gen field in gen_device_info struct to ver
Commands used to do the changes:
export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965"
grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g"

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>
2021-04-02 18:33:07 +00:00
Caio Marcelo de Oliveira Filho 244d2daa00 intel/compiler: Make brw_postprocess_nir take debug_enabled as a parameter
The callers already have this value, and we would like to make it
follow different rules other than stage that might not be visible to
the helper function, so just pass explicitly.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>
2021-03-24 23:18:46 +00:00
Jason Ekstrand 117668b811 nir: Make nir_ssa_def_rewrite_uses take an SSA value
This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses()
with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites
all the users as needed.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
2021-03-08 16:59:55 +00:00
Rhys Perry cbb5ed476c nir/opt_shrink_vectors: add option to skip shrinking image stores
Some games declare the wrong format, so we might want to disable this
optimization in that case.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: e4d75c22 ("nir/opt_shrink_vectors: shrink image stores using the format")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9229>
2021-03-03 14:18:37 +00:00
Jason Ekstrand d670afa27a intel/nir: Lower 8-bit phis on Gen11+
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8872>
2021-02-16 16:36:31 +00:00
Caio Marcelo de Oliveira Filho 9f3d5e99ea compiler: Use util/bitset.h for system_values_read
It is currently a bitset on top of a uint64_t but there are already
more than 64 values.  Change to use BITSET to cover all the
SYSTEM_VALUE_MAX bits.

Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Acked-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8585>
2021-01-26 20:20:47 +00:00
Connor Abbott 68969cbcb7 brw/vec4: Don't convert tex dest type to glsl_type
We were using nir_tex_instr::dest_type to a glsl_type, then passing it
to emit_texture(), only to just check the number of components. Just
pass the number of components directly. This lets us delete
brw_glsl_base_type_for_nir_type, which was asserting with
nir_texop_all_samples_equal because it didn't handle bool32.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7989>
2021-01-25 11:21:42 +01:00
Erico Nunes faaba0d6af nir/lower_vec_to_movs: don't vectorize unsupports ops
If the instruction being coalesced would be vectorized but the target
doesn't support vectorizing that op, skip coalescing.
Reuse the callbacks from alu_to_scalar to describe which ops should not
be vectorized.

Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6506>
2021-01-11 13:13:30 +00:00
Rhys Perry f199b7188b nir/load_store_vectorize: add data as callback args
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry 00c8bec47b nir: add nir_load_store_vectorize_options
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Christian Gmeiner c5a9270109 intel/compiler: use intrinsic builders
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8295>
2021-01-06 14:34:41 +00:00
Kenneth Graunke 97ebb896af intel/compiler: Do interpolateAtOffset coordinate scaling in NIR
In our source languages, interpolateAtOffset() takes a floating point
offset in the range [-0.5, +0.5].  However, the hardware takes integer
valued offsets in the range [-8, 7], in units of 1/16th of a pixel.

So, we need to multiply and clamp the coordinates.  We were doing this
in the FS backend, but with the advent of IBC, I'd like to avoid doing
it twice.  This patch instead moves the lowering to NIR so we can reuse
it across both backends.

v2: Use nir_shader_instructions_pass (suggested by Eric Anholt).

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6193>
2020-11-18 23:26:53 +00:00
Kenneth Graunke 2009258796 intel/compiler: Fix passthrough TCS regressions from program rename
In commit eda3e4e055, Eric added names
to various programs.  In that patch, he also renamed our passthrough
TCS shader from "passthrough" to "passthrough TCS".  The passthrough
TCS directly supplies the VUE headers rather than doing the whole
"patch parameters are in backwards order" reswizzling dance.

We failed to detect this and started trying to supply vec4s starting
at component 3, leading to a stack smash on an array of 7 sources,
not to mention the values were being put in the wrong place.

Easy fix: update the code for the new name.

Fixes: eda3e4e055 ("nir/builder: Add a name format arg to nir_builder_init_simple_shader().")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3777
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7564>
2020-11-11 21:19:40 -08:00
Eric Anholt eda3e4e055 nir/builder: Add a name format arg to nir_builder_init_simple_shader().
This cleans up a bunch of gross sprintfs and keeps the caller from needing
to remember to ralloc_strdup.  I added a couple of '"%s", name ? name :
""' to radv where I didn't fully trace through whether a non-null name was
being passed in.

I also took the liberty of adding a basic name to a few shaders (pan_blit,
unit tests)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>
2020-11-11 08:50:29 -08:00
Eric Anholt 5f992802f5 nir/builder: Drop the mem_ctx arg from nir_builder_init_simple_shader().
This looks a lot more simple now!

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>
2020-11-11 08:50:29 -08:00
Eric Anholt ef5bce9253 intel: Drop the last uses of a mem_ctx in nir_builder_init_simple_shader().
These two consumers were the only ones out of the ~65 calls to
init_simple_shader, so there's a pretty clear consensus on how to allocate
simple shaders.  I suspect that actually these would be just fine with
b.shader being the mem_ctx, but that would take a bit more rework.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>
2020-11-11 08:50:27 -08:00
Eric Anholt 4e9328e3b6 nir_builder: Return a new builder from nir_builder_init_simple_shader().
It's a little inline function, so we can just RAII it for better
ergonomics.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7323>
2020-11-11 08:49:49 -08:00
Jason Ekstrand 68092df8d8 intel/nir: Lower 8-bit ops to 16-bit in NIR on Gen11+
Intel hardware supports 8-bit arithmetic but it's tricky and annoying:

  - Byte operations don't actually execute with a byte type.  The
    execution type for byte operations is actually word.  (I don't know
    if this has implications for the HW implementation.  Probably?)

  - Destinations are required to be strided out to at least the
    execution type size.  This means that B-type operations always have
    a stride of at least 2.  This means wreaks havoc on the back-end in
    multiple ways.

  - Thanks to the strided destination, we don't actually save register
    space by storing things in bytes.  We could, in theory, interleave
    two byte values into a single 2B-strided register but that's both a
    pain for RA and would lead to piles of false dependencies pre-Gen12
    and on Gen12+, we'd need some significant improvements to the SWSB
    pass.

  - Also thanks to the strided destination, all byte writes are treated
    as partial writes by the back-end and we don't know how to copy-prop
    them.

  - On Gen11, they added a new hardware restriction that byte types
    aren't allowed in the 2nd and 3rd sources of instructions.  This
    means that we have to emit B->W conversions all over to resolve
    things.  If we emit said conversions in NIR, instead, there's a
    chance NIR can get rid of some of them for us.

We can get rid of a lot of this pain by just asking NIR to get rid of
8-bit arithmetic for us.  It may lead to a few more conversions in some
cases but having back-end copy-prop actually work is probably a bigger
bonus.  There is still a bit we have to handle in the back-end.  In
particular, basic MOVs and conversions because 8-bit load/store ops
still require 8-bit types.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
2020-11-09 18:58:51 +00:00
Jason Ekstrand b98f0d3d7c intel/nir: Lower 8-bit scan/reduce ops to 16-bit
We can't really support these directly on any platform.  May as well let
NIR lower them.  The NIR lowering is potentially one more instruction
for scan/reduce ops thanks to not being able to do the B->W conversion
as part of SEL_EXEC.  For imax/imin exclusive scan, it's yet another
instruction thanks to the extra imax/imin NIR has to insert to deal with
the fact that the first live channel will contain the identity value
which, when signed, will cast wrong.  However, it does let us drop some
complexity from our back-end so it's probably worth it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
2020-11-09 18:58:51 +00:00
Jason Ekstrand 3ad2d85995 intel/nir: Refactor lower_bit_size_callback
We want to use it for more than just ALU.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
2020-11-09 18:58:51 +00:00
Jason Ekstrand 2c4b47184d nir/lower_bit_size: Pass a nir_instr to the callback
This way we can start supporting more than just ALU ops.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7482>
2020-11-09 18:58:51 +00:00
Rhys Perry 8850a63161 radv/aco,nir/lower_subgroups: don't lower elect
ACO can implement this better.

fossil-db (Navi):
Totals from 33 (0.02% of 135946) affected shaders:
SGPRs: 1736 -> 1744 (+0.46%)
VGPRs: 1680 -> 1656 (-1.43%)
CodeSize: 246160 -> 245916 (-0.10%); split: -0.14%, +0.04%
MaxWaves: 449 -> 461 (+2.67%)
Instrs: 48301 -> 48266 (-0.07%); split: -0.12%, +0.05%
Cycles: 469740 -> 469240 (-0.11%); split: -0.18%, +0.08%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>
2020-10-13 12:47:20 +00:00
Timur Kristóf 2be99012e9 nir: Add ability to count emitted GS primitives.
Add an option to nir_lower_gs_intrinsics which tells it to track
the number of emitted primitives, not just vertices. Additionally,
also make it per-stream.

Also rename the set_vertex_count intrinsic to
set_vertex_and_primitive_count.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>
2020-10-09 15:26:14 +02:00
Eric Anholt 618556a8cb nir: Drop the high_offset argument to the load_store_vectorizer filter.
Nothing uses it, and it's not clear to me what it provides over
alignment/num_components/bit_size.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>
2020-09-30 19:53:43 +00:00
Eric Anholt 5f757bb95c nir: Make the load_store_vectorizer provide align_mul + align_offset.
It was passing an encoding of the two that wasn't good for ensuring "Don't
combine loads that would make us straddle a vec4 boundary" for
nir_lower_ubo_vec4.

Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6612>
2020-09-30 19:53:43 +00:00
Jason Ekstrand 3bd7c3c9db intel/nir: Call validate_ssa_dominance at both ends of the NIR compile
This invokes it before we go into the optimization/lowering pass and
then right before we go out of SSA.

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5288>
2020-09-08 19:44:01 +00:00
Marek Olšák ac55b1a9a6 nir: get ffma support from NIR options for nir_lower_flrp
This also fixes the inverted last parameter of nir_lower_flrp in most drivers.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6599>
2020-09-04 17:06:22 +00:00