Commit Graph

96 Commits

Author SHA1 Message Date
Daniel Schürmann 3f9e986d33 aco: add missing Licenses and remove Authors from files
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Daniel Schürmann 59fdaa1985 aco: reorder and cleanup #includes
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11271>
2021-07-12 12:09:31 +00:00
Rhys Perry 2f94353735 aco: add p_extract/p_insert
These will let us make the SDWA optimizer much simpler than if we were to
recognize combinations of shift/and/bfe.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>
2021-06-08 08:57:42 +00:00
Michel Dänzer 2928c21eb7 Convert most remaining free-form fall-through comments to FALLTHROUGH
One exception is src/amd/addrlib/, for which -Wimplicit-fallthrough is
explicitly disabled.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10220>
2021-04-15 16:01:22 +00:00
Daniel Schürmann 13e4fed01f aco: lower p_spill with constants correctly
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9196>
2021-04-13 18:40:57 +00:00
Rhys Perry 0af7ff49fd aco: lower p_constaddr into separate instructions earlier
This allows them to be scheduled properly and simplifies the assembler a
little.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>
2021-03-11 16:31:19 +00:00
Rhys Perry e115b01948 aco: return references in instruction cast methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:33 +00:00
Rhys Perry 1d245cd18b aco: use format-check methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Rhys Perry 70dbcfa1c9 aco: use instruction cast methods
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Rhys Perry 441ead5fb3 aco: remove Format::{VOP3A,VOP3B}
These are really the same as Format::VOP3.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8595>
2021-01-22 14:12:32 +00:00
Daniel Schürmann c0cec3a29b aco: generalize subdword constant copy lowering
This will allow to propagate and emit sub-register constants
on all hardware generations.

Also fixes GFX8 constant emission to not use SDWA.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8260>
2021-01-21 11:05:36 +00:00
Daniel Schürmann 288032a873 aco: remove divergent branches which only jump over very few instructions
Totals from 18436 (13.23% of 139391) affected shaders (NAVI10):
CodeSize: 138428504 -> 138172588 (-0.18%)
Instrs: 26605127 -> 26541176 (-0.24%)
Cycles: 1624994088 -> 1622461620 (-0.16%)
VMEM: 3689892 -> 3689102 (-0.02%)
SMEM: 1131767 -> 1131761 (-0.00%)
Branches: 851796 -> 787852 (-7.51%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7814>
2021-01-13 18:04:28 +00:00
Rhys Perry 7610630124 aco: coalesce constant copies
fossil-db (Navi):
Totals from 20108 (14.49% of 138791) affected shaders:
CodeSize: 117835376 -> 117830512 (-0.00%)
Instrs: 22813722 -> 22733245 (-0.35%)
Cycles: 1009135584 -> 1008543628 (-0.06%)
VMEM: 5401668 -> 5391247 (-0.19%)
SMEM: 1286824 -> 1283663 (-0.25%)
Copies: 1742154 -> 1661686 (-4.62%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:49 +00:00
Rhys Perry f53d4e5f60 aco: use v_lshrrev_b64 for 64-bit VGPR copies on GFX10+
This isn't worth it on GFX9-, but the proprietary compiler uses it on
GFX10.

fossil-db (Navi):
Totals from 23825 (17.17% of 138791) affected shaders:
CodeSize: 130623632 -> 130623800 (+0.00%); split: -0.00%, +0.00%
Instrs: 25185559 -> 25108597 (-0.31%)
Cycles: 709864740 -> 708910860 (-0.13%)
VMEM: 7205343 -> 7168839 (-0.51%); split: +0.00%, -0.51%
SMEM: 1584946 -> 1575183 (-0.62%)
Copies: 2043134 -> 1966230 (-3.76%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:48 +00:00
Rhys Perry 8c02a8e2d2 aco: add get_const/is_constant_representable helpers
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:48 +00:00
Rhys Perry 2c40846ab6 aco: don't assume src=lower when splitting self-intersecting copies
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 09c584caeb ("aco: split self-intersecting copies instead of swapping")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7798>
2020-12-04 14:44:48 +00:00
Rhys Perry 756bb29391 aco: create vgpr constant copies using v_bfrev_b32
Looks like this worked once but broke at some point.

fossil-db (Vega):
Totals from 577 (0.42% of 137413) affected shaders:
CodeSize: 3822052 -> 3818720 (-0.09%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5390>
2020-11-20 19:50:31 +00:00
Tony Wasserka 2bb8874320 aco: Fix -Wshadow warnings
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7430>
2020-11-20 09:29:19 +00:00
Rhys Perry 867323379e aco: don't use SMEM for SSBO stores
fossil-db (Navi):
Totals from 70 (0.05% of 138791) affected shaders:
SGPRs: 2324 -> 2097 (-9.77%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 157872 -> 154836 (-1.92%); split: -1.93%, +0.01%
MaxWaves: 1288 -> 1260 (-2.17%)
Instrs: 29730 -> 29108 (-2.09%); split: -2.13%, +0.04%
Cycles: 394944 -> 391280 (-0.93%); split: -0.94%, +0.01%
VMEM: 5288 -> 5695 (+7.70%); split: +11.97%, -4.27%
SMEM: 2680 -> 2444 (-8.81%); split: +1.34%, -10.15%
VClause: 291 -> 502 (+72.51%)
SClause: 1176 -> 918 (-21.94%)
Copies: 3549 -> 3517 (-0.90%); split: -1.80%, +0.90%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1675 -> 1491 (-10.99%)
PreVGPRs: 1101 -> 1223 (+11.08%)

Totals from 70 (0.05% of 139517) affected shaders (RAVEN):
SGPRs: 2368 -> 2121 (-10.43%)
VGPRs: 1344 -> 1480 (+10.12%)
CodeSize: 156664 -> 153252 (-2.18%)
MaxWaves: 636 -> 622 (-2.20%)
Instrs: 29968 -> 29226 (-2.48%)
Cycles: 398284 -> 393492 (-1.20%)
VMEM: 5544 -> 5930 (+6.96%); split: +11.72%, -4.76%
SMEM: 2752 -> 2502 (-9.08%); split: +1.20%, -10.28%
VClause: 292 -> 504 (+72.60%)
SClause: 1236 -> 940 (-23.95%)
Copies: 3907 -> 3852 (-1.41%); split: -2.20%, +0.79%
Branches: 1230 -> 1228 (-0.16%)
PreSGPRs: 1671 -> 1487 (-11.01%)
PreVGPRs: 1102 -> 1225 (+11.16%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6143>
2020-11-16 15:52:22 +00:00
Rhys Perry 70ff262cda aco: use v_mov_b32_sdwa for some 16-bit constants
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Rhys Perry d20a752c0d aco: use Builder::copy more
fossil-db (Navi):
Totals from 6973 (5.07% of 137413) affected shaders:
SGPRs: 381768 -> 381776 (+0.00%)
VGPRs: 306092 -> 306096 (+0.00%); split: -0.00%, +0.00%
CodeSize: 24440844 -> 24421196 (-0.08%); split: -0.09%, +0.01%
MaxWaves: 86581 -> 86583 (+0.00%)
Instrs: 4682161 -> 4679578 (-0.06%); split: -0.06%, +0.00%
Cycles: 68793116 -> 68261648 (-0.77%); split: -0.83%, +0.05%

fossil-db (Polaris):
Totals from 8154 (5.87% of 138881) affected shaders:
VGPRs: 338916 -> 338920 (+0.00%); split: -0.00%, +0.00%
CodeSize: 23540428 -> 23540488 (+0.00%); split: -0.00%, +0.00%
MaxWaves: 49090 -> 49091 (+0.00%)
Instrs: 4576085 -> 4576101 (+0.00%); split: -0.00%, +0.00%
Cycles: 51720704 -> 51720888 (+0.00%); split: -0.00%, +0.00%

Most of the Navi cycle/instruction changes are from 8/16-bit parallel-rdp
shaders. They appear to be improved because the p_create_vector from
lower_subdword_phis() was blocking constant propagation.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Rhys Perry 74e2e9b682 aco: don't use bld.copy() in handle_operands()
No fossil-db changes.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7216>
2020-10-27 15:24:38 +00:00
Daniel Schürmann cb12879401 aco: fix GFX8 16-bit packing
def.physReg() was uninitialized.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Fixes: d96f387e7a ('aco: improve code sequences for 16bit packing')
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7334>
2020-10-27 12:56:14 +01:00
Daniel Schürmann cf083f1d02 aco: use do_pack() for self-intersecting operations.
This improves the code for GFX8+, but is slightly
worse for GFX6_7.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7189>
2020-10-26 12:21:13 +00:00
Daniel Schürmann d96f387e7a aco: improve code sequences for 16bit packing
This includes using alignbyte for GFX6 and GFX7,
and 32-bit instructions for GFX8.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7189>
2020-10-26 12:21:13 +00:00
Daniel Schürmann 40bfb08828 aco: refactor GFX6_7 subdword copy lowering
The new code uses alignbyte which leads
to shorter code and preserves the operand's
registers.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7189>
2020-10-26 12:21:13 +00:00
Tony Wasserka 86c227c10c aco: Use strong typing to model SW<->HW stage mappings
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7094>
2020-10-21 09:49:38 +00:00
Rhys Perry d75d12f507 aco: don't use v_pack_b32_f16 if 16-bit input denormals are flushed
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7111>
2020-10-15 11:33:42 +00:00
Rhys Perry 1a652244e4 aco: implement 16-bit literals
We can copy any value into a 16-bit subregister with a 3 dword
v_pack_b32_f16 on GFX10 or a v_and_b32+v_or_b32 on GFX9.

Because the generated code can depend on the register assignment and to
improve constant propagation, Builder::copy creates a p_create_vector in
the case of sub-dword literals.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7111>
2020-10-15 11:33:42 +00:00
Samuel Pitoiset 20d73a9049 aco: adjust an assertion about the wavesize in emit_gfx10_wave64_bpermute()
This gets rids of one more use of radv_shader_info.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>
2020-10-14 15:09:34 +00:00
Samuel Pitoiset 97afb2a0a9 aco: remove unused radv_shader.h includes
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7061>
2020-10-14 15:09:34 +00:00
Rhys Perry bf77f539ee aco: optimize more uniform reductions/scans
Uniform atomic optimization will create these.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6558>
2020-10-13 12:47:20 +00:00
Rhys Perry 156fd58cda aco: reserve 2 sgprs for each branch
We'll need two sgprs for the possibility of a long jump.

fossil-db (Navi):
Totals from 10197 (7.50% of 135946) affected shaders:
SGPRs: 946268 -> 946468 (+0.02%)
VGPRs: 705884 -> 707956 (+0.29%); split: -0.00%, +0.30%
SpillSGPRs: 31485 -> 36212 (+15.01%); split: -0.04%, +15.05%
CodeSize: 88296484 -> 88384604 (+0.10%); split: -0.01%, +0.11%
MaxWaves: 81379 -> 81171 (-0.26%)
Instrs: 17219111 -> 17231682 (+0.07%); split: -0.03%, +0.10%
Cycles: 1594875900 -> 1596450136 (+0.10%); split: -0.05%, +0.15%
VMEM: 1687263 -> 1689080 (+0.11%); split: +0.14%, -0.03%
SMEM: 657726 -> 660262 (+0.39%); split: +0.61%, -0.22%
VClause: 294806 -> 294638 (-0.06%); split: -0.08%, +0.02%
SClause: 556702 -> 556210 (-0.09%); split: -0.12%, +0.03%
Copies: 1466323 -> 1469349 (+0.21%); split: -0.57%, +0.78%
Branches: 619793 -> 618556 (-0.20%); split: -0.28%, +0.08%
PreSGPRs: 806364 -> 811477 (+0.63%); split: -0.14%, +0.77%
PreVGPRs: 655845 -> 657174 (+0.20%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Cc: 20.2 <mesa-stable>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6212>
2020-08-26 13:26:58 +00:00
Rhys Perry e6366f9094 aco: add framework for unit testing
And add some "tests" to test and document currently unused features of the
framework.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Daniel Schürmann <daniel@schuermann.dev>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3521>
2020-07-30 16:13:08 +00:00
Rhys Perry d1f992f3c2 aco: rework barriers and replace can_reorder
fossil-db (Navi):
Totals from 273 (0.21% of 132058) affected shaders:
CodeSize: 937472 -> 936556 (-0.10%)
Instrs: 158874 -> 158648 (-0.14%)
Cycles: 13563516 -> 13562612 (-0.01%)
VMEM: 85246 -> 85244 (-0.00%)
SMEM: 21407 -> 21310 (-0.45%); split: +0.05%, -0.50%
VClause: 9321 -> 9317 (-0.04%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4905>
2020-07-28 16:56:34 +00:00
Rhys Perry b36950ad2c aco: fix nir_op_f2f16_rtne with non-default rounding modes
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5773>
2020-07-17 16:40:47 +00:00
Rhys Perry 305cffa22b aco: use s_round_mode/s_denorm_mode
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5773>
2020-07-17 16:40:47 +00:00
Daniel Schürmann 5c0f82b0d7 aco: fix partial copies on GFX6/7
While we don't allow partial subdword copies,
we still need to be able to split 64bit registers

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5663>
2020-06-26 19:21:57 +00:00
Rhys Perry 4fc0499049 aco: remove outdated assert in handle_operands()
"target" is no longer expected to be completely inside "swap".

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5626>
2020-06-24 20:38:35 +00:00
Daniel Schürmann 91fd53884d aco: don't allow partial copies on GFX6/7
These are not supported due to missing SDWA instructions

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207>
2020-06-24 10:52:28 +00:00
Daniel Schürmann 76b5d72921 aco: align swap operations to 4 bytes on GFX6/7
GFX6/7 can only swap full registers

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5207>
2020-06-24 10:52:28 +00:00
Samuel Pitoiset 8c144482ea aco: replace == GFX10 with >= GFX10 where it's needed
Assume the GFX10.3 ISA is similar to GFX10 which is likely (except
possible minor changes and new instructions for raytracing).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5389>
2020-06-19 08:18:39 +02:00
Rhys Perry a02e7f6799 aco: fix encoding of certain s_setreg_imm32_b32 instructions
If the mode is too small, the operand will be an inline constant and the
literal dword won't be written.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
2020-06-15 18:24:22 +00:00
Rhys Perry 3d6f67950d aco: improve 8/16-bit constants
fossil-db (Navi, fp16 enabled):
Totals from 1 (0.00% of 127638) affected shaders:
CodeSize: 4540 -> 4388 (-3.35%)
Instrs: 861 -> 830 (-3.60%)
Cycles: 3444 -> 3320 (-3.60%)
VMEM: 489 -> 465 (-4.91%)
SMEM: 107 -> 110 (+2.80%)
SClause: 31 -> 30 (-3.23%)
Copies: 58 -> 54 (-6.90%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5245>
2020-06-15 18:24:22 +00:00
Daniel Schürmann db957f9135 aco: optimize packing of 16bit subdword registers on GFX6/7
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00
Daniel Schürmann 2a51840c52 aco: skip partial copies on first iteration when lowering to hw
Helps some Detroit : Become Human shaders.

Totals from affected shaders: (VEGA)
Code Size: 47693912 -> 47670212 (-0.05 %) bytes
Instructions: 9183788 -> 9177863 (-0.06 %)
Copies: 910052 -> 904127 (-0.65 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00
Daniel Schürmann 1d6f667193 aco: coalesce copies more aggressively when lowering to hw
Helps some Detroit : Become Human shaders.

Totals from affected shaders: (VEGA)
Code Size: 9880420 -> 9879088 (-0.01 %) bytes
Instructions: 1918553 -> 1918220 (-0.02 %)
Copies: 177783 -> 177450 (-0.19 %)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00
Daniel Schürmann b21d2d9a9f aco: add and use scratch SGPR to lower subdword p_create_vector on GFX6/7
This is needed to lower some corner cases correctly,
in case the same operand occurs multiple times:
e.g. v0 = p_create_vector(v0[0:8], v0[0:8], v0[0:8], v0[0:8])

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00
Daniel Schürmann 9e8e12ea6d aco: adjust GFX6 subdword lowering workarounds for 8bit
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00
Daniel Schürmann b083581010 aco: Workarounds subdword lowering on GFX6/7
As there are no SDWA instructions, we need to take care not to overwrite
the upper bits of other copy_operation's operands.

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
2020-06-09 21:25:38 +00:00