nv50/ir: move LateAlgebraicOpt to the very end

Memory loads can take offsets, but the SHLADD will often attempt to consume the offsets too. As there may be multiple memory loads with the same base but different offsets, those would end up in a SHLADD instead of the offset of the memory operation. This moves the pass after we've had a chance to attempt to propagate immediate adds into the indirect offset. total instructions in shared programs : 6580681 -> 6567716 (-0.20%) total gprs used in shared programs : 944261 -> 943375 (-0.09%) total shared used in shared programs : 0 -> 0 (0.00%) total local used in shared programs : 15328 -> 15328 (0.00%) total bytes used in shared programs : 60339896 -> 60221504 (-0.20%) local shared gpr inst bytes helped 0 0 555 2698 2698 hurt 0 0 138 336 336 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2017-11-16 01:48:20 -05:00 · 2017-11-16 01:48:20 -05:00 · 0bd83d0461
parent 3072bbef63
commit 0bd83d0461
1 changed files with 1 additions and 1 deletions
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_peephole.cpp
@ -3794,11 +3794,11 @@ Program::optimizeSSA(int level)
   RUN_PASS(2, AlgebraicOpt, run);
   RUN_PASS(2, ModifierFolding, run); // before load propagation -> less checks
   RUN_PASS(1, ConstantFolding, foldAll);
-   RUN_PASS(2, LateAlgebraicOpt, run);
   RUN_PASS(1, Split64BitOpPreRA, run);
   RUN_PASS(1, LoadPropagation, run);
   RUN_PASS(1, IndirectPropagation, run);
   RUN_PASS(2, MemoryOpt, run);
+   RUN_PASS(2, LateAlgebraicOpt, run);
   RUN_PASS(2, LocalCSE, run);
   RUN_PASS(0, DeadCodeElim, buryAll);