aco: propagate SGPRs into VOP1 instructions early.

This helps DCE. We should reconsider our optimization order
or maybe do the dead code analysis twice

Totals from 106 (0.08% of 136546) affected shaders (RAVEN):
SGPRs: 7184 -> 7152 (-0.45%)
CodeSize: 736912 -> 736052 (-0.12%)
Instrs: 145739 -> 145509 (-0.16%)
Cycles: 2085344 -> 2084268 (-0.05%)
VMEM: 14819 -> 14807 (-0.08%)
SMEM: 7109 -> 7100 (-0.13%); split: +0.04%, -0.17%
SClause: 5383 -> 5385 (+0.04%)
Copies: 13290 -> 13189 (-0.76%)
PreSGPRs: 5265 -> 5221 (-0.84%)

Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6777>
This commit is contained in:
Daniel Schürmann 2020-09-18 00:00:38 +01:00 committed by Marge Bot
parent 3424e17b9a
commit d887eb141b
1 changed files with 5 additions and 0 deletions

View File

@ -877,6 +877,11 @@ void label_instruction(opt_ctx &ctx, Block& block, aco_ptr<Instruction>& instr)
instr->operands[i].setTemp(info.temp);
info = ctx.info[info.temp.id()];
}
/* applying SGPRs to VOP1 doesn't increase code size and DCE is helped by doing it earlier */
if (info.is_temp() && info.temp.type() == RegType::sgpr && can_apply_sgprs(instr) && instr->operands.size() == 1) {
instr->operands[i].setTemp(info.temp);
info = ctx.info[info.temp.id()];
}
/* for instructions other than v_cndmask_b32, the size of the instruction should match the operand size */
unsigned can_use_mod = instr->opcode != aco_opcode::v_cndmask_b32 || instr->operands[i].getTemp().bytes() == 4;