i965/vec4: Fix saturation errors when coalescing registers

If the register types do not match and the instruction
that contains the final destination is saturated, register
coalescing generated non-equivalent code.

This did not happen when using IR because types usually
matched, but it is visible in nir-vec4.

For example,
   mov      vgrf7:D vgrf2:D
   mov.sat  m4:F vgrf7:F

is coalesced to:
   mov.sat  m4:D vgrf2:D

The patch prevents coalescing in such a scenario, unless the
instruction we want to coalesce into is a MOV (without type
conversion implied). In that case, the patch sets the register
types to the type of the final destination.
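
To make the failure concrete, here is a minimal standalone sketch (not Mesa
code) that models the three instruction sequences on raw 32-bit register
contents. All function names are hypothetical. It assumes that saturating a
float destination clamps to [0.0, 1.0], and that saturating a D-to-D move
clamps to the signed 32-bit range and therefore leaves every value unchanged.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstring>

// Hypothetical helpers for illustration only; none of these exist in Mesa.

// Reinterpret a float as its raw 32-bit register contents.
static uint32_t float_bits(float f) {
    uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);
    return bits;
}

// mov dst:D src:D -- a raw copy with no type conversion.
static uint32_t mov_d(uint32_t bits) { return bits; }

// mov.sat dst:F src:F -- assumed to clamp the float value to [0.0, 1.0].
static uint32_t mov_sat_f(uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    f = std::min(std::max(f, 0.0f), 1.0f);
    return float_bits(f);
}

// mov.sat dst:D src:D -- assumed to clamp to the int32 range, which every
// 32-bit source already satisfies, so the value passes through unchanged.
static uint32_t mov_sat_d(uint32_t bits) { return bits; }

// Original pair: mov vgrf7:D vgrf2:D ; mov.sat m4:F vgrf7:F
static uint32_t original_sequence(uint32_t vgrf2) {
    return mov_sat_f(mov_d(vgrf2));
}

// Incorrectly coalesced: mov.sat m4:D vgrf2:D
static uint32_t bad_coalesce(uint32_t vgrf2) { return mov_sat_d(vgrf2); }

// Coalesced with the types rewritten to F: mov.sat m4:F vgrf2:F
static uint32_t fixed_coalesce(uint32_t vgrf2) { return mov_sat_f(vgrf2); }
```

With vgrf2 holding the bit pattern of 2.5f, original_sequence and
fixed_coalesce both return the bits of 1.0f, while bad_coalesce returns the
bits of 2.5f unchanged. Retyping the MOV is safe because a MOV without type
conversion copies the bits identically under either type.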

Shader-db results on HSW (only vec4 instructions shown):

total instructions in shared programs: 1754415 -> 1754416 (0.00%)
instructions in affected programs:     74 -> 75 (1.35%)
helped:                                0
HURT:                                  1
GAINED:                                0
LOST:                                  0

The only change is one extra instruction in a single shader, which comes
from eliminating a saturation error by preventing register coalescing.

Cc: "10.6 11.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Antia Puentes 2015-08-05 15:57:33 +02:00 committed by Iago Toral Quiroga
parent d1bce52e13
commit 79f1a7ae28
1 changed files with 21 additions and 0 deletions

@@ -1065,6 +1065,17 @@ vec4_visitor::opt_register_coalesce()
}
}
/* This doesn't handle saturation on the instruction we
* want to coalesce away if the register types do not match.
* But if scan_inst is a non type-converting 'mov', we can fix
* the types later.
*/
if (inst->saturate &&
inst->dst.type != scan_inst->dst.type &&
!(scan_inst->opcode == BRW_OPCODE_MOV &&
scan_inst->dst.type == scan_inst->src[0].type))
break;
/* If we can't handle the swizzle, bail. */
if (!scan_inst->can_reswizzle(inst->dst.writemask,
inst->src[0].swizzle,
@@ -1142,6 +1153,16 @@ vec4_visitor::opt_register_coalesce()
scan_inst->dst.file = inst->dst.file;
scan_inst->dst.reg = inst->dst.reg;
scan_inst->dst.reg_offset = inst->dst.reg_offset;
if (inst->saturate &&
inst->dst.type != scan_inst->dst.type) {
/* If we have reached this point, scan_inst is a non
* type-converting 'mov' and we can modify its register types
* to match the ones in inst. Otherwise, we could have an
* incorrect saturation result.
*/
scan_inst->dst.type = inst->dst.type;
scan_inst->src[0].type = inst->src[0].type;
}
scan_inst->saturate |= inst->saturate;
}
scan_inst = (vec4_instruction *)scan_inst->next;