intel/fs/gen6: Use SEL instead of bashing thread payload for unlit centroid workaround.

This prevents regressions on SNB due to the redundant MOVs lying
around in cases where fetch_payload_reg() returns a VGRF (currently
only in SIMD32 but soon in pretty much all cases).  The MOVs can't be
register-coalesced due to their source being a FIXED_GRF, and they
can't be copy-propagated either due to the unlit centroid workaround
partial writes.  They can be copy-propagated just fine into a SEL
instruction though.
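
The difference between the two forms can be sketched outside the compiler. The following is a minimal model, not Mesa code: `Reg`, `predicated_mov_inv`, `sel` and `sequences_agree` are hypothetical names, and a register is assumed to be 8 float channels with one flag bit per channel. It shows that an inverted-predicate MOV only partially writes its destination, while a SEL fully defines a fresh destination from two sources and gives the same per-channel result.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for one 8-channel float register.
using Reg = std::array<float, 8>;

// Model of the old form: an inverted-predicate MOV. Only channels whose
// flag bit is clear are overwritten; the rest of dst keeps its previous
// contents, so this is a partial write of dst.
inline void predicated_mov_inv(Reg &dst, const Reg &src, std::uint8_t flag)
{
    for (std::size_t ch = 0; ch < dst.size(); ch++) {
        if (!((flag >> ch) & 1))
            dst[ch] = src[ch];
    }
}

// Model of the new form: SEL writes every channel of a fresh destination,
// taking src0 where the flag bit is set and src1 where it is clear.
inline Reg sel(const Reg &src0, const Reg &src1, std::uint8_t flag)
{
    Reg dst{};
    for (std::size_t ch = 0; ch < dst.size(); ch++)
        dst[ch] = ((flag >> ch) & 1) ? src0[ch] : src1[ch];
    return dst;
}

// Checks that both forms agree for every possible 8-bit flag value.
inline bool sequences_agree(const Reg &centroid, const Reg &pixel)
{
    for (unsigned flag = 0; flag < 256; flag++) {
        Reg old_result = centroid; // MOV form starts from the centroid data
        predicated_mov_inv(old_result, pixel, static_cast<std::uint8_t>(flag));
        const Reg new_result =
            sel(centroid, pixel, static_cast<std::uint8_t>(flag));
        if (old_result != new_result)
            return false;
    }
    return true;
}
```

In this model the MOV form needs the centroid data to already be in the destination, whereas the SEL form reads both inputs as ordinary sources, which is the property the message relies on for copy propagation.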

On SNB this prevents the following shader-db regressions (including
SIMD32 programs) in combination with the interpolation rework part of
this series:

   total instructions in shared programs: 13996898 -> 14001982 (0.04%)
   instructions in affected programs: 197461 -> 202545 (2.57%)
   helped: 0
   HURT: 1251

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Francisco Jerez 2020-01-03 15:58:05 -08:00
parent 0dd18d70ae
commit 9c9e80103c
1 changed file with 8 additions and 5 deletions

@@ -351,14 +351,17 @@ fs_visitor::emit_interpolation_setup_gen6()
          if (!(centroid_modes & (1 << i)))
             continue;
+         const fs_reg centroid_delta_xy = delta_xy[i];
          const fs_reg &pixel_delta_xy = delta_xy[i - 1];
-         for (unsigned q = 0; q < dispatch_width / 8; q++) {
-            for (unsigned c = 0; c < 2; c++) {
+         delta_xy[i] = bld.vgrf(BRW_REGISTER_TYPE_F, 2);
+         for (unsigned c = 0; c < 2; c++) {
+            for (unsigned q = 0; q < dispatch_width / 8; q++) {
                const unsigned idx = c + (q & 2) + (q & 1) * dispatch_width / 8;
-               set_predicate_inv(
-                  BRW_PREDICATE_NORMAL, true,
-                  bld.half(q).MOV(horiz_offset(delta_xy[i], idx * 8),
+               set_predicate(BRW_PREDICATE_NORMAL,
+                  bld.half(q).SEL(horiz_offset(delta_xy[i], idx * 8),
+                                  horiz_offset(centroid_delta_xy, idx * 8),
                                   horiz_offset(pixel_delta_xy, idx * 8)));
             }
          }
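
The `idx` expression in the hunk is easy to misread, so here is a small standalone sketch that evaluates it. It is not Mesa code: `payload_idx` and `indices_form_permutation` are hypothetical helper names, and the only thing taken from the diff is the index expression and the loop bounds. It checks that, for a given dispatch width, every (c, q) pair lands on a distinct 8-wide slice index.

```cpp
// Mirrors the index expression from the diff:
//   idx = c + (q & 2) + (q & 1) * dispatch_width / 8
constexpr unsigned payload_idx(unsigned c, unsigned q, unsigned dispatch_width)
{
    return c + (q & 2) + (q & 1) * dispatch_width / 8;
}

// Returns true if the (c, q) pairs visited by the loops in the diff map to
// each slice index in [0, 2 * dispatch_width / 8) exactly once.
inline bool indices_form_permutation(unsigned dispatch_width)
{
    const unsigned n = 2 * (dispatch_width / 8);
    unsigned seen = 0; // bit i is set once slice index i has been produced
    for (unsigned c = 0; c < 2; c++) {
        for (unsigned q = 0; q < dispatch_width / 8; q++) {
            const unsigned idx = payload_idx(c, q, dispatch_width);
            if (idx >= n || (seen & (1u << idx)))
                return false;
            seen |= 1u << idx;
        }
    }
    return seen == (1u << n) - 1;
}
```

For a dispatch width of 32, q = 0, 1, 2, 3 with c = 0 gives indices 0, 4, 2, 6, and c = 1 gives 1, 5, 3, 7. Swapping the nesting order of the two loops, as the diff does, changes only the order in which these indices are visited, not the set.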