i965/fs: Don't disable SIMD16 when using the pixel interpolator
There was a comment saying that in SIMD16 mode the pixel interpolator returns coords interleaved 8 channels at a time and that this requires extra work to support. However, this interleaved format is exactly what the PLN instruction requires, so I don't think anything needs to be done to support it apart from removing the line that disables it and ensuring that the message lengths for the send message are correct.

I am more convinced that this is correct because, as the comment itself says, this interleaved output is identical to what is given in the thread payload. The code generated to apply the plane equation to these coordinates is identical on SIMD16 and SIMD8 except that the dispatch width is larger, which implies that no special unmangling is needed.

Perhaps the confusion stems from the fact that the description of the PLN instruction in the IVB PRM seems to imply that the src1 inputs are not interleaved, so it wouldn't work. However, in the HSW and BDW PRMs the pseudo-code is different and looks like it expects the interleaved format. Mesa doesn't seem to generate different code on IVB to uninterleave the payload registers and everything is working, so I can only assume that the PRM is wrong.

I tested the interpolateAt tests on HSW and did a full Piglit run on IVB, and there were no regressions.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
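The interleaved layout described in the message can be illustrated with a small host-side model. This is only a sketch under stated assumptions: a "register" is modelled as 8 floats, and `kRegWidth`, `interleaved_reg`, `interleaved_slot`, and `eval_plane` are hypothetical names that do not exist in Mesa. It does not reproduce the exact PLN semantics from the PRM; it only shows that x and y for any channel can be read directly from registers interleaved 8 channels at a time, in SIMD8 and SIMD16 alike.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical model: one "register" holds 8 floats (one SIMD8 group).
constexpr std::size_t kRegWidth = 8;

// Register holding component `comp` (0 = x, 1 = y) for `channel` when the
// coordinates are interleaved 8 channels at a time:
//   SIMD8:  reg0 = x[0..7], reg1 = y[0..7]
//   SIMD16: reg0 = x[0..7], reg1 = y[0..7], reg2 = x[8..15], reg3 = y[8..15]
constexpr std::size_t interleaved_reg(std::size_t channel, std::size_t comp)
{
    return 2 * (channel / kRegWidth) + comp;
}

// Slot of `channel` within its register.
constexpr std::size_t interleaved_slot(std::size_t channel)
{
    return channel % kRegWidth;
}

// Evaluate a plane equation a*x + b*y + c for one channel, reading x and y
// straight from the interleaved registers without any reshuffling.
// This is a simplified stand-in for the plane evaluation, not the PLN
// pseudo-code from the PRM.
template <std::size_t NumRegs>
float eval_plane(const std::array<std::array<float, kRegWidth>, NumRegs> &regs,
                 std::size_t channel, float a, float b, float c)
{
    assert(channel < (NumRegs / 2) * kRegWidth);
    const float x = regs[interleaved_reg(channel, 0)][interleaved_slot(channel)];
    const float y = regs[interleaved_reg(channel, 1)][interleaved_slot(channel)];
    return a * x + b * y + c;
}
```

With `NumRegs = 2` this models SIMD8 and with `NumRegs = 4` it models SIMD16; the indexing code is the same in both cases, which is the point the message makes about the generated code.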
parent 89bd5ee64c
commit 7abc1e3286
@@ -1481,12 +1481,6 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
    case nir_intrinsic_interp_var_at_centroid:
    case nir_intrinsic_interp_var_at_sample:
    case nir_intrinsic_interp_var_at_offset: {
-      /* in SIMD16 mode, the pixel interpolator returns coords interleaved
-       * 8 channels at a time, same as the barycentric coords presented in
-       * the FS payload. this requires a bit of extra work to support.
-       */
-      no16("interpolate_at_* not yet supported in SIMD16 mode.");
-
       fs_reg dst_xy = bld.vgrf(BRW_REGISTER_TYPE_F, 2);
 
       /* For most messages, we need one reg of ignored data; the hardware
@@ -1551,7 +1545,7 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
                         bld.SEL(offset(src, bld, i), itemp, fs_reg(7)));
             }
 
-            mlen = 2;
+            mlen = 2 * dispatch_width / 8;
             inst = bld.emit(FS_OPCODE_INTERPOLATE_AT_PER_SLOT_OFFSET, dst_xy, src,
                             fs_reg(0u));
          }
@@ -1563,7 +1557,8 @@ fs_visitor::nir_emit_intrinsic(const fs_builder &bld, nir_intrinsic_instr *instr
       }
 
       inst->mlen = mlen;
-      inst->regs_written = 2; /* 2 floats per slot returned */
+      /* 2 floats per slot returned */
+      inst->regs_written = 2 * dispatch_width / 8;
       inst->pi_noperspective = instr->variables[0]->var->data.interpolation ==
                                INTERP_QUALIFIER_NOPERSPECTIVE;
 
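The arithmetic behind the new `mlen` and `regs_written` values can be checked in isolation. The helper below is a hypothetical stand-alone sketch, not Mesa code: `regs_for_xy` is an invented name, and it assumes, as the diff's expression `2 * dispatch_width / 8` suggests, that one register carries 8 floats and that two floats (x and y) are needed per channel.

```cpp
#include <cassert>

// Hypothetical helper, not part of Mesa: number of 8-float registers needed
// to carry two floats (x and y) per channel at a given dispatch width.
// Mirrors the expression `2 * dispatch_width / 8` used in the diff.
inline unsigned regs_for_xy(unsigned dispatch_width)
{
    // Only the two dispatch widths discussed in this commit are modelled.
    assert(dispatch_width == 8 || dispatch_width == 16);
    return 2 * dispatch_width / 8;
}
```

For a dispatch width of 8 this gives 2, matching the constant that the diff replaces; for a dispatch width of 16 it gives 4, which is why the hard-coded value had to go once SIMD16 was enabled.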