broadcom/compiler: do not DCE ldunifa
ldunifa reads a uniform from the unifa address and updates the unifa
address implicitly, so if we dead-code-eliminate one, a follow-up
ldunifa will not read from the appropriate address.

We could avoid this if the compiler ensured that every ldunifa is
paired with an explicit unifa. For example, if we are reading a
vec4, we could emit:

unifa (addr)
ldunifa
unifa (addr+4)
ldunifa
unifa (addr+8)
ldunifa
unifa (addr+12)
ldunifa

instead of:

unifa (addr)
ldunifa
ldunifa
ldunifa
ldunifa

But since each unifa requires 3 delay slots before we can do a
ldunifa, that would end up being quite expensive.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8980>
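To put a rough number on "quite expensive", here is a minimal sketch of a worst-case cost count. It is not Mesa code: the function names are hypothetical, and it assumes one slot per instruction and that the 3 delay slots after each unifa are not filled with other useful work (a real scheduler may hide some of them).

```c
/* Taken from the commit message: a unifa write needs 3 delay slots
 * before a ldunifa can use the new address.
 */
#define UNIFA_DELAY_SLOTS 3

/* Hypothetical worst-case model: one unifa, its delay slots, then one
 * ldunifa per component (the sequence the compiler emits today).
 */
static unsigned
cost_single_unifa(unsigned num_components)
{
        if (num_components == 0)
                return 0;
        return 1 + UNIFA_DELAY_SLOTS + num_components;
}

/* Hypothetical worst-case model: every ldunifa gets its own unifa and
 * therefore its own delay slots (the alternative the commit rejects).
 */
static unsigned
cost_unifa_per_load(unsigned num_components)
{
        return num_components * (1 + UNIFA_DELAY_SLOTS + 1);
}
```

Under these assumptions a vec4 read costs 8 slots with a single unifa and 20 slots with one unifa per load.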
parent efc75e13ea
commit c2a04aca48
@@ -84,6 +84,17 @@ vir_has_side_effects(struct v3d_compile *c, struct qinst *inst)
                return true;
        }

        /* ldunifa works like ldunif: it reads an element and advances the
         * pointer, so each read has a side effect (we don't care for ldunif
         * because we reconstruct the uniform stream buffer after compiling
         * with the surviving uniforms), so allowing DCE to remove
         * one would break follow-up loads. We could fix this by emitting a
         * unifa for each ldunifa, but each unifa requires 3 delay slots
         * before a ldunifa, so that would be quite expensive.
         */
        if (inst->qpu.sig.ldunifa || inst->qpu.sig.ldunifarf)
                return true;

        return false;
}
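The failure this change prevents can be illustrated with a toy model. The sketch below is not Mesa code and does not use the V3D instruction encoding: the `toy_` names are hypothetical, memory is a plain array of 32-bit words, and "DCE" is simulated by skipping loads whose result is unused.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy state: a word-indexed memory and the current unifa
 * address in bytes.
 */
struct toy_unifa_state {
        const uint32_t *mem;
        uint32_t addr;
};

/* Models an explicit unifa write: sets the read address. */
static void
toy_unifa(struct toy_unifa_state *s, uint32_t addr)
{
        s->addr = addr;
}

/* Models ldunifa: reads one word and implicitly advances the address. */
static uint32_t
toy_ldunifa(struct toy_unifa_state *s)
{
        uint32_t value = s->mem[s->addr / 4];
        s->addr += 4;
        return value;
}

/* Emits one unifa followed by loads for components 0..wanted, where
 * only the last result is used. If drop_dead_loads is set, the unused
 * loads are skipped, as an unrestricted DCE pass would do. mem must
 * hold at least wanted + 1 words.
 */
static uint32_t
toy_read_component(const uint32_t *mem, unsigned wanted,
                   bool drop_dead_loads)
{
        struct toy_unifa_state s = { .mem = mem, .addr = 0 };
        uint32_t result = 0;

        toy_unifa(&s, 0);
        for (unsigned i = 0; i <= wanted; i++) {
                bool is_dead = (i != wanted);
                if (is_dead && drop_dead_loads)
                        continue;

                uint32_t value = toy_ldunifa(&s);
                if (!is_dead)
                        result = value;
        }

        return result;
}
```

With memory holding the words 10, 20, 30, 40, reading component 2 returns 30 when every load is kept, but returns 10 when the two unused loads are dropped, because the surviving load reads from the address that was never advanced.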