pan/bi: Align accesses with packed TLS

When lowering vars to scratch, we need to be careful with alignment on Valhall,
where packed TLS access must not straddle a 16-byte boundary. Fixes regressions
when enabling indirect access to temps on Valhall.

Fixes: 6761dbf891 ("panfrost: Use packed TLS on Valhall")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
This commit is contained in:
Alyssa Rosenzweig 2022-06-16 19:14:58 -04:00 committed by Marge Bot
parent 5ee1179c94
commit 1a882ecdab
1 changed files with 10 additions and 0 deletions

View File

@ -4835,9 +4835,19 @@ bi_finalize_nir(nir_shader *nir, unsigned gpu_id, bool is_blend)
/* Get rid of any global vars before we lower to scratch. */
NIR_PASS_V(nir, nir_lower_global_vars_to_local);
/* Valhall introduces packed thread local storage, which improves cache
* locality of TLS access. However, access to packed TLS cannot
* straddle 16-byte boundaries. As such, when packed TLS is in use
* (currently unconditional for Valhall), we force vec4 alignment for
* scratch access.
*/
bool packed_tls = (gpu_id >= 0x9000);
/* Lower large arrays to scratch and small arrays to bcsel (TODO: tune
* threshold, but not until addresses / csel is optimized better) */
NIR_PASS_V(nir, nir_lower_vars_to_scratch, nir_var_function_temp, 16,
packed_tls ?
glsl_get_vec4_size_align_bytes :
glsl_get_natural_size_align_bytes);
NIR_PASS_V(nir, nir_lower_indirect_derefs, nir_var_function_temp, ~0);