aco/ngg: Place workgroup barrier outside control flow for NGG GS.

Merged shaders have a workgroup barrier which makes sure that
the first half is completed in every wave before the 2nd half
is started.

This barrier is located in divergent control flow, so that waves
that don't have any invocations in the 2nd half can finish as early
as possible. This is problematic for NGG GS because it has more
workgroup barriers after the 2nd half.

So, for NGG GS we need to put the barrier outside
control flow because otherwise the waves that have 0 GS threads
won't be able to wait for the waves which have non-zero GS threads.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6964>
This commit is contained in:
Timur Kristóf 2020-10-02 10:29:27 +02:00
parent 1129575d5e
commit df62c8fbea
1 changed files with 9 additions and 1 deletions

View File

@ -11381,6 +11381,13 @@ void select_program(Program *program,
bool check_merged_wave_info = ctx.tcs_in_out_eq ? i == 0 : ((shader_count >= 2 && !empty_shader) || ngg_no_gs);
bool endif_merged_wave_info = ctx.tcs_in_out_eq ? i == 1 : check_merged_wave_info;
if (i && ngg_gs) {
/* NGG GS waves need to wait for each other after the GS half is done. */
Builder bld(ctx.program, ctx.block);
create_workgroup_barrier(bld);
}
if (check_merged_wave_info) {
Temp cond = merged_wave_info_to_mask(&ctx, i);
begin_divergent_if_then(&ctx, &ic_merged_wave_info, cond);
@ -11389,7 +11396,8 @@ void select_program(Program *program,
if (i) {
Builder bld(ctx.program, ctx.block);
create_workgroup_barrier(bld);
if (!ngg_gs)
create_workgroup_barrier(bld);
if (ctx.stage == vertex_geometry_gs || ctx.stage == tess_eval_geometry_gs) {
ctx.gs_wave_id = bld.sop2(aco_opcode::s_bfe_u32, bld.def(s1, m0), bld.def(s1, scc), get_arg(&ctx, args->merged_wave_info), Operand((8u << 16) | 16u));