turnip: Fix the lack of WFM before indirect draws

We have to add WFM to the pending bits when flushing into the CP, so
that indirect draws know when they should apply the WFM workaround.

Fixes CTS tests:
dEQP-VK.draw.renderpass.indirect_draw.*_data_from_compute.indirect_draw_count*

Fixes: abf0ae014a ("tu: Properly handle waiting on an earlier pipeline stage")

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15577>
Danylo Piliaiev 2022-03-25 15:26:52 +02:00 committed by Marge Bot
parent 1994f1404e
commit 37939e9c54
1 changed file with 7 additions and 1 deletion

@@ -2868,6 +2868,9 @@ tu_flush_for_stage(struct tu_cache_state *cache,
     * for any WFI's to finish. This is already done for draw calls, including
     * before indirect param reads, for the most part, so we just need to WFI.
     *
+    * However, some indirect draw opcodes, depending on firmware, don't have
+    * implicit CP_WAIT_FOR_ME so we have to handle it manually.
+    *
     * Transform feedback counters are read via CP_MEM_TO_REG, which implicitly
     * does CP_WAIT_FOR_ME, but we still need a WFI if the GPU writes it.
     *
@@ -2879,8 +2882,11 @@ tu_flush_for_stage(struct tu_cache_state *cache,
     * future, or if CP_DRAW_PRED_SET grows the capability to do 32-bit
     * comparisons, then this will have to be dealt with.
     */
-   if (src_stage > dst_stage)
+   if (src_stage > dst_stage) {
       cache->flush_bits |= TU_CMD_FLAG_WAIT_FOR_IDLE;
+      if (dst_stage == TU_STAGE_CP)
+         cache->pending_flush_bits |= TU_CMD_FLAG_WAIT_FOR_ME;
+   }
 }

 static enum tu_cmd_access_mask
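
The effect of the change can be illustrated in isolation with a minimal, standalone sketch. Everything prefixed with `sketch_`/`SKETCH_` below is a hypothetical stand-in for the driver's types, not the real turnip definitions: the two-stage enum, the bit values and the struct layout are assumptions made only so the example compiles on its own. It shows that when a later stage is waited on by the CP stage, a wait-for-idle is requested immediately and a wait-for-me is additionally recorded in the pending bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stages; only their relative order matters in this sketch. */
enum sketch_stage {
   SKETCH_STAGE_CP,    /* command processor, e.g. reading indirect params */
   SKETCH_STAGE_GPU,   /* later GPU work, e.g. a compute shader writing them */
   SKETCH_STAGE_COUNT,
};

/* Hypothetical flag values, chosen only for illustration. */
enum sketch_flush_bits {
   SKETCH_WAIT_FOR_IDLE = 1u << 0,
   SKETCH_WAIT_FOR_ME   = 1u << 1,
};

struct sketch_cache_state {
   uint32_t flush_bits;         /* in this sketch: requested right away */
   uint32_t pending_flush_bits; /* in this sketch: recorded for later use */
};

static void
sketch_flush_for_stage(struct sketch_cache_state *cache,
                       enum sketch_stage src_stage,
                       enum sketch_stage dst_stage)
{
   assert(src_stage < SKETCH_STAGE_COUNT && dst_stage < SKETCH_STAGE_COUNT);

   /* Mirrors the patched condition: an earlier stage waits on a later one. */
   if (src_stage > dst_stage) {
      cache->flush_bits |= SKETCH_WAIT_FOR_IDLE;
      /* Only the CP destination also records a pending wait-for-me. */
      if (dst_stage == SKETCH_STAGE_CP)
         cache->pending_flush_bits |= SKETCH_WAIT_FOR_ME;
   }
}
```

Calling `sketch_flush_for_stage(&cache, SKETCH_STAGE_GPU, SKETCH_STAGE_CP)` on a zeroed state sets both bits, while a call with equal source and destination stages leaves the state untouched. How the real driver later consumes the pending bit for indirect draws is outside the scope of this sketch.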