From c5be4445004e4980a1897b904fc206b3d030c58f Mon Sep 17 00:00:00 2001
From: Connor Abbott
Date: Sat, 16 Jul 2022 22:58:59 +0200
Subject: [PATCH] tu: Treat CP_WAIT_FOR_ME as a cache invalidate

The workaround for draws that need a CP_WAIT_FOR_ME didn't work if the
barrier before the draw is in a separate command buffer from the draw.
The barrier would add a pending CP_WAIT_FOR_ME, but it would get dropped
on the floor at the end of the command buffer and the draw wouldn't have
a pending CP_WAIT_FOR_ME so it wouldn't emit one.

We don't know in the barrier if the destination is a draw with the
workaround, so we have two options:

- Emit any pending CP_WAIT_FOR_ME at the end of the command buffer (and
  before secondaries) in case there is a workaround draw later. This
  will emit an extra CP_WAIT_FOR_ME at the end of the command buffer in
  case there is an indirect command barrier.
- Always assume at the beginning of the command buffer that there is a
  pending CP_WAIT_FOR_ME. This will emit an extra CP_WAIT_FOR_ME before
  the first workaround-requiring draw in the command buffer, in case
  there was a barrier earlier.

The only draws requiring a workaround are currently
vkCmdDraw*IndirectCount(), which we assume are rarer than indirect
command barriers, so we implement the second option. This entails
treating it as a cache invalidate.

This fixes some upcoming dynamic rendering CTS tests that do
vkCmdDrawIndirectCount() in a secondary but put the barrier for it in
the primary.

Fixes: 37939e9c546 ("turnip: Fix the lack of WFM before indirect draws")
Part-of:
---
 src/freedreno/vulkan/tu_private.h | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/src/freedreno/vulkan/tu_private.h b/src/freedreno/vulkan/tu_private.h
index d605268095b..9828693e722 100644
--- a/src/freedreno/vulkan/tu_private.h
+++ b/src/freedreno/vulkan/tu_private.h
@@ -1082,7 +1082,14 @@ enum tu_cmd_flush_bits {
    TU_CMD_FLAG_ALL_INVALIDATE =
       TU_CMD_FLAG_CCU_INVALIDATE_DEPTH |
       TU_CMD_FLAG_CCU_INVALIDATE_COLOR |
-      TU_CMD_FLAG_CACHE_INVALIDATE,
+      TU_CMD_FLAG_CACHE_INVALIDATE |
+      /* Treat CP_WAIT_FOR_ME as a "cache" that needs to be invalidated when a
+       * a command that needs CP_WAIT_FOR_ME is executed. This means we may
+       * insert an extra WAIT_FOR_ME before an indirect command requiring it
+       * in case there was another command before the current command buffer
+       * that it needs to wait for.
+       */
+      TU_CMD_FLAG_WAIT_FOR_ME,
 };
 
 /* Changing the CCU from sysmem mode to gmem mode or vice-versa is pretty