Commit Graph

647 Commits

Author SHA1 Message Date
Bas Nieuwenhuizen 9475782eac radv: Implement VK_KHR_imageless_framebuffer.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:35:19 +02:00
Bas Nieuwenhuizen a7041f3b4e radv: Store image view also outside framebuffer.
So we can use it with imageless framebuffers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:19:16 +02:00
Bas Nieuwenhuizen 49e6c2fb78 radv: Store color/depth surface info in attachment info instead of framebuffer.
That way we can use it for imageless framebuffers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:18:51 +02:00
Samuel Pitoiset 7368000868 radv: re-apply "Optimize rebinding the same descriptor set."
This makes it cheaper to just change the dynamic offsets with
the same descriptor sets.

This optimization has been reverted a while back because of
random GPU hangs on GFX9, no it looks fine, at least CTS no longer
hangs on GFX9 and it doesn't hang on GFX10 as well.

It fixes a performance problem with Wolfenstein Youngblood.

Suggested-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
2019-08-02 09:56:55 +02:00
Samuel Pitoiset 0e1724af61 radv/gfx10: implement a bug workaround for NGG -> legacy transitions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-31 12:14:29 +02:00
Samuel Pitoiset 29cca5f381 radv: skip draw calls with 0-sized index buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-31 12:14:29 +02:00
Eric Engestrom abc226cf41 tree-wide: replace MAYBE_UNUSED with ASSERTED
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-31 09:41:05 +01:00
Samuel Pitoiset 372c3dcfdb radv: implement VK_EXT_index_type_uint8
Natively supported on VI+.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-29 23:36:53 +02:00
Samuel Pitoiset fd195d8085 radv/gfx10: update streamout descriptors
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-24 08:23:27 +02:00
Bas Nieuwenhuizen 4058b354c5 radv: Set FLUSH_ON_BINNING_TRANSITION.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-23 21:26:59 +02:00
Bas Nieuwenhuizen 906fcfccfd radv: Use pbb_allow for framebuffer BREAK_BATCH.
Ported from radeonsi.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-07-23 21:26:59 +02:00
Samuel Pitoiset 8c97a07967 radv/gfx10: do not allocate space for the ZPASS_DONE bug
GFX10 isn't affected.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-22 09:02:35 +02:00
Samuel Pitoiset e7c356866e radv: change a bunch of >= GFX9 to == GFX9
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-22 09:02:26 +02:00
Samuel Pitoiset ed53d2c4be radv/gfx10: disable the TC compat zrange workaround
Unnecessary.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-17 08:32:36 +02:00
Samuel Pitoiset ae4b1fc095 radv/gfx10: always build the GS copy shader but uses it on-demand
It should be possible to build it on-demand too but it requires
more work. On GFX10, the GS copy shader is required when tess
is enabled with extreme geometry.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-17 08:32:30 +02:00
Samuel Pitoiset afa102d65b radv: add radv_emit_streamout_{begin,end} helpers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-07-16 11:17:00 +02:00
Samuel Pitoiset 4dcdc4cdc5 radv: allow to select DST_SEL with RELEASE_MEM
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-07-16 11:16:57 +02:00
Samuel Pitoiset b393b2ce95 radv/gfx10: emit DISABLE_CONSERVATIVE_ZPASS_COUNTS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-12 17:47:12 +02:00
Samuel Pitoiset 37aefb2be1 radv/gfx10: invalidate everything in L2 when shaders read data
This includes metadata as well. On GFX10, we have to invalidate
the L2 metadata cache when shaders read DCC.

Note that we still have to implement GFX10 coherency by
introducing INV_L2_METATADA but for now just flush L2.

This fixes a corruption with DCC and Talos.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-12 14:08:12 +02:00
Samuel Pitoiset ffd6a979bf radv/gfx10: update OVERWRITE_COMBINER_{MRT_SHARING,WATERMARK}
DCC related, mirror RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com
2019-07-12 08:19:53 +02:00
Bas Nieuwenhuizen 45b73b3aa9 radv/gfx10: Do not allocate a gs_copy_shader on gfx10.
Will use ngg for any gs anyway.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-11 15:45:47 +02:00
Bas Nieuwenhuizen d0978427cb radv/gfx10: Use new uconfig reg index packet for GFX10+.
Otherwise the hardware/firmware seems to not set the registers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-07 17:51:32 +02:00
Samuel Pitoiset 67b6888d8b radv/gfx10: emit GE_CNTL instead of IA_MULTI_VGT_PARAM for legacy mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:51:32 +02:00
Samuel Pitoiset 12a42c2d9f radv/gfx10: implement radv_flush_vertex_descriptors() change
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:51:32 +02:00
Samuel Pitoiset ebeb319f0e radv/gfx10: implement radv_CmdBindDescriptorSets()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:51:31 +02:00
Samuel Pitoiset 2435b571de radv/gfx10: update DB_Z_INFO register
GFX10 uses the same register as GFX8.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:03:38 +02:00
Samuel Pitoiset 17048c1765 radv/gfx10: implement radv_emit_fb_ds_state()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:03:38 +02:00
Samuel Pitoiset c2a5d98148 radv/gfx10: implement radv_emit_fb_color_state()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:03:38 +02:00
Bas Nieuwenhuizen c6cb9b197d radv: Support VK_EXT_queue_family_foreign.
Basically same as external for now.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Only case we might need to handle differently in the near future
is Raven's case of displayable DCC which is not renderable. But
we don't support that yet.
2019-07-03 10:56:21 +00:00
Samuel Pitoiset 6baa453dd5 radv: remove unused code in radv_update_tc_compat_zrange_metadata()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 08:51:58 +02:00
Samuel Pitoiset e41e575e24 radv: implement clearing DCC layers on GFX8
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-02 09:37:56 +02:00
Samuel Pitoiset cc50c85e13 radv: make sure to mark the image as compressed when clearing DCC levels
Found while working on DCC for arrays.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-01 14:58:56 +02:00
Samuel Pitoiset 5d6d29ed5d radv: add si_emit_ia_multi_vgt_param() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-28 08:40:42 +02:00
Samuel Pitoiset 8ea7ee1536 radv: rename and re-document cache flush flags
SMEM and VMEM caches are L0 on gfx10. Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-25 18:38:37 +02:00
Samuel Pitoiset 5411f47056 radv: set DISABLE_CONSTANT_ENCODE_REG to 1 for Raven2
Ported from RadeonSI, will be emitted for GFX10 too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-25 16:45:15 +02:00
Samuel Pitoiset 34bef8a0d7 radv: clear CMASK layers instead of the whole buffer on GFX8
This reduces the size of fill operations needed to clear CMASK
for layered color textures.

GFX9 unsupported for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-25 16:36:28 +02:00
Samuel Pitoiset 476b907a3b radv: clear FMASK layers instead of the whole buffer on GFX8
This reduces the size of fill operations needed to clear FMASK
for layered color textures.

GFX9 unsupported for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-25 16:36:25 +02:00
Samuel Pitoiset a5ba386b3f radv: always initialize levels without DCC as fully expanded
This fixes a rendering issue with RoTR/DXVK.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-25 16:36:23 +02:00
Samuel Pitoiset 946193ae00 radv: add support for VK_AMD_buffer_marker
This simple extension might be useful for debugging purposes.
GAPID has support for it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-24 10:50:54 +02:00
Samuel Pitoiset e67fc11c26 radv: pass sample locations for transitions before depth/stencil resolves
HTILE decompressions need the user sample locations if specified
in the current subpass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-21 14:50:35 +02:00
Samuel Pitoiset 203f60ebf2 radv: emit framebuffer state from primary if secondary doesn't inherit it
Otherwise fast color/depth clears can't work because they depend
on the framebuffer.

This fixes the following CTS (when the small hint is disabled):
- dEQP-VK.geometry.layered.1d_array.secondary_cmd_buffer
- dEQP-VK.geometry.layered.2d_array.secondary_cmd_buffer
- dEQP-VK.geometry.layered.cube.secondary_cmd_buffer
- dEQP-VK.geometry.layered.cube_array.secondary_cmd_buffer

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110810
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107986
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-21 13:49:35 +02:00
Samuel Pitoiset dc6e3053a7 radv: initialize levels without DCC during layout transitions
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-20 11:03:49 +02:00
Samuel Pitoiset e91c1ea06c radv: implement compressed FMASK texture reads with RADV_PERFTEST=tccompatcmask
This allows us to disable the FMASK decompress pass when
transitioning from CB writes to shader reads.

This will likely be improved and enabled by default in the future.

No CTS regressions on GFX8 but a few number of multisample CTS
failures on GFX9 (they look related to the small hint).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-19 10:06:39 +02:00
Samuel Pitoiset 864ddda8a3 radv: check if DCC is enabled per mip not for the whole image
In other words, make use of radv_dcc_enabled() instead of
radv_image_has_dcc() all over the places.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-18 11:24:36 +02:00
Samuel Pitoiset 7832e75ea8 radv: load the fast color clear values from the base level
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-17 22:20:53 +02:00
Samuel Pitoiset 7971697efe radv: store the DCC predicate for each mip
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-17 22:20:53 +02:00
Samuel Pitoiset 38aa386e96 radv: store the FCE predicate for each mip
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-17 22:20:53 +02:00
Samuel Pitoiset 7295512037 radv: store the fast color clear values for each mip
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-17 22:20:53 +02:00
Samuel Pitoiset 6880b42cfc radv: silent a compiler warning in radv_CmdPushDescriptorSetKHR()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-17 09:53:26 +02:00
Bas Nieuwenhuizen 0667c1f14b radv: Skip transitions coming from external queue.
Transitions to external queue should do the transition & make sure
it works on all queues.

Fixes: 8ebc7dcb59 "radv: Allow fast clears with concurrent queue mask for some layouts."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-13 08:15:45 +00:00
Samuel Iglesias Gonsálvez 32e1d85cb6 radv: assert on inline uniform blocks in radv_CmdPushDescriptorSetKHR()
According to the Vulkan spec, inline uniform blocks are not allowed
to be updated through vkCmdPushDescriptorSetKHR().

These are the spec quotes from "13.2.1. Descriptor Set Layout"
that are relevant for this case:

"VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR specifies
 that descriptor sets must not be allocated using this layout, and
 descriptors are instead pushed by vkCmdPushDescriptorSetKHR."

"If flags contains
 VK_DESCRIPTOR_SET_LAYOUT_CREATE_PUSH_DESCRIPTOR_BIT_KHR, then all
 elements of pBindings must not have a descriptorType of
 VK_DESCRIPTOR_TYPE_INLINE_UNIFORM_BLOCK_EXT".

There is no explicit mention in vkCmdPushDescriptorSetKHR() to forbid
this case but it is implied in the creation of the descriptor set
layout as aforementioned.

Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-11 16:32:27 +02:00
Bas Nieuwenhuizen 39c71e0025 radv: Prevent out of bound shift on 32-bit builds.
uintptr_t is 32-bits then and shifting it by 32 bits results in undefined
behavior IIRC.

Fixes: b3c8de1c55 "radv: save all descriptor pointers into the trace BO"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-06-10 22:18:51 +00:00
Samuel Pitoiset e7677a697b radv: handle sample locations during automatic layout transitions
From the Vulkan spec 1.1.109:

   "Some implementations may need to evaluate depth image values
    while performing image layout transitions. To accommodate this,
    instances of the VkSampleLocationsInfoEXT structure can be
    specified for each situation where an explicit or automatic
    layout transition has to take place. [...] and
    VkRenderPassSampleLocationsBeginInfoEXT can be chained from
    VkRenderPassBeginInfo to provide sample locations for layout
    transitions performed implicitly by a render pass instance."

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:11:11 +02:00
Samuel Pitoiset f58e9f6d69 radv: handle sample locations during explicit depth/stencil transitions
From the Vulkan spec 1.1.109,

   "Some implementations may need to evaluate depth image values
    while performing image layout transitions. To accommodate this,
    instances of the VkSampleLocationsInfoEXT structure can be
    specified for each situation where an explicit or automatic
    layout transition has to take place. VkSampleLocationsInfoEXT
    can be chained from VkImageMemoryBarrier structures to provide
    sample locations for layout transitions performed by
    vkCmdWaitEvents and vkCmdPipelineBarrier calls."

This handles explicit depth/stencil layout transitions performed
with CmdWaitEvents() or CmdPipelineBarrier().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:11:01 +02:00
Samuel Pitoiset a20925f2a9 radv: allow the depth decompress pass to emit dynamic sample locations
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-07 13:11:00 +02:00
Samuel Pitoiset b9d3a6b656 radv: set the subpass before any initial subpass transitions
This might fix initial subpass transitions when multiview is used.
Noticed while implementing sample locations during layout transitions.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-06 10:00:29 +02:00
Nicolai Hähnle f480b8aaa4 amd/common: use generated register header 2019-06-03 20:05:20 -04:00
Samuel Pitoiset da26013eb7 radv: implement VK_EXT_sample_locations and disable it
Basically, this extension allows applications to use custom
sample locations. It doesn't support variable sample locations
during subpass. Note that we don't have to upload the user
sample locations because the spec doesn't allow this.

The extension is currently disabled because the driver needs to
support variable sample locations during layout transitions. The
depth decompress needs to know them and that's a bit invasive.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-30 09:52:16 +02:00
Samuel Pitoiset eaeaad25f7 radv: sync before resetting a pool if there is active pending queries
Make sure to sync all previous work if the given command buffer
has pending active queries. Otherwise the GPU might write queries
data after the reset operation.

This fixes a bunch of new dEQP-VK.query_pool.* CTS failures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-29 08:47:54 +02:00
Samuel Pitoiset 47a10edefb radv: allocate more space in the CS when emitting events
If the driver waits for CP DMA to be idle and emit an EOP event
we need more space.

This fixes a crash with Quake Champions.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-28 16:56:17 +02:00
Samuel Pitoiset 9af15986b0 radv: add radv_clear_htile() helper
This helper will be useful for clearing HTILE after some
depth/stencil resolves.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-27 13:52:34 +02:00
Samuel Pitoiset a7763ddcf2 radv: clean up the sample locations codebase
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-22 08:36:35 +02:00
Samuel Pitoiset daa85a882e radv: decompress FMASK before performing a MSAA decompress using FMASK
This fixes some CTS failures related to VK_EXT_sample_locations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-20 12:41:47 +02:00
Marek Olšák ccfcb9d818 ac: rename SI-CIK-VI to GFX6-GFX7-GFX8
Acked-by: Dave Airlie <airlied@redhat.com>

We already use GFX9 and I don't want us to have confusing naming
in the driver. GFXn naming is better from the driver perspective,
because it's the real version of the gfx portion of the hw. Also,
CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI.

It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have
nothing to do with GFXn and they have their own version numbers.
2019-05-15 20:54:10 -04:00
Samuel Pitoiset 5555db103e radv: bump reported version to 1.1.107
VK_AMD_draw_indirect_count has been promoted with the suffix
changed to KHR.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-13 21:38:01 +02:00
Józef Kucia 24af0f1318 radv: clear vertex bindings while resetting command buffer
Only vertex inputs accessed by vertex shader must have valid buffers
bound.

Signed-off-by: Józef Kucia <joseph.kucia@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Fixes: 5010436e09 "radv: bail out when binding the same vertex buffers"
2019-05-11 02:51:00 +02:00
Samuel Pitoiset 08be23bfde radv: set WD_SWITCH_ON_EOP=1 when drawing primitives from a stream output buffer
According to RadeonSI, this seems to be required by the hardware
to avoid GPU hangs. I think I just forgot to set that bit when I
implemented VK_EXT_transform_feedback.

This fixes a GPU hang with Space Engineers and DXVK.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110291
Fixes: b4eb029062 ("radv: implement VK_EXT_transform_feedback")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-02 15:55:46 +02:00
Samuel Pitoiset 6ac10e07c2 radv: implement a workaround for VK_EXT_conditional_rendering
Per the Vulkan spec 1.1.107, the predicate is a 32-bit value. Though
the AMD hardware treats it as a 64-bit value which means it might
fail to discard.

I don't know why this extension has been drafted like that but this
definitely not fit with AMD. The hardware doesn't seem to support
a 32-bit value for the predicate, so we need to implement a workaround.

This fixes an issue when DXVK enables conditional rendering with RADV,
this also fixes the Sasha conditionalrender demo.

Fixes: e45ba51ea4 ("radv: add support for VK_EXT_conditional_rendering")
Reported-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-02 09:24:05 +02:00
Bas Nieuwenhuizen 42d159f276 radv: Add multiple planes to images.
No functional changes. This temporarily uses plane 0 for
everything.

Long term plan is that only single plane images get to use
metadata like htile/dcc/cmask/fmask.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-04-25 19:56:20 +00:00
Samuel Pitoiset 62a9d757e6 radv: do not always initialize HTILE in compressed state
Especially when performing a transtion from UNDEFINED->GENERAL,
the driver shouldn't initialize HTILE metadata in compressed
state because it doesn't decompress when the src layout is
GENERAL.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110259
Fixes: 3a2e93147f ("radv: always initialize HTILE when the src layout is UNDEFINED")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-29 08:28:18 +01:00
Samuel Pitoiset 6596eb2b30 radv: skip updating depth/color metadata for conditional rendering
I don't think we should update metadata when conditional rendering
is enabled. For some reasons, some CTS breaks only on SI.

This fixes the following CTS on SI:
dEQP-VK.conditional_rendering.draw_clear.clear.depth.*

Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-28 17:37:12 +01:00
Samuel Pitoiset 4fa61273a8 radv: fix binding transform feedback buffers
The mask should be accumulated if two calls are used for
binding two buffers at different indexes. Otherwise, the
driver only accounts for the last one.

Noticed while glancing at this code.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-20 09:06:40 +01:00
Samuel Pitoiset 3a2e93147f radv: always initialize HTILE when the src layout is UNDEFINED
HTILE should always be initialized when transitioning from
VK_IMAGE_LAYOUT_UNDEFINED to other image layouts. Otherwise,
if an app does a transition from UNDEFINED to GENERAL, the
driver doesn't initialize HTILE and it tries to decompress
the depth surface. For some reasons, this results in VM faults.

Cc: mesa-stable@lists.freedesktop.org
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107563
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-14 17:22:23 +01:00
Samuel Pitoiset a66b186beb radv: use typed buffer loads for vertex input fetches
This drastically reduces the number of SGPRs because the driver
now uses descriptors per vertex binding, instead of per vertex
attribute format.

29077 shaders in 15096 tests
Totals:
SGPRS: 1354285 -> 1282109 (-5.33 %)
VGPRS: 909896 -> 908800 (-0.12 %)
Spilled SGPRs: 24840 -> 24811 (-0.12 %)
Code Size: 49221144 -> 48986628 (-0.48 %) bytes
Max Waves: 243930 -> 244229 (0.12 %)

Totals from affected shaders:
SGPRS: 390648 -> 318472 (-18.48 %)
VGPRS: 288432 -> 287336 (-0.38 %)
Spilled SGPRs: 94 -> 65 (-30.85 %)
Code Size: 11548412 -> 11313896 (-2.03 %) bytes
Max Waves: 86460 -> 86759 (0.35 %)

This gives a really tiny boost.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-13 13:31:11 +01:00
Samuel Pitoiset e72daf3e70 Revert "radv: execute external subpass barriers after ending subpasses"
This changes is actually wrong because we have to sync
before doing image layout transitions.

This fixes rendering issues in Batman, Path of Exile and
probably more titles.

This reverts commit 76c17cfd8d.

Fixes: 76c17cfd8d ("radv: execute external subpass barriers after ending subpasses")
Cc: 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-08 14:59:26 +01:00
Samuel Pitoiset c2a148692b radv: properly align the fence and EOP bug VA on GFX9
If alignement is 0, offets returned by
radv_cmd_buffer_upload_alloc() are always 0. These two
virtual addresses were pointing at the same location.

Cc: 18.3 19.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-05 15:00:20 +01:00
Samuel Pitoiset 2eb0905ffa radv: allocate enough space in cmdbuf when starting a subpass
This fixes some CTS crashes with:
dEQP-VK.renderpass2.suballocation.attachment_write_mask.attachment_count_8.start_index_*

Ideally, we should check cmd_buffer->cs->max_dw because there is
likely enough space (the internal clear draws allocate space), but
keep that way for consistency.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-03-05 15:00:10 +01:00
Samuel Pitoiset 1b8983c25b radv: fix using LOAD_CONTEXT_REG with old GFX ME firmwares on GFX8
This fixes a critical issue.

Cc: <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109575
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:39:30 +01:00
Samuel Pitoiset bd1186572f radv: add support for push constants inlining when possible
This removes some scalar loads from shaders, but it increases
the number of SET_SH_REG packets. This is currently basic but
it could be improved if needed. Inlining dynamic offsets might
also help.

Original idea from Dave Airlie.

29077 shaders in 15096 tests
Totals:
SGPRS: 1321325 -> 1357101 (2.71 %)
VGPRS: 936000 -> 932576 (-0.37 %)
Spilled SGPRs: 24804 -> 24791 (-0.05 %)
Code Size: 49827960 -> 49642232 (-0.37 %) bytes
Max Waves: 242007 -> 242700 (0.29 %)

Totals from affected shaders:
SGPRS: 290989 -> 326765 (12.29 %)
VGPRS: 244680 -> 241256 (-1.40 %)
Spilled SGPRs: 1442 -> 1429 (-0.90 %)
Code Size: 8126688 -> 7940960 (-2.29 %) bytes
Max Waves: 80952 -> 81645 (0.86 %)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-12 17:25:54 +01:00
Samuel Pitoiset 0d0affad3c radv: don't flush src stages when dstStageMask == BOTTOM_OF_PIPE
Original patch by Fredrik Höglund.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset 76c17cfd8d radv: execute external subpass barriers after ending subpasses
Outgoing dependencies (ie. external) should happen after the subpass.
This doesn't change anything for subpass resolves as we already
make sure that attachments are shader readable.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:19:14 +01:00
Samuel Pitoiset b509013060 radv: handle final layouts at end of every subpass and render pass
That shouldn't change anything as we check if the last
subpass id is the final subpass.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:18:38 +01:00
Samuel Pitoiset e1a0a268c6 radv: use the new attachments array when starting subpasses
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:57 +01:00
Samuel Pitoiset a20c2e38d8 radv: store the list of attachments for every subpass
This reworks how the depth stencil attachment is used for
simplicity. This also introduces radv_render_pass_compile()
helper that will be used for further optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:54 +01:00
Samuel Pitoiset a7c7d811f1 radv: move subpass image transitions to radv_cmd_buffer_begin_subpass()
Instead of doing them in radv_cmd_buffer_set_subpass().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:52 +01:00
Samuel Pitoiset 291a933786 radv: add radv_cmd_buffer_begin_subpass() helper
To unify some code in BeginRenderPass() and NextSubpass().
Based on Intel ANV driver.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:50 +01:00
Samuel Pitoiset 41199e2eeb radv: remove useless MAYBE_UNUSED in CmdBeginRenderPass()
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:46 +01:00
Samuel Pitoiset 0f932bbede radv: bail out when no image transitions will be performed
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-02-04 13:17:40 +01:00
Bas Nieuwenhuizen ead54d4a42 radv/winsys: Set winsys bo priority on creation.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-29 15:56:41 +01:00
Samuel Pitoiset d1994ed229 radv: remove radv_userdata_info::indirect field
Always false.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-28 16:30:33 +01:00
Samuel Pitoiset 963c044c55 radv: always pass the GFX9 fence data to si_cs_emit_cache_flush()
Remove two useless checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:14 +01:00
Samuel Pitoiset 5f0b17d581 radv: compute the GFX9 fence VA at allocation time
Instead of doing every time we emit cache flushes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:12 +01:00
Samuel Pitoiset e7ac792400 radv: only allocate the GFX9 fence and EOP BOs for the gfx queue
It's invalid to emit a ZPASS_DONE event on the compute queue, and
the fence BO is unused on the compute queue (ie. we don't flush
CB or DB caches).
This saves some space in the upload BO.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:09 +01:00
Samuel Pitoiset bd098884f1 radv: remove old_fence parameter from si_cs_emit_write_event_eop()
This parameter is actually useless as the immediate value
can always be zero.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-23 11:31:07 +01:00
Marek Olšák e402961e1d radeonsi: correct WRITE_DATA.DST_SEL definitions
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-22 12:14:26 -05:00
Rhys Perry f0ba826054 radv: prevent dirtying of dynamic state when it does not change
DXVK often sets dynamic state without actually changing it.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry e4c6423c5e radv: avoid context rolls when binding graphics pipelines
It's common in some applications to bind a new graphics pipeline without
ending up changing any context registers.

This has a pipline have two command buffers: one for setting context
registers and one for everything else. The context register command buffer
is only emitted if it differs from the previous pipeline's.

v2: ensure late scissor emission is done when radv_emit_rbplus_state() is
    called
v2: make use of cmd_buffer->state.workaround_scissor_bug
v3: rename "workaround_scissor_bug" to
    "context_roll_without_scissor_emitted"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry 5564a797f2 radv: add missed situations for scissor bug workaround
v2: rename "workaround_scissor_bug" to
    "context_roll_without_scissor_emitted"

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Rhys Perry 5d1a29071a radv: pass radv_draw_info to radv_emit_draw_registers()
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-21 14:37:53 +00:00
Samuel Pitoiset b8c4f523b4 radv: skip draws with instance_count == 0
Loosely based on RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-01-09 14:22:38 +01:00
Samuel Pitoiset d58b11e709 radv: get rid of bunch of KHR suffixes
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-09 12:26:48 +01:00
Bas Nieuwenhuizen 64c83efaee radv: Remove unused variable.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-01-07 23:15:33 +01:00
Bas Nieuwenhuizen 8c93ef5de9 radv: Do a cache flush if needed before reading predicates.
This caused random failures for two conditional rendering tests:

dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_discard
dEQP-VK.conditional_rendering.draw_clear.draw.update_with_rendering_no_discard

These wrote the predicate with the vertex shader, did a barrier and then
started the conditional rendering. However the cache flushes for the barrier
only happen on first draw, so after the predicate has been read.

Fixes: e45ba51ea4 "radv: add support for VK_EXT_conditional_rendering"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-12-31 20:52:08 +01:00
Samuel Pitoiset 6b976024a8 radv: add support for FMASK expand
Original patch by Dave Airlie.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:17 +01:00
Samuel Pitoiset fa16da53d8 radv: initialize FMASK for images in fully expanded mode
The value depends on the number of samples.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 18:01:15 +01:00
Samuel Pitoiset 5c7935f8fc radv: fix subpass image transitions with multiviews
The driver needs to decompress all image layers if a fast
depth/color clear has been performed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-20 13:36:37 +01:00
Dave Airlie b3f2b03ece radv/xfb: fix counter buffer bounds checks.
If we gave this function 0 counter buffers, we'd still try and
access pCounterBuffers[0] as this check was incorrect.

Fixes crash with ext_transform_feedback-pipeline-basic-primgen
on zink on radv.

Fixes: 677b496b6 (radv: fix begin/end transform feedback with 0 counter buffers.)
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-12-13 19:27:05 +00:00
Samuel Pitoiset 3a5adc2879 radv: add a predicate for reflecting DCC decompression state
It's somehow similar to the FCE predicate.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-13 09:21:10 +01:00
Samuel Pitoiset 3fbdcd942f amd: remove support for LLVM 6.0
User are encouraged to switch to LLVM 7.0 released in September 2018.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-06 14:02:56 +01:00
Samuel Pitoiset 824cfc1ee5 radv: rework the TC-compat HTILE hardware bug with COND_EXEC
After investigating on this, it appears that COND_WRITE doesn't
work correctly in some situations. I don't know exactly why does
it fail to update DB_Z_INFO.ZRANGE_PRECISION, but as AMDVLK
also uses COND_EXEC I think there is a reason.

Now the driver stores a new metadata value in order to reflect
the last fast depth clear state. If a TC-compat HTILE is fast cleared
with 0.0f, we have to update ZRANGE_PRECISION to 0 in order to
work around that hardware bug.

This fixes rendering issues with The Forest and DXVK and doesn't
seem to introduce any regressions.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108914
Fixes: 68dead112e ("radv: update the ZRANGE_PRECISION value for the TC-compat bug")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-12-05 09:26:31 +01:00
Nicolai Hähnle 729ebdf07e radv: remove dependency on addrlib gfx9_enum.h
v2:
- use SI_CONTEXT_REG_OFFSET

Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-29 13:18:23 +01:00
Samuel Pitoiset 6d4f65deea radv: remove unused pending_clears param in the transition path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-29 10:18:53 +01:00
Samuel Pitoiset 4b9bc4791b radv: only sync CP DMA for transfer operations or bottom pipe
CP DMA can only be busy when the driver copies buffers. The
only affected Vulkan commands are vkCmdCopyBuffer() and
vkCmdUpdateBuffer() (because we fallback to a copy depending on
a threshold). Clear operations are currently not concerned
because the driver always syncs after the last DMA operation.

Per the spec, these two operations have to be externally
synchronized with VK_PIPELINE_STAGE_TRANSFER_BIT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-21 10:03:01 +01:00
Samuel Pitoiset 483a28bfd4 radv: tidy up radv_set_dcc_need_cmask_elim_pred()
This is just a small cleanup.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-19 14:05:33 +01:00
Samuel Pitoiset 90d68858ed radv: set optimal OVERWRITE_COMBINER_WATERMARK on GFX9
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-13 10:24:36 +01:00
Samuel Pitoiset b5f213bb1d radv: binding streamout buffers doesn't change context regs
Cc: 18.3 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-11-13 10:24:31 +01:00
Samuel Pitoiset c472ad82e4 radv: fix GPU hangs when loading depth/stencil clear values on SI/CIK
HTILE is supported on these chips, not sure how I missed that.
This restores using PFP_SYNC_ME when LOAD_CONTEXT_REG is not used.

Fixes: f425d9ee74 ("radv: use LOAD_CONTEXT_REG when loading fast clear values")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-08 11:20:03 +01:00
Samuel Pitoiset f425d9ee74 radv: use LOAD_CONTEXT_REG when loading fast clear values
This avoids syncing the Micro Engine. This is only supported
for VI+ currently. There is probably a way for using
LOAD_CONTEXT_REG on previous chips but that could be done later.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-08 10:41:45 +01:00
Samuel Pitoiset c571ca7a08 radv: replace si_emit_wait_fence() with radv_cp_wait_mem()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-11-05 09:48:50 +01:00
Dave Airlie 677b496b6b radv: fix begin/end transform feedback with 0 counter buffers.
If the user gives 0 counterBuffers then the driver should still
enable transform feedback on all targets. This changes the
driver to always enable xfb, and use counter buffers where
one is defined for the target in question.

Fixes: b4eb029062 (radv: implement VK_EXT_transform_feedback)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-02 04:15:07 +00:00
Dave Airlie 7f37a52a21 radv: apply xfb buffer offset at buffer binding time not later. (v2)
In order to handle pause/resume properly, the offset should
be added to the buffer binding not to the begin/end paths.

v2: don't add offset to size
Fixes ext_transform_feedback-alignment* under zink

Fixes: b4eb029062 (radv: implement VK_EXT_transform_feedback)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-11-02 04:13:31 +00:00
Samuel Pitoiset e60ab66e33 radv: rename some parameters in Cmd{Begin,End}TransformFeedbackEXT()
To match latest spec.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-31 09:21:28 +01:00
Samuel Pitoiset b4eb029062 radv: implement VK_EXT_transform_feedback
This implementation should work and potential bugs can be
fixed during the release candidates window anyway.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-10-29 17:10:58 +01:00
Bas Nieuwenhuizen d41c3cc013 radv: Emit enqueued pipeline barriers on event write.
Since the CPU can read them we need to execute any GPU->CPU
flushes before the event is written.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108524
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-10-25 16:17:54 +02:00
Marek Olšák fa023f293e ac: correct PKT3_COPY_DATA definitions 2018-10-06 21:50:09 -04:00
Samuel Pitoiset 6521d4a659 Revert "radv: Optimize rebinding the same descriptor set."
This introduces random GPU hangs on Vega, at least.

This reverts commit 02a43edf18.
2018-09-17 11:20:57 +02:00
Bas Nieuwenhuizen 02a43edf18 radv: Optimize rebinding the same descriptor set.
This makes it cheaper to just change the dynamic offsets with
the same descriptor sets.

Suggested-by: Philip Rebohle <philip.rebohle@tu-dortmund.de>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-16 12:50:19 +02:00
Samuel Pitoiset c79aad30ae radv: emit the initial config only once in the preambles
It shouldn't be needed to emit the initial graphics or compute
state when beginning a new command buffer. Emitting them in
the preamble should be enough and this will reduce IB sizes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset 748f4cce18 radv: fix flushing indirect descriptors
Let say, we first bind a graphics pipeline that needs indirect
descriptors sets. The userdata pointers will be emitted at draw
time. Then if we bind a compute pipeline that doesn't need any
indirect descriptors, the driver will re-emit them for all
grpahics stages.

To avoid this to happen, just check the bind point type.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset 063264db5b radv: fix GPU hangs with 32-bit indirect descriptors
LLVM 6 isn't affected.

Fixes GPU hangs with new CTS:
dEQP-VK.binding_model.descriptorset_random.*

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset abdf396cbe radv: fix VK_EXT_conditional_rendering visibility
It's actually just the opposite.

This fixes the new Sascha conditionalrender demo.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-09-14 10:59:52 +02:00
Samuel Pitoiset d4bf954fe6 radv: fix function names for VK_EXT_conditional_rendering
Otherwise they are not exported.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-09-13 16:03:18 +02:00
Samuel Pitoiset 87fbc16e34 radv/gfx9: implement coherent shaders for VK_ACCESS_SHADER_READ_BIT
Single-sample color and single-sample depth (not stencil)
are coherent with shaders.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl
2018-08-23 15:42:56 +02:00
Samuel Pitoiset f9e8456c39 radv: initialize the DCC predicate correctly when it's compressed
We have to do a fast-clear eliminate when clearing DCC
metadata with 0x20202020. I don't know if that fixes anything
but that seems correct to me.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-16 14:11:51 +02:00
Samuel Pitoiset f3a78a9da0 radv: fix missing initialization of the conditional rendering state
This was missing when VK_EXT_conditional_rendering has been
implemented. The predication type should be -1 to avoid
restoring previous state when performing a decompression pass
with DCC enabled.

Note that we don't have to handle secondary command buffers
because we don't support this feature currently.

CC: 18.2 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-08-16 14:11:48 +02:00
Dave Airlie 5040319331 radv: fix cdw check vs tracing emit
If we have tracing enabled we could do all the tracing emits
and overflow the precalculated cdw_max.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-28 06:20:27 +10:00
Samuel Pitoiset df679b1643 radv: allocate enough space in radv_cmd_buffer_after_draw()
The driver might emit up to 4 dwords when RADV_TRACE_FILE is
used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-27 14:31:29 +02:00
Samuel Pitoiset c08ae911d9 radv: check CS space in radv_emit_write_data_packet()
This wasn't wrong but it looks better to me like this. It's
only used for debugging purposes (ie. RADV_TRACE_FILE).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-27 14:31:27 +02:00
Samuel Pitoiset c118c8938c radv: reduce CB/DB meta flushes in radv_dst_access_flush()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-27 14:31:24 +02:00
Samuel Pitoiset ce454d02cc radv: simplify a condition in radv_src_access_flush()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-20 10:17:17 +02:00
Samuel Pitoiset 0a8127bbfb radv: make use of radv_subpass_barrier() when resolving subpasses
The goal is to use radv_barrier()/radv_subpass_barrier() as
much as possible for further optimizations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-20 10:17:11 +02:00
Bas Nieuwenhuizen 978570769d radv: Always set disable zpass increment bit when possible.
When no occlusion queries are active even if out of order is enabled.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-19 02:38:10 +02:00
Bas Nieuwenhuizen c0144e915a radv: Disable disabled color buffers in rbplus opts.
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-07-19 02:37:47 +02:00
Samuel Pitoiset e45ba51ea4 radv: add support for VK_EXT_conditional_rendering
Inherited commands buffers are not supported.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-18 13:44:09 +02:00
Samuel Pitoiset 4d99caf590 radv: set the predicate for indirect/indexed draw commands
VK_EXT_conditional_rendering allows to discard draw commands
(not only normal draws).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-18 13:44:04 +02:00
Samuel Pitoiset 1e83f65673 radv: set the predicate for dispatch commands
VK_EXT_conditional_rendering allows to discard dispatch commands.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-18 13:44:01 +02:00
Samuel Pitoiset d9526384bd radv: optimize radv_stage_flush() for pre fragment shader stages
We don't need to emit PS_PARTIAL_FLUSH for the pre fragment shader
stages (ie. geometry/tessellation). Emitting VS_PARTIAL_FLUSH
is enough for these stages. Note that PS_PARTIAL_FLUSH also
synchronizes all vertex stages.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-18 10:09:05 +02:00
Samuel Pitoiset 88e56804a7 radv: reduce number of CB/DB meta flushes for VK_ACCESS_TRANSFER_WRITE_BIT
If we know that the given image doesn't have any metadata,
we don't need to flush.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-07-17 09:34:20 +02:00
Samuel Pitoiset f1b3f7bfac radv: simplify the logic in radv_set_descriptor_set()
Now that 'set' can't be NULL because the meta operations no
longer bind a NULL descriptor, the logic can be simplified
a little bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 11:08:49 +02:00
Samuel Pitoiset 826b3a8773 radv: remove one useless check in radv_bind_descriptor_set()
'set' shouldn't be NULL.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-07-12 11:08:47 +02:00