Commit Graph

1226 Commits

Author SHA1 Message Date
Samuel Pitoiset 3595a11648 radv: create pipeline layout objects for all meta operations
They are dummy objects but the spec requires layout to not be
NULL, this just makes sure we are creating valid pipeline layout
objects. This will allow us to remove some useless checks.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-19 21:22:06 +01:00
Bas Nieuwenhuizen 8b5fe4b2b4 radv: Use a sort for rebuilding the sparse buffer bo list.
It uses slightly more memory (though still bounded by the number
of mapped ranges), but gives less quadratic behavior.

Cuts 4 minutes from the runtime of the CTS *.sparse.* tests.

Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-12-19 21:12:48 +01:00
Bas Nieuwenhuizen 42bc25a79c radv: Advertise sync fd import and export.
Passes dEQP-VK.*.sync_fd.*

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 22:13:31 +01:00
Bas Nieuwenhuizen 52b3f50df8 radv: Implement sync file import/export for fences & semaphores.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 22:13:31 +01:00
Bas Nieuwenhuizen b98bbdf490 radv/amdgpu: wrap sync fd import/export.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 22:13:31 +01:00
Samuel Pitoiset 8d00e63ca8 radv: remove useless radv_cmask_info::base_address_reg
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-18 11:51:11 +01:00
Samuel Pitoiset 79b34d0832 amd/common: add ac_vgt_gs_mode() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-18 11:50:50 +01:00
Samuel Pitoiset 55f8431c76 amd/common: add ac_get_cb_shader_mask() helper
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-18 11:50:48 +01:00
Samuel Pitoiset bb01661918 Revert "radv: do not load unused gl_LocalInvocationID/gl_WorkGroupID components"
This reverts commit 2294d35b24.

We can't do this without adjusting the input SGPRs/VGPRs logic.
For now, just revert it. I will send a proper solution later.

It fixes a rendering issue in F1 2017 that CTS didn't catch up.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-18 11:50:02 +01:00
Dave Airlie 1bdeac545f radv: port merge tess info from anv
anv merges the tess info correctly, but radv wasn't doing this.

This fixes hangs in
dEQP-VK.tessellation.winding.default_domain.hlsl_triangles_ccw

Fixes: 60fc0544e0 (radv/pipeline: handle tessellation shader compilation)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-12-18 18:36:49 +10:00
Bas Nieuwenhuizen d27aaae4d2 radv: Add external fence support.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:31:21 +01:00
Bas Nieuwenhuizen 6abfa37879 radv: Implement VK_KHR_external_fence_fd.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:31:17 +01:00
Bas Nieuwenhuizen 969421b7da radv: Implement fences based on syncobjs.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:31:12 +01:00
Bas Nieuwenhuizen 1c3cda7d27 radv: Add syncobj signal/reset/wait to winsys.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 09:31:02 +01:00
Bas Nieuwenhuizen b42e106d4d radv: Fix multi-layer blits.
We did not set the layer correctly for the dst, as we would keep
using the base layer. Same for the source image.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102710
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-18 01:27:49 +01:00
Samuel Pitoiset 88522e2bcd radv: export SampleMask from pixel shaders at full rate
Use 16_ABGR instead of 32_ABGR if Z isn't written.

Ported from RadeonSI.

No CTS regressions on Polaris.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:23:28 +01:00
Samuel Pitoiset 90c3bf0789 radv: do not load the local invocation index when it's unused
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:22:26 +01:00
Samuel Pitoiset 2294d35b24 radv: do not load unused gl_LocalInvocationID/gl_WorkGroupID components
We should also not load the input SGPRs and VGPRS, but
let's start with this for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:22:06 +01:00
Samuel Pitoiset 5a761167f5 radv: set FORCE_SIMD_DIST(1) for compute when profitable
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:20:59 +01:00
Samuel Pitoiset 75b1c4997f radv: calculate best compute resource limits
Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:20:57 +01:00
Samuel Pitoiset 9fdc1437ba radv: store the dispatch initiator into the device
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:20:55 +01:00
Samuel Pitoiset 97e57740d8 radv: always emit all compute block components
The number of grid components is always 3 when gl_NumWorkGroups
is declared, because it relies on the number of components of
nir_instrinsic_load_num_work_groups.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-14 22:19:39 +01:00
Bas Nieuwenhuizen 4eb0dca46b radv: Don't advertise VK_EXT_debug_report.
We never supported it. Missed during copy and pasting.

Fixes: 17201a2eb0 "radv: port to using updated anv entrypoint/extension generator."
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-12-14 10:05:22 +01:00
Bas Nieuwenhuizen 6469669beb radv: Don't use local BOs when allocating with export options.
If the app does not plan to put a buffer or image in it
(why? But it is allowed and CTS does it), they do not need to
allocate it with the deciate allocation struct.

Fixes: a639d40f13 "radv: add support for local bos. (v3)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-10 23:47:23 +01:00
Samuel Pitoiset 572b2bad1d radv: do not print ASM to stderr when dumping shaders
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:24:24 +01:00
Samuel Pitoiset 33b329f769 radv/winsys: implement query_value()
Might be useful to know the VRAM/GTT usage, the number of VRAM
CPU page faults, etc. Nothing is currently using that new
interface, but it's a first step.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:35 +01:00
Samuel Pitoiset c202119286 radv: remove useless check radv_set_dcc_need_cmask_elim_pred()
emit_fast_color_clear() already checks that.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:03 +01:00
Samuel Pitoiset d90b7a4c50 radv: remove useless checks in radv_set_{color,depth}_clear_regs()
Already checked by the respective callers.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:22:00 +01:00
Samuel Pitoiset c7c7b00889 radv: only re-mit the index type when it changes
dota2 binds a ton of index buffers but the type is always 16-bit.
Note that we have to invalidate the type when switching from
indexed draws to normal draws.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:36 +01:00
Samuel Pitoiset a302009b7b radv: only reset command buffers that are not in the initial state
dota2 always calls vkResetCommandBuffer() before
vkBeginCommandBuffer() which is quite useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:23 +01:00
Samuel Pitoiset a380bc7ecf radv: track different status of a command buffer
RADV_CMD_BUFFER_STATUS_INVALID is not used for now, but I think
it makes sense to declare it. Could be used later with better
command buffer error handling.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:21:21 +01:00
Samuel Pitoiset fc6c77e162 radv: fix TC-compat HTILE with VK_FORMAT_D32_SFLOAT_S8_UINT on Vega
Copied from RadeonSI.

This fixes all CTS
dEQP-VK.renderpass.dedicated_allocation.formats.d32_sfloat_s8_uint.clear.*

And some other ones which use the same format.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-12-08 11:15:44 +01:00
Alex Smith 8fda98c4f1 radv: Add LLVM version to the device name string
Allows apps to determine the LLVM version so that they can decide
whether or not to enable workarounds for LLVM issues.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-12-07 08:58:34 +00:00
Fredrik Höglund b055045378 radv: fix a case statement in GetMemoryFdPropertiesKHR
The handle type in the case statement is supposed to be VK_EXTERNAL_-
MEMORY_HANDLE_TYPE_DMA_BUF_BIT_EXT.

Fixes: 546e747867 ("radv: Implement VK_EXT_external_memory_dma_buf")
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-12-06 20:04:39 +01:00
Samuel Pitoiset 5de7c782fb radv: fix a crash in radv_can_dump_shader()
module can be NULL, oops.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-04 19:52:14 +01:00
Jason Ekstrand 1e565bc6ce radv: Implement VK_KHR_get_surface_capabilities2
The WSI core code does all the hard work.  Just add the wrappers and
turn it on.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand 0a10e3770f vulkan/wsi: Initialize individual WSI interfaces in wsi_device_init
Now that we have anv_device_init/finish functions, there's no reason to
have the individual driver do any more work than that.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand 2e3e55110b vulkan/wsi: Drop some unneeded cruft from the API
This drops the unneeded callbacks struct as well as the queue_get_family
callback we were using before we'd pulled QueuePresent inside.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand c1b1be5196 vulkan/wsi: Add wrappers for all of the surface queries
This lets us move wsi_interface to wsi_common_private.h

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand 82931dc007 vulkan/wsi: Drop the can_handle_different_gpu parameter from get_support
Both anv and radv can handle prime now.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand 516dfb34e1 vulkan/wsi: Add a helper for AcquireNextImage
Unfortunately, due to the fact that AcquireNextImage does not take a
queue, the ANV trick for triggering the fence won't work in general.  We
leave dealing with the fence up to the caller for now.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Dave Airlie 8ff49951c3 vulkan/wsi: move swapchain create/destroy to common code
v2 (Jason Ekstrand):
 - Rebase
 - Alter the names of the helpers to better match the vulkan entrypoints
 - Use the helpers in anv

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand 393aa3f6c9 vulkan/wsi: Move get_images into common code
This moves bits out of all four corners (anv, radv, x11, wayland) and
into the wsi common code.  We also switch to using an outarray to ensure
we get our return code right.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Dave Airlie 6dc3a5e8f0 radv/wsi: Move the guts of QueuePresent to wsi common
v2 (Jason Ekstrand):
 - Better comit message
 - Rebase
 - Re-indent to follow wsi_common style
 - Drop the unneeded _swapchain from the newly added helper
 - Make the clone more true to the original (as per the rebase)

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Dave Airlie 69365d72de radv/wsi: drop allocate memory special case
Just check if image has scanout flag set

v2 (Jason Ekstrand):
 - Rebase
 - Also drop the now unused radv_mem_flag_bits enum

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand e12688f365 vulkan/wsi: Do image creation in common code
This uses the mock extension created in a previous commit to tell the
driver that the image it's just been asked to create is, in fact, a
window system image with whatever assumptions that implies.  There was a
lot of redundant code between the two drivers to do basically exactly
the same thing.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand d50937f137 vulkan/wsi: Implement prime in a completely generic way
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand df4fc68492 radv: Move wsi initialization later in physical_device_init
We need it to happen after memory type setup so that we can query memory
types in wsi_device_init.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand a50f93ecfb radv/image: Implement the wsi "extension"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand 546e747867 radv: Implement VK_EXT_external_memory_dma_buf
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Jason Ekstrand 764fc1643c vulkan/wsi: Add a wsi_device_init function
This gives the opportunity to collect some function pointers if we'd
like which will be very useful in future.

Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Daniel Stone c1163f7b1c vulkan/wsi: Add a wsi_image structure
This is used to hold information about the allocated image, rather than
an ever-growing function argument list.

v2 (Jason Ekstrand):
 - Rename wsi_image_base to wsi_image

Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-12-04 10:04:19 -08:00
Dave Airlie 2cbeb32555 vulkan/wsi: use function ptr definitions from the spec.
This just seems cleaner, and we may expand this in future.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-12-04 10:04:19 -08:00
Timothy Arceri f13790c92f radv: enable nir varying array splitting
Acked-by: Dave Airlie <airlied@redhat.com>
2017-12-04 12:52:18 +11:00
Timothy Arceri aecb9bec87 radv: enable nir component packing
SaschaWillems Vulkan demo tessellation:

~4000fps -> ~4600fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-04 09:10:30 +11:00
Jason Ekstrand e19c623128 spirv: Convert the supported_extensions struct to spirv_options
This is a bit more general and lets us pass additional options into the
spirv_to_nir pass beyond what capabilities we support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2017-12-02 08:09:11 -08:00
Samuel Pitoiset 80e6e71b82 radv: only reset command buffers when the allocation fails
"vkAllocateCommandBuffers can be used to create multiple command
    buffers. If the creation of any of those command buffers fails, the
    implementation must destroy all successfully created command buffer
    objects from this command, set all entries of the pCommandBuffers
    array to NULL and return the error."

This has been suggested by gabriel@system.is.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-01 11:38:34 +01:00
Samuel Pitoiset 921986b580 radv: do not dump meta shaders with RADV_DEBUG=shaders
It's really annoying and this pollutes the output especially
when a bunch of non-meta shaders are compiled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-12-01 11:38:26 +01:00
Samuel Pitoiset ff0f17da14 radv: do not allocate CMASK or DCC for small surfaces
The idea is ported from RadeonSI, but using 512x512 instead of
256x256 seems slightly better. This improves dota2 performance
by +2%.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2017-11-30 21:38:30 +01:00
Samuel Pitoiset f5955c6bf8 radv: do not set DISABLE_LSB_CEIL on GFX9
The state no longer exists on GFX9.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-30 21:38:01 +01:00
Samuel Pitoiset 319f56e675 radv: remove set but unnecessary radv_color_buffer_info::micro_tile_mode
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-30 21:38:00 +01:00
Samuel Pitoiset 4eab78b03c radv: do not store gfx9_epitch in radv_color_buffer_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-30 21:37:58 +01:00
Marek Olšák e3c0a5b6e8 ac/surface: enable DCC computation for MSAA
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-29 18:21:30 +01:00
Nicolai Hähnle 97f42d11df amd/common: sid.h cleanups
Fix a bunch of labels indicating when registers were added/removed
and normalize the SI-class GRBM_GFX_INDEX.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-28 09:34:43 +01:00
Jason Ekstrand 049b84246e radv: Use the suffixed versions of VK_QUEUE_GLOBAL_PRIORITY_*
Acked-by: Dave Airlie <airlied@redhat.com>
2017-11-27 21:42:06 -08:00
Marek Olšák ec15ff78c3 ac: change legacy_surf_level::slice_size to dword units
The next commit will reduce the size even more.

v2: typecast to uint64_t manually
v3: add more typecasts, add asserts

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-27 14:44:04 +01:00
Samuel Pitoiset 1cc00b8e0e Revert "radv: remove unnecessary memset() in radv_AllocateCommandBuffers()"
This fixes two CTS regressions:
- dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_primary
- dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_secondary

These two tests are part the mustpass lists, so presumably they
are correct and my change was wrong.

This reverts commit 0f68208f1d.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-24 12:26:35 +01:00
Samuel Pitoiset dc391a406a radv/winsys: improve error messages when the buffer list creation failed
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-24 11:18:43 +01:00
Samuel Pitoiset 15c0df785b radv/winsys: do not try to create a BO list with 0 buffers
This happens when all BOs have the RADEON_FLAG_NO_INTERPROCESS_SHARING
(DRM version >= 3.23) flag set. This flag is mainly used for reducing
overhead on the userspace side because we don't have to put those BOs
inside the list.

Though, if the driver tries to create a list with 0 buffers inside it,
libdrm returns -EINVAL and the app just crashes.

This fixes a bunch of CTS dEQP-VK.sparse_resources.* fails (~100).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-24 11:18:38 +01:00
Samuel Pitoiset 3a32858fc3 radv: use a 16 bytes array for the sampled/storage image descriptors
This allows to update them with only one memcpy().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-20 11:18:22 +01:00
Samuel Pitoiset bc92ed04ac radv: do not add the query pool BO to the list in vkCmdEndQuery()
As per the spec, the query identified by queryPool and query
must currently be active. Applications have to call vkCmdBeginQuery()
before, and thus the query pool BO will already be in the list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-20 11:18:20 +01:00
Samuel Pitoiset cf54ea155e radv: only load needed depth clear regs for fast depth clears
Similar to how the driver sets the depth clear regs after a
fast depth clear. Most of the time, this will copy a 32-bit reg
instead of a 64-bit reg.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-20 10:45:27 +01:00
Samuel Pitoiset e55b7609fa radv: do not add the image BO in radv_set_depth_clear_regs()
For the fast path, radv_fill_buffer() ensures that the BO is
already in the list. For the slow path, the depth surface is
part of the framebuffer which means the BO is added to the list
when the framebuffer is emitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-20 10:45:23 +01:00
Samuel Pitoiset 3c6bba83f0 radv: remove useless assertion in emit_depthstencil_clear()
Already checked in emit_clear().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-20 10:45:21 +01:00
Samuel Pitoiset 403a3d8061 radv: remove useless check in radv_set_depth_clear_regs()
aspects can't be zero and there is an assertion that ensures
it's not in emit_clear().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-20 10:45:19 +01:00
Dave Airlie 00bf875d55 radv: it isn't an error to not support a format or driver
This reverts two of the vk_error changes:

reporting unsupported format is common,
and testing non-amdgpu drivers and ignoring them is also common.

Fixes: cd64a4f70 (radv: use vk_error() everywhere an error is returned)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-16 06:12:42 +10:00
Samuel Pitoiset 059d25a06d radv: add the vertex buffers BO to the list at bind time
This should reduce the overhead of adding a BO to the current
list, especially when the list is huge. Also, when a new pipeline
is bound, we only need to update the descriptor, the buffer objects
should already be in the list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:01:07 +01:00
Samuel Pitoiset c665879455 radv: replace vb_dirty with RADV_CMD_DIRTY_VERTEX_BUFFER
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:01:05 +01:00
Samuel Pitoiset 8fd213277f radv: drop radv_cmd_dirty_mask_t typedef
I don't think we will need a 64-bit unsigned integer for the
dirty flags in the future, and there is still 20 bits left.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:01:01 +01:00
Samuel Pitoiset f697365058 radv: use an unsigned 32-bit integer for radv_queue::family_index
VkDeviceQueueCreateInfo::queueFamilyIndex is an unsigned 32-bit
integer.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:00:59 +01:00
Samuel Pitoiset f9e1ff2464 radv: do not add the image BO in radv_set_dcc_need_cmask_elim_pred()
radv_fill_buffer() ensures that the image BO is added to the list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:00:57 +01:00
Samuel Pitoiset 40290c805f radv: do not add the image BO in radv_set_color_clear_regs()
radv_fill_buffer() ensures that the image BO is added to the list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-15 09:00:54 +01:00
Samuel Pitoiset 8a7d4092d2 radv: force enable LLVM sisched for The Talos Principle
It seems safe and it improves performance by +4% (73->76).

A drirc based solution is not what we want for now, keep it
simple and improve later if it's really needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-14 15:21:50 +01:00
Samuel Pitoiset ecabe2280c radv: add nosisched debug option
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-14 15:21:48 +01:00
Bas Nieuwenhuizen 7c25578863 radv: Free temporary syncobj after waiting on it.
Otherwise we leak it.

Fixes: eaa56eab6d "radv: initial support for shared semaphores (v2)"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-11-14 10:03:02 +01:00
Bas Nieuwenhuizen 917d3b43f2 radv: Free syncobj with multiple imports.
Otherwise we can leak the old syncobj.

Fixes: eaa56eab6d "radv: initial support for shared semaphores (v2)"
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-11-14 10:03:02 +01:00
Samuel Pitoiset 934b77f2fe radv: add unlikely() around radv_save_descriptors()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:40 +01:00
Samuel Pitoiset 305745457c radv: optimize calling radv_cmd_buffer_trace_emit()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:38 +01:00
Samuel Pitoiset 957d42271b radv: optimize calling radv_save_pipeline()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:36 +01:00
Samuel Pitoiset ebab5c8ff4 radv: use vk_zalloc instead of vk_alloc+memset
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:35 +01:00
Samuel Pitoiset 0f68208f1d radv: remove unnecessary memset() in radv_AllocateCommandBuffers()
This should not be needed, if the allocation fails an error is
returned and the host should handle it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:32 +01:00
Samuel Pitoiset 66da4c75bc radv: remove useless initializations in radv_create_cmd_buffer()
There is a memset() above.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:30 +01:00
Samuel Pitoiset 3d95fde661 radv: remove useless memset() in radv_CreateFence()
All radv_fence fields are initialized here.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:28 +01:00
Samuel Pitoiset cd64a4f705 radv: use vk_error() everywhere an error is returned
For consistency and it might help for debugging purposes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:05:26 +01:00
Samuel Pitoiset 4e16c6a41e radv: make radv_emit_framebuffer_state() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:04:25 +01:00
Samuel Pitoiset be01197d8d radv: do not emit the framebuffer when restoring a pass
Instead just dirty RADV_CMD_DIRTY_FRAMEBUFFER and it will be
re-emitted if necessary before the next draw.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:04:22 +01:00
Samuel Pitoiset f87c58dde3 radv: prefetch VBO descriptors at the right place
Just after the vertex shader.

This seems to give a minor boost for, at least, Serious Sam
Fusion 2017 and Dawn of War 3. I don't see any real impacts
with The Talos Principle.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:03:16 +01:00
Samuel Pitoiset 9444a34f4a radv: add radv_emit_prefetch_TC_L2_async() helper
Will be used for VBO descriptors prefetching.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:03:13 +01:00
Samuel Pitoiset 36c2e46328 radv: rename radv_emit_shaders_prefetch() to radv_emit_prefetch()
For consistency because this function will also prefetch VBO
descriptors.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-13 11:03:11 +01:00
Dave Airlie bec716e844 radv: emit esgs ring size in one place.
This register is the same on all gpus so far, so emit it in one
place and also for the pre-gfx9 gpus set the value in the pipeline
creation.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-13 07:17:09 +00:00
Dave Airlie 031e591923 radv: move calculating vs out info regs into pipeline.
This moves some calculations of register values into the pipeline
construction, it saves looking at outinfo in the cmd buffer emit.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-13 07:16:53 +00:00
Chad Versace 9ec33972cc radv: Fix architecture in radeon_icd.{arch}.json
Use the host arch, not the target arch. In Meson and in recent
Autotools, the host arch is where the binary will be used. The target
arch is useful only when compiling a compiler.

See: http://mesonbuild.com/Cross-compilation.html
See: https://www.gnu.org/software/automake/manual/html_node/Cross_002dCompilation.html
Reported-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-09 16:29:28 -08:00
Marek Olšák 7f33e94e43 amd/addrlib: update to latest version
This uses C++11 initializer lists.

I just overwrote all Mesa files with internal addrlib and discarded
hunks that we should probably keep, but I might have missed something.

The code depending on ADDR_AM_BUILD is removed. We can add it back next
time if needed.

Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-08 00:55:13 +01:00
Dave Airlie 201b3b8d0d radv: move is_local up to the winsys level.
We can avoid adding the buffer in the non-local case, this will
avoid all the overhead of the indirect call.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:45:59 +00:00
Dave Airlie 25660499b6 radv: wrap cs_add_buffer in an inline. (v2)
The next patch will try and avoid calling the indirect function.

v2: add a missing conversion.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:45:59 +00:00
Dave Airlie 31b5da7958 radv: when loading regs no need to add buffer
The function that calls us has just added the buffer to the
list already, no need to try and add it again.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:44:49 +00:00
Dave Airlie 3bf8be41b8 radv: pre-calculate user_data_0 registers and store in pipeline
There's no point recalculating these the whole time on descriptor
emission, just store them at pipeline creation.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 21:44:49 +00:00
Dave Airlie 4bcb48b831 radv: add initial copy descriptor support. (v2)
It appears the latest dota2 vulkan uses this,
and we get a hang in VR mode without it.

v2: remove finishme I left in after finishing.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 19:12:39 +00:00
Dave Airlie 60a9705e00 radv: move descriptor sets out of cmd_state.
Instead of storing all the pointers and zeroing them all out,
just store a valid bitmask in the state. This also moves
the CmdBindPipeline path down the cpu usage path for the
multithreading demo as it no longer has to traverse MAX_SETS
to find the active descriptor sets.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:11:03 +00:00
Dave Airlie 3a0d098252 radv: add helper for setting a descriptor.
This is just a simple refactor.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:11:00 +00:00
Dave Airlie b48063a2f2 radv: move vertex binding out of cmd state.
This isn't required to be cleared, since buffers are only linked
by vertex elements, so if elements are clear then no buffers
should be referenced.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:10:56 +00:00
Dave Airlie 7365626d78 radv: reorder cmd_state to remove a hole.
This just removes a hole in the cmd_state and packs some bools
together.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:10:53 +00:00
Dave Airlie f0ae06a13c radv: free attachments on end command buffer.
If we allocate attachments in the begin command buffer due to the
render pass continue bit, we were leaking them.

Since renderpasses inside a cmd buffer malloc/free these properly,
and set to NULL, we just need to call free at end.

Fixes a memory leak with multithreading demo.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2 17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-06 01:03:47 +00:00
Bas Nieuwenhuizen 608af05ffb radv: Optimize calling radv_save_descriptors.
uint32_t data[MAX_SETS * 2] = {}; was getting executed before
the exit and took significant amounts of time. By having the
check outside the function, we skip the execution of the clear.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-04 20:18:17 +01:00
Bas Nieuwenhuizen cecbcf4b2d radv: Use an array to store descriptor sets.
The vram_list linked list resulted in lots of pointer chasing.
Replacing this with an array instead improves descriptor set
allocation CPU usage by 3x at least (when also considering the free),
because it had to iterate through 300-400 sets on average.

Not a huge improvement as the pre-improvement CPU usage was only
about 2.3% in the busiest thread.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-11-04 20:18:17 +01:00
Samuel Pitoiset bad31f6a65 radv: use the optimal packets order for dispatch calls
This should reduce the time where compute units are idle, mainly
for meta operations because they use a bunch of compute shaders.

This seems to have a really minor positive effect for Talos, at least.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-02 23:03:59 +01:00
Bas Nieuwenhuizen 806721429a radv: Don't expose heaps with 0 memory.
It confuses CTS. This pregenerates the heap info into the
physical device, so we can use it for translating contiguous
indices into our "standard" ones.

This also makes the WSI a bit smarter in case the first preferred
heap does not exist.

Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
2017-11-02 20:28:19 +01:00
Samuel Pitoiset c39f39106d radv: make radv_bind_descriptor_set() static
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-11-02 09:36:14 +01:00
Dave Airlie 799ef80059 radv: make sure we set buffers as shareable properly.
This should make sure we don't treat exports buffers as local
bos.

Fixes: a639d40f13 (radv: add support for local bos. (v3))

Tested-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Andres Rodriguez <andresx7@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-11-02 01:01:29 +00:00
Samuel Pitoiset 5010436e09 radv: bail out when binding the same vertex buffers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-31 10:16:38 +01:00
Samuel Pitoiset 11fdc2cd34 radv: bail out when binding the same index buffer
DOW3 appears to hit this path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-31 10:16:35 +01:00
Timothy Arceri e92405c55a radv: use correct alloc function when loading from disk
Fixes regression in:

dEQP-VK.api.object_management.alloc_callback_fail.graphics_pipeline

Fixes: 1e84e53712 "radv: add cache items to in memory cache when reading from disk"
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-31 14:51:55 +11:00
Alex Smith 134a40d2a6 radv: Fix -Wformat-security issue
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103513
Fixes: de88979413 ("radv: Implement VK_AMD_shader_info")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-30 10:58:56 +01:00
Timothy Arceri 1e84e53712 radv: add cache items to in memory cache when reading from disk
Otherwise we will leak them, load duplicates from disk rather
than memory and never write items loaded from disk to the apps
pipeline cache.

Fixes: fd24be134f 'radv: make use of on-disk cache'
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-30 17:49:54 +11:00
Alex Smith de88979413 radv: Implement VK_AMD_shader_info
This allows an app to query shader statistics and get a disassembly of
a shader. RenderDoc git has support for it, so this allows you to view
shader disassembly from a capture.

When this extension is enabled on a device (or when tracing), we now
disable pipeline caching, since we don't get the shader debug info when
we retrieve cached shaders.

v2: Improvements to resource usage reporting
v3: Disassembly string must be null terminated (string_buffer's length
    does not include the terminator)
v4: Fixed LDS reporting. (Bas)

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-29 00:28:45 +02:00
Samuel Pitoiset a41e2e9cf5 radv: allow to use a compute shader for resetting the query pool
Serious Sam Fusion 2017 uses a huge number of occlusion queries,
and the allocated query pool buffer is greater than 4096 bytes.

This slightly improves performance (tested in Ultra) from
117.2 FPS to 119.7 FPS (~+2%) on my RX480.

This also improves Talos, from 69 FPS to 72/73 FPS (~+5%).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-27 13:47:06 +02:00
Samuel Pitoiset 0d61109bb7 radv: make radv_fill_buffer() return the needed flush bits
Only needed when the CS path is used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-27 13:47:03 +02:00
Dave Airlie a639d40f13 radv: add support for local bos. (v3)
This uses the new kernel interfaces for reduced cs overhead,
We only set the local flag for memory allocations that don't have
 a dedicated allocation and ones that aren't imports.

v2: add to all the internal buffer creation paths.
v3: missed some command submission paths, handle 0/empty bo lists.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-26 23:59:28 +01:00
Samuel Pitoiset 06a12f250f radv: only copy the dynamic states that changed
When binding a new pipeline, we applied all dynamic states
without checking if they really need to be re-emitted. This
doesn't seem to be useful for the meta operations because only
the viewports/scissors are updated.

This should reduce the number of commands added to the IB
when a new graphics pipeline is bound.

Also, rename radv_dynamic_state_copy() to radv_bind_dynamic_state()
and set the dirty flags directly there.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-26 09:37:05 +02:00
Samuel Pitoiset b1e31c1911 radv: store the dynamic state mask into radv_dynamic_state
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-26 09:37:03 +02:00
Samuel Pitoiset 672cf692fb radv: only emit the depth bounds test values when set dynamically
The depth bounds test values are either set at pipeline
creation or dynamically using vkCmdSetDepthBounds().

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-26 09:37:00 +02:00
Bas Nieuwenhuizen 61a9ef4ab1 radv: Compute ac keys from pipeline key.
The beginning of the end for the shader keys. Not entirely sure
what I'm going to replace them with for the compiler though, so this
is the first step.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-26 00:28:40 +02:00
Bas Nieuwenhuizen 49d035122e radv: Add single pipeline cache key.
To decouple the key used for info gathering and the cache from
whatever we pass to the compiler.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-26 00:28:40 +02:00
Bas Nieuwenhuizen de38491a57 radv: Don't compute as_ls/as_es before hashing.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-26 00:28:40 +02:00
Samuel Pitoiset 9711979df0 radv: print NIR before LLVM IR and disassembly
It's still printed after linking, but it makes more sense to
have SPIRV->NIR->LLVM IR->ASM.

Fixes: f0a2bbd1a4 (radv: move nir print after linking is done)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 11:46:53 +02:00
Bas Nieuwenhuizen 5bfbab2fdc radv: Fix truncation issue hexifying the cache uuid for the disk cache.
Going from binary to hex has a 2x blowup.

Fixes: 1421625292 'radv: create on-disk shader cache'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-25 09:50:05 +02:00
Timothy Arceri 767ca5bdf1 radv: enable lower to scalar nir pass
This will allow dead components of varyings to be removed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-25 17:02:40 +11:00
Dave Airlie d8cefaa197 radv: use device name in cache creation like radeonsi.
Not sure how useful this is, but it makes it more consistent.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.3" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-25 02:26:01 +01:00
Dave Airlie 3cd3035ace radv: use a define for the transition point between cp and compute shader
For certain buffer meta ops we can use the CP or a compute shader,
we should use a define to rather than hardcoding 4096, allows
for easier testing and more consistency.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-25 10:01:13 +10:00
Dave Airlie a5499b639c radv: only emit dfsm packets if dfsm is allowed.
radeonsi only emits these when dfsm is enabled, so for now
just hinge them on a flag we never set.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-24 23:00:57 +01:00
Timothy Arceri f0a2bbd1a4 radv: move nir print after linking is done
We now have linking optimisations so we want to delay dumping the
nir until after these are complete.

Fixes: 06f05040eb (radv: Link shaders)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-24 10:41:38 +11:00
Timothy Arceri 013313cf89 radv: clone meta shaders before linking
The IR is reused in different pipeline combinations so we need
to clone it to avoid link time optimistaions messing up the
original copy.

Fixes: 06f05040eb (radv: Link shaders)

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-24 09:27:40 +11:00
Alex Smith fee9d05e21 radv: Update code pointer correctly if a variant is already created
This was the actual cause of GPU hangs fixed by 0fdd531457 ("radv:
Fix pipeline cache locking issues"), since multiple threads would end
up trying to create the variants for a single entry.

Now that we're locking around the whole of this function, this isn't
really necessary (we either create all or none of the variants), but
fix this anyway in case things change later.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
CC: 17.3 <mesa-stable@lists.freedesktop.org>
2017-10-23 22:36:54 +02:00
Juan A. Suarez Romero 2665d012a8 radv: automake: include radv_extensions.py in the tarball
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2017-10-23 12:37:01 +02:00
Bas Nieuwenhuizen c07d719e8b radv: Disallow indirect outputs for GS on GFX9 as well.
Since it also uses the output vector before writing to memory.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-10-23 00:27:44 +02:00
Dave Airlie da9c3cd3ee radv/ac/nir: only emit tess factors to storage if tes reads them
Otherwise we just need to write them to the tf ring.

this seems to improve the tessellation demo on Bonarie
~2190->~2230 fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-23 07:10:29 +10:00
Bas Nieuwenhuizen 6ce550453f radv: Don't use vgpr indexing for outputs on GFX9.
Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.*
tests.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-22 02:36:37 +02:00
Bas Nieuwenhuizen 67648c0faa radv: Don't compile shaders when they are cached already.
When the gs_copy_shader is NULL (due to an incomplete cache), but
the main shaders are found, we still do the nir, but we shouldn't
compile the shaders again. For merged shaders we should also account
for the missing shaders.

Fixes: ce03c119ce 'radv: Add code to compile merged shaders.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:34 +02:00
Bas Nieuwenhuizen 3bf954b28e radv: Don't check for max GL GS invocations.
We specify 127 instead of 32 as the limit in vulkan.

Fixes: 6bc42855f9 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:09 +02:00
Bas Nieuwenhuizen 050f7e2df2 radv: Don't explicitly reference vertex shader for draw_id.
With merged shaders the vertex shader may not exist. This got in
because the offending patch was written before merged shaders were
upstream, but committed after.

Fixes: 75dfab24a2 'radv: refactor indirect draws with radv_draw_info'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-21 20:00:22 +02:00
Bas Nieuwenhuizen 20fb15bfe4 radv: Don't reset cmd_buffer->state.dirty.
Otherwise for non-indexed draws we set and immediately unset
RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should
clear their own bit, this is unnecessary.

Fixes: 341529dbee 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-21 20:00:16 +02:00
Bas Nieuwenhuizen fb55477990 radv: Correctly detect changed shaders for vertex descriptors.
As they were emitted after the new pipeline, the changed pipeline
detection was not working anymore.

Fixes: 341529dbee 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-21 19:59:44 +02:00
Alex Smith 0fdd531457 radv: Fix pipeline cache locking issues
Need to lock around the whole process of retrieving cached shaders, and
around GetPipelineCacheData.

This fixes GPU hangs observed when creating multiple pipelines in
parallel, which appeared to be due to invalid shader code being pulled
from the cache.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 03:52:43 +02:00
Andres Rodriguez a2c6fbb3ee radv: disable implicit sync for radv allocated bos v3
Implicit sync kicks in when a buffer is used by two different amdgpu
contexts simultaneously. Jobs that use explicit synchronization
mechanisms end up needlessly waiting to be scheduled for long periods
of time in order to achieve serialized execution.

This patch disables implicit synchronization for all radv allocations
except for wsi bos. The only systems that require implicit
synchronization are DRI2/3 and PRIME.

v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC
v3: Add drm version check (Bas)

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:15:54 +02:00
Andres Rodriguez eff2bdbd82 radv: factor out radv_alloc_memory
This allows us to pass extra parameters to the memory allocation
operation that are not defined in the vulkan spec. This is useful for
internal usage.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:15:49 +02:00
Andres Rodriguez 92724338ba radv: Expose VK_EXT_global_priority
Expose the extension string as supported

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez 9f7edf4d1f radv: don't skip PS/VS partial flush
This patch helps lower high priority compute latency. Found by
bisecting a perf regression on computeparticles with high priority
compute queues enabled.

Reverting this micro-optimization doesn't seem to have any negative
effect on performance on Dota2 or ssao.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez fd04f3eb86 radv: Implement VK_EXT_global_priority
This extension allows the caller to change a queue's system wide
priority. This is useful for applications with specific
latency constraints.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez 986c4b0bd4 radv: hardcode shader WAVE_LIMIT to the maximum value
When WAVE_LIMIT is set, a submission will opt-in for SPI based resource
scheduling. Because this mechanism is cooperative, we must ensure that
all submissions have this field set, otherwise they will bypass resource
arbitration.

We always hardcode the field to its maximum value, instead of attempting
to calculate an approximate usage. In testing, there were no benefits to
using anything other than the maximum.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Jason Ekstrand 59fb59ad54 nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Samuel Pitoiset 341529dbee radv: use optimal packet order for draws
Ported from RadeonSI. The time where shaders are idle should
be shorter now. This can give a little boost, like +6% with
the dynamicubo Vulkan demo.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 20:07:53 +02:00
Samuel Pitoiset af6985b309 radv: add radv_emit_shaders_prefetch()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 20:07:53 +02:00
Samuel Pitoiset 0d85f4a9e2 radv: add radv_emit_shader_prefetch()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 20:07:53 +02:00
Fredrik Höglund e2053b8e3d radv: don't flush the VS when srcStageMask == TOP_OF_PIPE_BIT
The Vulkan specification says:

   "... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_-
    PIPE_BIT in the source stage mask will effectively not wait for
    any prior commands to complete."

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 11:37:51 +02:00
Samuel Pitoiset 565c22158f radv: mark total_count as MAYBE_UNUSED in CmdSet{Viewport,Scissor}
Fixes two compilation warnings in release build. Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 11:22:19 +02:00
Samuel Pitoiset c8f2b73656 radv: rename radv_cmd_buffer_flush_state() to radv_draw()
Similar to the dispatch codepath.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:16 +02:00
Samuel Pitoiset 9e45e5c9fd radv: emit primitive restart from radv_emit_draw_registers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:14 +02:00
Samuel Pitoiset 93207a8e89 radv: add radv_emit_draw_registers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:12 +02:00
Samuel Pitoiset 9466856456 radv: refactor indirect draws (+count buffer) with radv_draw_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:11 +02:00
Samuel Pitoiset 75dfab24a2 radv: refactor indirect draws with radv_draw_info
Indirect draws with a count buffer will be refactored in a
separate patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:08 +02:00
Samuel Pitoiset 03afa95d9f radv: refactor simple and indexed draws with radv_draw_info
Similar to the dispatch compute logic but for draw calls. For
convenience, indirect draws will be converted in a separate
patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:05 +02:00
Samuel Pitoiset 54fa635f82 radv: re-emit VGT_INDEX_TYPE because non-indexed draws overwrite it
Only on CIK and later. We should only update VGT_INDEX_TYPE but
it seems easier to re-emit all the index buffer packets.

Fixes: 966d66f28f (radv: do not re-emit the index buffer for every draw call)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:40:01 +02:00
Samuel Pitoiset eae46f192e radv: clear the dirty flags in the corresponding emit helpers
This will allow us to fix the VGT_INDEX_TYPE issue properly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:39:28 +02:00
Samuel Pitoiset 68cd3564a0 radv: rename RADV_CMD_DIRTY_RENDER_TARGETS to RADV_CMD_DIRTY_FRAMEBUFFER
To be consistent with the emit function name.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:39:26 +02:00
Samuel Pitoiset 94e69f4141 radv: move DB_COUNT_CONTROL initialization to si_emit_config()
CLEAR_STATE will initialize DB_COUNT_CONTROL to 0 for CIK+.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:38:11 +02:00
Bas Nieuwenhuizen 6bc42855f9 radv: enable GS on GFX9
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 07:14:00 +01:00
Bas Nieuwenhuizen 73749caf0e radv: calculate and emit GFX9 GS registers to pipeline state.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:47 +01:00
Bas Nieuwenhuizen f82797b56d radv: Only emit TES when it exists.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:14:14 +01:00
Bas Nieuwenhuizen 6e21b7a294 radv: Use control shader presence for detecting tess.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:11:10 +01:00
Dave Airlie 5bc5e07d81 radv: fixup tess eval shader when combined.
This fixes some access to the tess eval shader when it's combined
with geometry on gfx9.

This is a review of Bas's commit:
radv: Prevent crashing by accessing TES for VGT reuse depth.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:11:10 +01:00
Bas Nieuwenhuizen e6acc20b6a radv: Set VGT_GS_MODE properly for gfx9
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 05:55:11 +01:00
Dave Airlie 99281c1e8f radv: ensure correct outinfo is picked.
This struct used to rely on being in a union, it isn't anymore,
so we have to pick the correct outinfo struct now.

This should fix a regression since the union became a struct.

dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set

Fixes: 6078a3bd51 (ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 14:44:09 +10:00
Bas Nieuwenhuizen ffaf4d608a radv: Enable tessellation shaders for GFX9.
It mostly works now.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:43 +02:00
Dave Airlie 14978a1c3b radv: drop unused r600_htile_info.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 00:38:57 +01:00
Dave Airlie c8eb3558cc radv: fix CLEAR_STATE packet length.
Looking at shader traces I noticed some registers were missing,
one of them was being eaten by the wrong clear state length.

Fixes: 4f42ea4dc (radv: use CLEAR_STATE for initializing some registers)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-19 23:56:48 +01:00
Timothy Arceri 087e010b2b radv: copy indirect lowering settings from radeonsi
It looks the original indirect mask was probably copied from
ANV.

Sascha Willems demo results:

tessellation ~4000 -> ~4200 fps

V2: continue lowering local indirects due to llvm deficiencies.

Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 08:01:26 +11:00
Timothy Arceri 5549b47d7b radv: stop redundant setting of active_stages
We already set it when above in the nir compilation loop.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 08:01:26 +11:00
Bas Nieuwenhuizen 228325f4b7 radv: Modify rsrc1/rsrc2 generation for merged tess.
No OC_LDS_EN for HS, and the included LS vgpr_comp_cnt is at
a different offset.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:44 +02:00
Bas Nieuwenhuizen 8250efb90a radv: Set correct registers for merged shader rings.
We need different regs to end up in s0/s1.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:39 +02:00
Bas Nieuwenhuizen 6a074f87be radv: Add GFX9 HS emitting code.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:34 +02:00
Bas Nieuwenhuizen b096245030 radv: Remove remaining hard coded references to VS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:31 +02:00
Bas Nieuwenhuizen 91b033f4f6 radv: Update GFX9 user data regs for GS/tess.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:27 +02:00
Bas Nieuwenhuizen ce03c119ce radv: Add code to compile merged shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:25:23 +02:00
Bas Nieuwenhuizen a996ed1f9b ac/nir: Change interface to allow multiple source shaders.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-19 22:24:47 +02:00
Samuel Pitoiset 535aa43df0 radv: reset dirty flags after flushing all states
Move it to radv_cmd_buffer_flush_state() because if
rasterizerDiscardEnable is true, the flags are not cleared.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 21:21:48 +02:00
Samuel Pitoiset 966d66f28f radv: do not re-emit the index buffer for every draw call
It can only be changed when CmdBindIndexBuffer() is called
or when a secondary buffer is used. Though not always, but
let's re-emit the packets in this situation for now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 21:21:43 +02:00
Samuel Pitoiset e5480be0d1 radv: remove useless mask operation in radv_cs_emit_draw_indexed_packet()
This saves few CPU cycles when CmdDrawIndexed() is used a lot.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 21:21:30 +02:00
Bas Nieuwenhuizen fa226e9933 radv: Do not read from the disk cache with RADV_DEBUG=nocache.
Otherwise the flag is borderline useless.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-18 20:37:10 +02:00
Alex Smith 2cccc74f56 radv: Set active_stages after getting cached shaders
Fixes: 7d45d22fdd ("radv: switch to using radv_create_shaders()")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 20:37:10 +02:00
Alex Smith f557673237 radv: Don't free NIR shaders if tracing
Fixes a crash while generating a hang report.

Fixes: 7d45d22fdd ("radv: switch to using radv_create_shaders()")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-18 20:37:10 +02:00