Commit Graph

1460 Commits

Author SHA1 Message Date
Bas Nieuwenhuizen 2c5b43c87f ac/nir: Fix nir_texop_lod on GFX for 1D arrays.
Fixes: 1bcb953e16 'radv: handle GFX9 1D textures'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-23 00:27:44 +02:00
Dave Airlie da9c3cd3ee radv/ac/nir: only emit tess factors to storage if tes reads them
Otherwise we just need to write them to the tf ring.

this seems to improve the tessellation demo on Bonarie
~2190->~2230 fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-23 07:10:29 +10:00
Bas Nieuwenhuizen 6ce550453f radv: Don't use vgpr indexing for outputs on GFX9.
Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.*
tests.

Fixes: e38685cc62 'Revert "radv: disable support for VEGA for now."'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-22 02:36:37 +02:00
Bas Nieuwenhuizen ad727b96b6 ac/nir: Account for compact array index in GS input load from LDS.
Mirrors the vram path.

Fixes: d4ecc3c929 'ac/nir: Add loading from LDS for merged GS.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:40 +02:00
Bas Nieuwenhuizen 67648c0faa radv: Don't compile shaders when they are cached already.
When the gs_copy_shader is NULL (due to an incomplete cache), but
the main shaders are found, we still do the nir, but we shouldn't
compile the shaders again. For merged shaders we should also account
for the missing shaders.

Fixes: ce03c119ce 'radv: Add code to compile merged shaders.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:34 +02:00
Bas Nieuwenhuizen 3bf954b28e radv: Don't check for max GL GS invocations.
We specify 127 instead of 32 as the limit in vulkan.

Fixes: 6bc42855f9 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 22:29:09 +02:00
Bas Nieuwenhuizen 050f7e2df2 radv: Don't explicitly reference vertex shader for draw_id.
With merged shaders the vertex shader may not exist. This got in
because the offending patch was written before merged shaders were
upstream, but committed after.

Fixes: 75dfab24a2 'radv: refactor indirect draws with radv_draw_info'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-21 20:00:22 +02:00
Bas Nieuwenhuizen 20fb15bfe4 radv: Don't reset cmd_buffer->state.dirty.
Otherwise for non-indexed draws we set and immediately unset
RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should
clear their own bit, this is unnecessary.

Fixes: 341529dbee 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-21 20:00:16 +02:00
Bas Nieuwenhuizen fb55477990 radv: Correctly detect changed shaders for vertex descriptors.
As they were emitted after the new pipeline, the changed pipeline
detection was not working anymore.

Fixes: 341529dbee 'radv: use optimal packet order for draws'
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-21 19:59:44 +02:00
Bas Nieuwenhuizen 24fe4e6143 ac/nir: Set larged wrokgroup size for GS on GFX9.
They don't take a single wave anymore and we need the barriers.

Fixes: 6bc42855f9 'radv: enable GS on GFX9'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 12:46:44 +02:00
Bas Nieuwenhuizen 9e82f2b3ea ac/nir: Take the max workgroup size of all provided shaders.
Fixes: ffaf4d608a 'radv: Enable tessellation shaders for  GFX9.'
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-21 12:46:28 +02:00
Alex Smith 0fdd531457 radv: Fix pipeline cache locking issues
Need to lock around the whole process of retrieving cached shaders, and
around GetPipelineCacheData.

This fixes GPU hangs observed when creating multiple pipelines in
parallel, which appeared to be due to invalid shader code being pulled
from the cache.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 03:52:43 +02:00
Andres Rodriguez a2c6fbb3ee radv: disable implicit sync for radv allocated bos v3
Implicit sync kicks in when a buffer is used by two different amdgpu
contexts simultaneously. Jobs that use explicit synchronization
mechanisms end up needlessly waiting to be scheduled for long periods
of time in order to achieve serialized execution.

This patch disables implicit synchronization for all radv allocations
except for wsi bos. The only systems that require implicit
synchronization are DRI2/3 and PRIME.

v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC
v3: Add drm version check (Bas)

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:15:54 +02:00
Andres Rodriguez eff2bdbd82 radv: factor out radv_alloc_memory
This allows us to pass extra parameters to the memory allocation
operation that are not defined in the vulkan spec. This is useful for
internal usage.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:15:49 +02:00
Andres Rodriguez 92724338ba radv: Expose VK_EXT_global_priority
Expose the extension string as supported

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez 9f7edf4d1f radv: don't skip PS/VS partial flush
This patch helps lower high priority compute latency. Found by
bisecting a perf regression on computeparticles with high priority
compute queues enabled.

Reverting this micro-optimization doesn't seem to have any negative
effect on performance on Dota2 or ssao.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez fd04f3eb86 radv: Implement VK_EXT_global_priority
This extension allows the caller to change a queue's system wide
priority. This is useful for applications with specific
latency constraints.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Andres Rodriguez 986c4b0bd4 radv: hardcode shader WAVE_LIMIT to the maximum value
When WAVE_LIMIT is set, a submission will opt-in for SPI based resource
scheduling. Because this mechanism is cooperative, we must ensure that
all submissions have this field set, otherwise they will bypass resource
arbitration.

We always hardcode the field to its maximum value, instead of attempting
to calculate an approximate usage. In testing, there were no benefits to
using anything other than the maximum.

Signed-off-by: Andres Rodriguez <andresx7@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-21 01:01:44 +02:00
Jason Ekstrand 59fb59ad54 nir: Get rid of nir_shader::stage
It's redundant with nir_shader::info::stage.

Acked-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2017-10-20 12:49:17 -07:00
Samuel Pitoiset 341529dbee radv: use optimal packet order for draws
Ported from RadeonSI. The time where shaders are idle should
be shorter now. This can give a little boost, like +6% with
the dynamicubo Vulkan demo.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 20:07:53 +02:00
Samuel Pitoiset af6985b309 radv: add radv_emit_shaders_prefetch()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 20:07:53 +02:00
Samuel Pitoiset 0d85f4a9e2 radv: add radv_emit_shader_prefetch()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 20:07:53 +02:00
Fredrik Höglund e2053b8e3d radv: don't flush the VS when srcStageMask == TOP_OF_PIPE_BIT
The Vulkan specification says:

   "... an execution dependency with only VK_PIPELINE_STAGE_TOP_OF_-
    PIPE_BIT in the source stage mask will effectively not wait for
    any prior commands to complete."

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 11:37:51 +02:00
Samuel Pitoiset 565c22158f radv: mark total_count as MAYBE_UNUSED in CmdSet{Viewport,Scissor}
Fixes two compilation warnings in release build. Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-10-20 11:22:19 +02:00
Samuel Pitoiset c8f2b73656 radv: rename radv_cmd_buffer_flush_state() to radv_draw()
Similar to the dispatch codepath.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:16 +02:00
Samuel Pitoiset 9e45e5c9fd radv: emit primitive restart from radv_emit_draw_registers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:14 +02:00
Samuel Pitoiset 93207a8e89 radv: add radv_emit_draw_registers()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:12 +02:00
Samuel Pitoiset 9466856456 radv: refactor indirect draws (+count buffer) with radv_draw_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:11 +02:00
Samuel Pitoiset 75dfab24a2 radv: refactor indirect draws with radv_draw_info
Indirect draws with a count buffer will be refactored in a
separate patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:08 +02:00
Samuel Pitoiset 03afa95d9f radv: refactor simple and indexed draws with radv_draw_info
Similar to the dispatch compute logic but for draw calls. For
convenience, indirect draws will be converted in a separate
patch.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 11:20:05 +02:00
Samuel Pitoiset 54fa635f82 radv: re-emit VGT_INDEX_TYPE because non-indexed draws overwrite it
Only on CIK and later. We should only update VGT_INDEX_TYPE but
it seems easier to re-emit all the index buffer packets.

Fixes: 966d66f28f (radv: do not re-emit the index buffer for every draw call)
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:40:01 +02:00
Samuel Pitoiset eae46f192e radv: clear the dirty flags in the corresponding emit helpers
This will allow us to fix the VGT_INDEX_TYPE issue properly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:39:28 +02:00
Samuel Pitoiset 68cd3564a0 radv: rename RADV_CMD_DIRTY_RENDER_TARGETS to RADV_CMD_DIRTY_FRAMEBUFFER
To be consistent with the emit function name.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:39:26 +02:00
Samuel Pitoiset 94e69f4141 radv: move DB_COUNT_CONTROL initialization to si_emit_config()
CLEAR_STATE will initialize DB_COUNT_CONTROL to 0 for CIK+.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 10:38:11 +02:00
Bas Nieuwenhuizen 6bc42855f9 radv: enable GS on GFX9
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 07:14:00 +01:00
Bas Nieuwenhuizen 73749caf0e radv: calculate and emit GFX9 GS registers to pipeline state.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:47 +01:00
Bas Nieuwenhuizen 9961ae2447 ac/nir: Fix up GS input vgprs.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:37 +01:00
Bas Nieuwenhuizen d4ecc3c929 ac/nir: Add loading from LDS for merged GS.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:29 +01:00
Bas Nieuwenhuizen ec53e52742 ac/nir: Add ES output to LDS for GFX9.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:18 +01:00
Bas Nieuwenhuizen 3e77333030 ac/nir: Add merged GS function.
[airlied: merged fixup + and fixed up a couple more bits].

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:23:14 +01:00
Bas Nieuwenhuizen f82797b56d radv: Only emit TES when it exists.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:14:14 +01:00
Bas Nieuwenhuizen 6e21b7a294 radv: Use control shader presence for detecting tess.
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:11:10 +01:00
Dave Airlie 5bc5e07d81 radv: fixup tess eval shader when combined.
This fixes some access to the tess eval shader when it's combined
with geometry on gfx9.

This is a review of Bas's commit:
radv: Prevent crashing by accessing TES for VGT reuse depth.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 06:11:10 +01:00
Bas Nieuwenhuizen e6acc20b6a radv: Set VGT_GS_MODE properly for gfx9
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 05:55:11 +01:00
Dave Airlie 99281c1e8f radv: ensure correct outinfo is picked.
This struct used to rely on being in a union, it isn't anymore,
so we have to pick the correct outinfo struct now.

This should fix a regression since the union became a struct.

dEQP-VK.tessellation.geometry_interaction.point_size.vertex_set_geometry_set

Fixes: 6078a3bd51 (ac/nir: Allow ac_shader_variant_info to contain info about multiple stages.)
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 14:44:09 +10:00
Bas Nieuwenhuizen ffaf4d608a radv: Enable tessellation shaders for GFX9.
It mostly works now.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:43 +02:00
Dave Airlie 1dda214d9c ac/nir: init full exec mask for merged shaders.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-10-20 01:50:40 +02:00
Dave Airlie 14978a1c3b radv: drop unused r600_htile_info.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-20 00:38:57 +01:00
Dave Airlie c8eb3558cc radv: fix CLEAR_STATE packet length.
Looking at shader traces I noticed some registers were missing,
one of them was being eaten by the wrong clear state length.

Fixes: 4f42ea4dc (radv: use CLEAR_STATE for initializing some registers)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-10-19 23:56:48 +01:00
Timothy Arceri 087e010b2b radv: copy indirect lowering settings from radeonsi
It looks the original indirect mask was probably copied from
ANV.

Sascha Willems demo results:

tessellation ~4000 -> ~4200 fps

V2: continue lowering local indirects due to llvm deficiencies.

Tested-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-10-20 08:01:26 +11:00