Commit Graph

307 Commits

Author SHA1 Message Date
Bas Nieuwenhuizen 2e86f6b259 radv: Add multiview clears.
v2: Use for_each_bit.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-24 19:20:47 +02:00
Dave Airlie 9c080100d3 radv/gfx9: emit sx_mrt_blend registers
GFX9 needs the SX MRT blend registers programmed, port over
the code from radeonsi to workout the values from the blend
state, and program the registers on rbplus systems.

This fixes lots of:
dEQP-VK.pipeline.blend.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:14 +01:00
Dave Airlie 864eb18527 radv: bump space check for indexed draw.
For the GFX9 packet we need one more dword.

Fixes an assert in:
dEQP-VK.draw.shader_draw_parameters.base_vertex.draw_indexed

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:14 +01:00
Dave Airlie d987b4ab9e radv/gfx9: fixup db/stencil disable.
This fixes disabled Z/stencil.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-24 01:14:14 +01:00
David Airlie 674ecbfef2 radv: emit db_htile_surface reg on gfx9 as well
This is also a GFX9 register.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.2" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-16 05:54:09 +10:00
Dave Airlie 82ba384c10 radv: force cs/ps/l2 flush at end of command stream. (v2)
This seems like a workaround, but we don't see the bug on CIK/VI.

On SI with the dEQP-VK.memory.pipeline_barrier.host_read_transfer_dst.*
tests, when one tests complete, the first flush at the start of the next
test causes a VM fault as we've destroyed the VM, but we end up flushing
the compute shader then, and it must still be in the process of doing
something.

Could also be a kernel difference between SI and CIK.

v2: hit this with a bigger hammer. This fixes a bunch of hangs
in the vk cts with the robustness tests.

Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-08-09 23:19:15 +01:00
Bas Nieuwenhuizen c9d4b571ad radv: Add suballocation for shaders.
This reduces the number of BOs that we need for the BO lists during
a submission.

Currently uses a fairly simple linear search for finding free space,
that could eventually be improved to a binary tree, which with some
per-node info could make a check for space O(1) and finding it O(log n),
in the number of buffers in that slab.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-08-03 00:45:13 +02:00
Nicolai Hähnle 7de445377c ac/nir,radv: move force_persample to ac_shader_info::force_persample
Avoid accessing radv-specific structures during the meat of NIR-to-LLVM
translation.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-07-31 14:55:43 +02:00
Bas Nieuwenhuizen ea08a296fe radv: Handle VK_ATTACHMENT_UNUSED in color attachments.
This just sets them to INVALID COLOR,  instead of shifting the
attachments together.

This also fixes a number of cases where we use it first and only
then check if it is VK_ATTACHMENT_UNUSED.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-07-24 01:50:52 +02:00
Dave Airlie 9ee67467c9 radv: predicate cmask eliminate when using DCC.
When using DCC some clear values don't require a cmask eliminate
step. This patch adds support for black and black with alpha 1,
there are other values, but I don't have access to a comprehensive list.

This works by setting the cmask eliminate predicate when doing the
fast clear, and later when doing the cmask elimination making sure
the draws are predicated.

This increases the fps on Sascha Willems deferred.

Tonga: 580fps->670fps on a Tonga PRO card.
Polaris 730->850fps

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:44:43 +01:00
Dave Airlie b86f86f55c radv: allow clear merging for depth/stencil with no care stencil
Some of the Sascha Willems demos pick a D32/S8 format for the depth
buffer, then do a LOAD_OP_CLEAR/LOAD_OP_DONT_CARE on it, which means
we don't get to merge the undefined->depth and clear htile transitions.

This add the stencil aspect to the pending clears if there is a depth
clear pending and the stencil aspect is don't care.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-17 01:16:59 +01:00
Dave Airlie a6c2001ace radv: add support for cmd predication.
This doesn't get used yet, it just adds support to various PKT3
emissions to enable it later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-07-06 02:06:49 +01:00
Dave Airlie 6a68170c83 radv: handle primitive id input into fragment shader with no geom shader
Fixes:
dEQP-VK.pipeline.framebuffer_attachment.no_attachments
dEQP-VK.pipeline.framebuffer_attachment.no_attachments_ms

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-26 08:45:30 +10:00
Dave Airlie 9cce302951 radv: move assert down in radv_bind_descriptor_set
coverity complains about the deref before NULL check.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-13 10:01:36 +10:00
Grazvydas Ignotas f56aa25ac5 radv: don't even attempt to prefetch on SI
Before bcae327469 this was emitting CP DMA packet even on SI, but
apparently hasn't caused too many problems. After that commit the
CP DMA code now always sets the CIK+ only bit for prefetch. Just
follow radeonsi there and don't try to prefetch at all.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101334
Fixes: bcae327469 "radv: realign cp dma code with radeonsi"
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-06-11 14:28:40 +03:00
Dave Airlie 00fe30f376 radv: move lots of index related things into the bind.
This just moves lots of stuff to the bind stage rather than
dealing with it in the draw stage.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 10:24:37 +10:00
Dave Airlie 734ea16bdb radv: move calculating the vertex sgpr to the pipeline.
There is no need to calculate this at draw time.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 10:24:36 +10:00
Dave Airlie 3f48021b86 radv: rename and make global some functions.
I want to use these in the pipeline setup stage.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-07 10:24:36 +10:00
Bas Nieuwenhuizen d607b83b79 radv: Split out updating the vertex descriptors.
Simple refactor.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Bas Nieuwenhuizen 58c8aae241 radv: Move pipeline stuff from flush_state to emit_graphics_pipeline.
No functional changes.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Bas Nieuwenhuizen 4ec89727b2 radv: Remove vertex_descriptors_dirty.
Redundant.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Bas Nieuwenhuizen fe0b8d1e8b radv: Don't use a divide by index_size.
Divides are pretty slow, and this is in the hot path of a draw.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-06-06 23:23:43 +02:00
Dave Airlie 0e72dea46f radv: fix write event eop on vega.
Typo here, fixes command submission hangs on vega
2017-06-06 10:43:19 +10:00
Dave Airlie 348f63623b radv: misc GFX9 changes.
These are just some register changes ported from radeonsi for gfx9.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:44:10 +10:00
Dave Airlie 289de9f945 radv: add some GFX9 specific events.
These are ported from radeonsi, don't know all the rules for
when they should be inserted.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:44:00 +10:00
Dave Airlie 5c8f8cae3e radv: add IA_MULTI_VGT_PARAM support for GFX9.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:55 +10:00
Dave Airlie 67655cb24f radv: add rb+ support for GFX9
This adds some rb+ support, as on GFX9 we have to disable
it as per radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:45 +10:00
Dave Airlie c2fbeb7ca0 radv: add GFX9 cache flushing support.
GFX9 needs to write event EOP to a fence buffer, allocate some
space for this, and just write an ever increasing number to it,
this isn't exactly what radeonsi does, but it seems to work.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:40 +10:00
Dave Airlie 87b3799493 radv: add GFX9 to initialisation cmd buffer.
This just adds support for initialising some GFX9 registers,
and handles the different init for the VGT reuse reg.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:35 +10:00
Dave Airlie 41eba750ba radv: add gfx9 depth/stencil surface support.
This is ported from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:27 +10:00
Dave Airlie ac3e18916f radv: add GFX9 support for color surfaces.
This is ported from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:24 +10:00
Dave Airlie 0063da8393 radv: add some misc gfx9 pieces.
This just adds the strings and includes the gfx9 register defs
in some files that we need them in.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 09:43:21 +10:00
Dave Airlie b50ab49723 radv: use radv_foreach_stage in a couple of places.
This just collapses a few per-stage things into a loop,
shouldn't affect anything.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-06 08:20:22 +10:00
Bas Nieuwenhuizen 4415a46be2 radv: Dirty all descriptors sets when changing the pipeline.
Sets could have been ignored during previous descriptor set flush
due to the shader not using them and therefore no SGPR being assigned.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: ae61ddabe8 "radv: move userdata sgpr ownership to compiler side."
2017-06-03 22:24:37 +02:00
Bas Nieuwenhuizen 5fb8bb3065 radv: Set both compute and graphics SGPRS on descriptor set flush.
We clear the descriptors_dirty array afterwards, so the SGPRs for
the other pipeline don't get updated on the flush for that other
draw/dispatch, so we have to make sure we do it immediately.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: ae61ddabe8 "radv: move userdata sgpr ownership to compiler side."
2017-06-03 22:24:37 +02:00
Dave Airlie ad61eac250 radv: factor out eop event writing code. (v2)
In prep for GFX9 refactor some of the eop event writing code
out.

This changes behaviour, but aligns with what radeonsi does,
it does double emits on CIK/VI, whereas previously it only
did this on CIK.

v2: bump the size checks.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-02 12:48:56 +10:00
Dave Airlie 7205431e73 radv: factor out si_emit_wait_fence code.
This code was in a few places, consolidate into one.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-02 12:48:20 +10:00
Bas Nieuwenhuizen af2844116f radv: Revert HTILE reset word to 0xFFFFFFFF.
0x30f regressed mad max.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Acked-by: Dave Airlie <airlied@redhat.com>
Fixes: df91abfe5a "radv: Use correct clear words for HTILE."
2017-05-31 23:55:13 +02:00
Bas Nieuwenhuizen 18efb404cf radv: Reserve space for descriptor and push constant user SGPR setting.
flush_compute_state doesn't reserve a large chunk, so these need their own reservation.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: f4e499ec79 "radv: add initial non-conformant radv vulkan driver"
2017-05-29 22:30:39 +02:00
Bas Nieuwenhuizen df91abfe5a radv: Use correct clear words for HTILE.
Did some RE'ing what several HTILE words give when read from a descriptor
with HTILE compression enabled.

Seems to align with -pro usage for D16 too.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-22 20:07:21 +02:00
Bas Nieuwenhuizen 0b26f0ee4f radv: Add queue masks for htile usage determination.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-22 20:07:21 +02:00
Bas Nieuwenhuizen 0628580eff radv: Specify semantics of HTILE layout helpers.
And correct implementation to specify only what we support.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-22 20:07:21 +02:00
Bas Nieuwenhuizen 62e182acd0 radv: Don't use a separate can_expclear.
We never use EXPCLEAR clears.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-05-22 20:07:21 +02:00
Dave Airlie 823e9ea8a1 radv: drop resolve hack workarounds
This drops the resolve workarounds that change an image
tiling mode behinds it's back, this is horrible and breaks
the image_view->image relationship. Remove all this.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-07 23:41:39 +01:00
Fredrik Höglund 5ff4858111 radv/meta: fix restoring a push descriptor set
radv_bind_descriptor_set cannot be used to bind a push descriptor set
since a push descriptor set does not have a buffer list. However,
there is no need to add the buffers again when restoring a set, so
this fix is also an optimization.

Cc: "17.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-05-06 01:46:18 +02:00
Bas Nieuwenhuizen 9e847eedd5 radv: Don't set dynamic state for pipelines with rasterizer dicard.
All of the dynamic states apply to rasterization & fragment processing,
so we don't need to set them if we don't rasterize.

We don't clear the dirty flags for them though, so we don't miss any
updates for the next pipeline with rasterization.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes: 76603aa90b "radv: Drop the default viewport when 0 viewports are given."
2017-05-03 00:12:56 +02:00
Dave Airlie 052487be4c radv: remove some members of radeon surface.
We would be storing this info twice per image, no need to,
remove it from the surface struct.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 06:00:35 +10:00
Dave Airlie 7e8d0a402b radv: move some image info into a separate struct.
This is to rework the surface code like radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-05-03 06:00:17 +10:00
Bas Nieuwenhuizen e137b9eed9 radv: Use the correct pipeline for dispatches.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Fixes: ec15e0d30 "radv: optimise compute shader grid size emission."
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-22 20:26:59 +01:00
Bas Nieuwenhuizen 0e91d8f38c radv: Prefetch compute shader too.
For consistency, doesn't really impact performance.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-21 00:59:02 +02:00
Bas Nieuwenhuizen 1e1165389c radv: Add shader prefetch.
Gives me approximately a 2% perf increase in bot dota2 & talos.

Having descriptors (both sets and vertex buffers) prefetched
didn't help so I didn't include that.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-19 23:47:27 +02:00
Dave Airlie fd420a7417 radv: add support for 32 descriptor sets.
This bumps the limit to the number of sets to 32, now that
we have proper support for it. It also uses 1u in a few places
to make things a bit safer.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:43 +10:00
Dave Airlie 25a5ee391d radv/ac: add support for indirect access of descriptor sets.
We want to expose more descriptor sets to the applications,
but currently we have a 1:1 mapping between shader descriptor
sets and 2 user sgprs, limiting us to 4 per stage. This commit
check if we don't have enough user sgprs for the number of
bound sets for this shader, we can ask for them to be indirected.

Two sgprs are then used to point to a buffer or 64-bit pointers
to the number of allocated descriptor sets. All shaders point
to the same buffer.

We can use some user sgprs to inline one or two descriptor sets
in future, but until we have a workload that needs this I don't
 think we should spend too much time on it.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:43 +10:00
Dave Airlie ec15e0d301 radv: optimise compute shader grid size emission.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:42 +10:00
Dave Airlie 31174069d2 radv: start conditionalising vertex inputs. (v2)
In practice this will probably just drop draw id in a few places.

v2: just do draw_id for now. (Bas)
it might be possible to do something more if we need it in the
future. (nha)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:42 +10:00
Dave Airlie 224cf2906a radv/ac: add initial pre-pass for shader info gathering
There is some radv specific info we need to gather from shaders
before we get into converting nir->llvm, so we can make
better decisions especially around user sgpr allocation.

This is just an initial placeholder to gather if sample positions
are required in the frag shader.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-19 09:00:42 +10:00
Fredrik Höglund f95caae504 radv: add private push descriptors for meta
This allows meta to use push descriptors without disturbing user
push descriptors.

radv_meta_push_descriptor_set differs from vkCmdPushDescriptorSetKHR
in that partial updates are not supported; all descriptors used in
subsequent draw commands must be pushed at the same time.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-14 23:21:24 +02:00
Bas Nieuwenhuizen 4f7fb25d4e radv: Add more trace points.
Most trace points happen after an operation, so add a trace point
at the start of the command buffer.

Furthermore, add one after a CmdUpdateBuffer using CP_DMA as that
didn't emit one yet.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-04-13 16:06:47 +02:00
Alex Smith 4603bea1aa radv: Disable primitive restart for non-indexed draws
According to the Vulkan spec, VkPipelineInputAssemblyStateCreateInfo's
primitiveRestartEnable flag should only apply to indexed draws, however
it was being enabled regardless of the type of draw. This could cause
problems for non-indexed draws with >=65535 vertices if the previous
indexed draw used 16-bit indices.

Fixes corruption of the credits text in Mad Max.

v2: Reset primitive restart state after executing a secondary command
    buffer.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-12 20:58:41 +02:00
Fredrik Höglund fd0f539e60 radv: don't call radeon_check_space in radv_BindDescriptorSets
This appears to be a leftover from an earlier version of this function.
Nothing is emitted into the CS.

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-07 00:54:46 +02:00
Fredrik Höglund c1f8c83cb6 radv: implement VK_KHR_descriptor_update_template
All offsets and strides are precomputed by
radv_CreateDescriptorUpdateTemplateKHR and stored in the template.

v2: Move the new struct declarations from radv_descriptor_set.h
    to radv_private.h (Bas)

Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-07 00:54:46 +02:00
Fredrik Höglund c6487bc48b radv: implement VK_KHR_push_descriptor
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-04-07 00:54:46 +02:00
Dave Airlie 1171b304f3 radv: overhaul fragment shader sample positions.
The current code was broken, and I decided to redesign it instead.

This puts the sample positions for all samples into the queue
constant descriptor buffer after all the spill/ring descriptors.

It then uses a single offset register to point how far into the
samples the samples for num_samples are. This saves one user sgpr
and means we only generate the sample position data in the rare
single case where we need it currently.

This doesn't fix the failing CTS tests without the followup
fix.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-04 05:55:15 +10:00
Dave Airlie b4495b71c6 radv/cmd: emit tessellation state.
This emits the tessellation shaders and state to the command stream.

It contains the logic to emit the LS/HS shaders.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:16:57 +10:00
Dave Airlie aeb49bc2b9 radv: port polaris vgt vertex reuse workaround.
This ports the VGT_VERTEX_REUSE register settings
for Polaris GPUs from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:51 +10:00
Dave Airlie 46e52df34d radv: add tessellation ring allocation support. (v2)
This patch adds support for the offchip rings for storing
tessellation factors and attribute data.

It includes the register setup for the TF ring

v2: always do tess ring size calcs (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:30 +10:00
Dave Airlie a4b039db04 radv: add tess shader stage user data support.
This just adds support for tess to the shader stage conversion
and emits the per-stage descriptors/constants for tess stages.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-04-01 07:15:15 +10:00
Bas Nieuwenhuizen 0f3de89a56 radv: Use the guard band.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-30 22:21:14 +02:00
Bas Nieuwenhuizen 8a53e6e4c5 radv: Prepare for not using the guard band for lines & points.
Vulkan Clipping is defined in terms of vertices, the scissor based
clipping happens on pixels. There is a difference with points and
lines, as a vertex can be outside the viewport while some pixels are in.
On Vulkan thoise pixels shouldn't be drawn, while they would be with
the guardband.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-30 22:21:14 +02:00
Dave Airlie 93d61e4945 radv: only emit ps_input_cntl is we have any to output
Otherwise we get GPU hangs.

Reported-by: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 20:12:10 +01:00
Dave Airlie 239a9224a3 radv: move shader stages calculation to pipeline.
With tess this becomes a bit more complex. so move to pipeline
for now.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:33 +10:00
Dave Airlie 0232ea8025 radv: move pa_cl_vs_out_cntl calculation to pipeline
This also takes the side band setting code from radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:29 +10:00
Dave Airlie 92e9c14a6a radv: move calculating fragment shader i/os to pipeline.
There is no need to calculate this on each command submit.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:20 +10:00
Dave Airlie 4b467c759e radv: move shader_z_format calculation to pipeline.
No need to recalculate this every time.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:17 +10:00
Dave Airlie 8996fdbf61 radv: move db_shader_control calculation to pipeline.
There is no need to recalculate this every time.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:14 +10:00
Dave Airlie cd33a5c1cb radv: move vgt_gs_mode value to pipeline.
No need to recalculate this everytime.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:40:08 +10:00
Dave Airlie 931a8d0c9a radv: rework vertex/export shader output handling
In order to faciliate adding tess support, split the vs/es
output info into a separate block, so we make it easier to
have the tess shaders export the same info.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:39:59 +10:00
Dave Airlie ae0551b4b3 radv: fix ia_multi_vgt_param for instanced vs indirect draw.
The logic was different than radeonsi, fix it up before adding
tess support.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-28 17:39:55 +10:00
Bas Nieuwenhuizen a8c51b1cd9 radv: flush DB cache before and after HTILE decompress.
It reads @ writes the DB cache, and we haven't flushed dst caches yet,
so DB cache may be stale. Also the user might be shader read (and probably is),
so also flush after.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
CC: <mesa-stable@lists.freedesktop.org>
Fixes: f4e499ec79 ("radv: add initial non-conformant radv vulkan driver")
2017-03-28 02:51:40 +02:00
Alex Smith bc5d587a80 radv: Invalidate L2 for TRANSFER_WRITE barriers
CP DMA and PKT3_WRITE_DATA (in CmdUpdateBuffer) don't (currently) write
through L2. Therefore, to make these writes visible to later accesses
we must invalidate L2 rather than just writing it back, to avoid the
possibility that stale data is read through L2.

Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-23 09:20:31 +10:00
Dave Airlie d06e168b87 radv: fix primitive reset index emission
This was meant to be checking the index type to get the correct
index not the last emitted one. This fixes:
dEQP-VK.pipeline.input_assembly.primitive_restart.index_type_uint32.triangle_strip_with_adjacency

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-03-20 08:47:03 +10:00
Alex Smith c19607d59d radv: Reinitialise loaderMagic when allocating a cached command buffer
This must be set to ICD_LOADER_MAGIC by vkAllocateCommandBuffers, which
was being done when allocating a new buffer but not when reusing an
existing one in the cache. This would hit an assertion and crash in
debug builds of the Vulkan loader.

Fixes: 682248db45 ("radv: Cache command buffers in command pool.")
Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-13 23:42:36 +01:00
Bas Nieuwenhuizen 8700329785 radv: Don't emit cache flushes on subpass switch.
I think we should only flush right before an action (draw/dispatch etc.),
as otherwise it is too easy to issue redundant flushes.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:23 +01:00
Bas Nieuwenhuizen 9251f8b35e radv: Only flush for the needed stages, and before the flushes.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:19 +01:00
Bas Nieuwenhuizen f92a118434 radv: Don't invalidate CB/DB for images that aren't modified outside CB/DB.
Without stores, the only writes are fast clears, transfers and metadata
initialization, each of which have the appropiate invalidations already.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:14 +01:00
Bas Nieuwenhuizen 0567ab0407 radv: Flush more caches after writes.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:10 +01:00
Bas Nieuwenhuizen 7a600bbc81 radv: Don't flush for fixed-function reading.
The data should always be in memory after a src flush.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:05 +01:00
Bas Nieuwenhuizen dd094e4ff9 radv: Invalidate the correct caches for CB/DB dst barriers.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:35:01 +01:00
Bas Nieuwenhuizen b075eb7d47 radv: Determine cache flushes per object.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-09 02:34:42 +01:00
Fredrik Höglund 0941d1a574 radv: fix the dynamic buffer index in vkCmdBindDescriptorSets
This fixes the wrong dynamic buffer descriptors being updated when
firstSet > 0.

Cc: 17.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-07 20:23:04 +01:00
Bas Nieuwenhuizen 6424795f52 radv: Use the subresource range in HTILE initialization.
v2: fix levelCount assert.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-07 09:58:33 +01:00
Bas Nieuwenhuizen 3b455c1cb7 radv: Use winsys HTILE info.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-07 09:58:27 +01:00
Alex Smith 290d7e892d radv: Emit pending flushes before executing a secondary command buffer
If we have any pending flushes on the primary command buffer, these
must be performed before executing the secondary buffer.

This fixes potential corruption when the contents of a subpass which
clears any of its render targets are given in a secondary buffer: the
flushes after a fast clear would not have been performed until the
vkCmdEndRenderPass call.

Signed-off-by: Alex Smith <asmith@feralinteractive.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-03-06 19:46:14 +01:00
Bas Nieuwenhuizen f3dc318464 radv: Use the new L2 writeback flag.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 09:16:05 +01:00
Grazvydas Ignotas a5446e3187 radv: check for upload alloc failure
Mainly to avoid gcc's complains about uninitialized ptr and offset use
later in that code.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-03-06 00:10:42 +01:00
Bas Nieuwenhuizen 682248db45 radv: Cache command buffers in command pool.
So that we don't keep allocating BOs for the IBs and upload buffers.

We run some risk of memory increase with e.g. a bimodal size
distribution of command buffers, but I haven't noticed a significant
increase with dota2 and talos.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-03-06 00:07:51 +01:00
Bas Nieuwenhuizen bb878db7eb radv: Reset emitted compute pipeline when calling secondary cmd buffer.
Otherwise if the new compute pipeline is the same as the last used
pipeline before the call, we don't emit it again.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Cc: 13.0 17.0 <mesa-stable@lists.freedesktop.org>
2017-02-27 01:33:10 +01:00
Dave Airlie ccb70d6f53 radv: add sample mask output support
This adds support to write to sample mask from the fragment shader.

We can optimise this later like radeonsi.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:53 +10:00
Dave Airlie 58c97a0791 radv: enable location at sample when persample is forced.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:30 +10:00
Dave Airlie fc430c391b radv: fix interpolation at wrong place for offset interp
The code was interpolating at the offset from the sample,
not the offset from the center. Also fix for persample interpolation
modes we should force the pixel center to be at the sample.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-24 10:31:19 +10:00
Dave Airlie 40e0dbf96c radv: fix typo in the subpass barrier patch.
Fixes: dbb0eaccc radv: handle subpass cache flushes

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-22 02:22:30 +00:00
Bas Nieuwenhuizen 8cff852ae2 radv: Don't flush at the start of a command buffer.
The preamble flushes now and the rest is the responsibility of the app.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-02-21 09:20:03 +01:00
Dave Airlie 6dbb0eaccc radv: handle subpass cache flushes
This splits out the cache flush bit setting code
dependent on the src/dest access flags.

It then calls it from the subpass barrier code.

It also marks a TODO to remove the aggressive CS/PS
flushes at some point.

This fixes a bunch of the
dEQP-VK.renderpass.attachment_allocation.input_output.*
tests.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-21 09:48:37 +10:00
Dave Airlie 1f6376935b Revert "radv: detect command buffers that do no work and drop them (v2)"
This just keeps popping up minor problems and regressions we should
revisit in a more sustainable manner later.

This also reverts:
Revert "radv: query cmds should mark a cmd buffer as having draws."
Revert "radv: also fixup event emission to not get culled."

This reverts commit d1640e7932.
This reverts commit 8b47b97215.
This reverts commit b4b19afebe.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-20 09:00:40 +10:00
Dave Airlie 9aec76aca3 radv: handle layered fast clears.
This iterates the fast clear flush across the layers in the
specified range.

It also moves the compute resolve flush into the function
and builds the range in there.

This fixes:
dEQP-VK.geometry.layered.* regressions since fast clears.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-02-19 20:30:01 +10:00
Dave Airlie efc89edf5a radv: pass subresourceRange by pointer.
This struct is 5 dwords, we should really just pass a pointer
to it.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-19 20:28:22 +10:00
Bas Nieuwenhuizen c7fcaf2314 radv: Invert ring SGPR check.
I assume this wants to check if all pipelines use the same SGPR for
the rings.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-19 10:13:11 +01:00
Dave Airlie b4b19afebe radv: also fixup event emission to not get culled.
This is possibly a bad idea, I might have to consider a better one.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-15 00:36:30 +00:00
Dave Airlie 3360dbe0c1 radv: fixup IA_MULTI_VGT_PARAM handling.
This ports the remains of the workarounds from radeonsi for
the non-TESS cases. It should provide equivalent workarounds
for hawaii and bonarie.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-14 20:29:19 +00:00
Dave Airlie 592069c1fb radv: use indirect buffer for initial gfx state.
This puts the common gfx state for the device into an
indirect buffer, and just calls out to it, on CIK and above.

This is taken from what radeonsi does.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:02:45 +00:00
Dave Airlie 604e562e5b radv: don't pass physical device to si_init_ fns.
This is just a trivial cleanup.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:02:06 +00:00
Dave Airlie 8b47b97215 radv: detect command buffers that do no work and drop them (v2)
If a buffer is just full of flushes we flush things on command
buffer submission, so don't bother submitting these.

This will reduce some CPU overhead on dota2, which submits a fair
few command streams that don't end up drawing anything.

v2: reorganise loop to count first then malloc,
rename some vars (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-13 20:00:28 +00:00
Dave Airlie cda9f3d8ec radv: handle VK_QUEUE_FAMILY_IGNORED in image transitions (v3)
The CTS tests at least are using this, and we were totally
ignoring it.

This hopefully fixes the bouncing multisample CTS tests.

v2: get family mask in ignored case from command buffer.
v3: only change things in one place, use logic from Bas.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-02-02 08:25:04 +10:00
Bas Nieuwenhuizen cf8a11c1ba radv: Pass draw index to shader.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
2017-02-01 19:49:40 +01:00
Dave Airlie ca822e1b7c radv: handle layer export from vs->fs properly
Fixes:
dEQP-VK.geometry.layered.1d_array.fragment_layer

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:49 +10:00
Dave Airlie c9c8ae1fd3 radv: emit esgs itemsize register.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:46 +10:00
Dave Airlie 77ec78669a radv: handle prim id inputs to fragment shader.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:41 +10:00
Dave Airlie 105ce24d46 radv: emit geometry shaders to hardware
This emits the compiled geometry shader and other state registers.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:37 +10:00
Dave Airlie 1fa5b755c2 radv: emit geometry ring size and pointers via preamble (v2)
This uses the scratch infrastructure to handle the esgs
and gsvs rings.

(this replaces the old code that did this with patching).

v2: fix correct ring sizes, reset sizes (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:30:19 +10:00
Dave Airlie 68a77411e1 radv: emit vertex shader to correct hw block.
This emits the shader to the ES block in the correct case.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:27 +10:00
Dave Airlie b941a88e01 radv: extend shader stage code to cover geometry shaders.
This enables the paths for setting up user ptrs to vs/es and gs.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:29:20 +10:00
Dave Airlie ecb8a34910 radv: move hw vertex shader emit to separate function
This is to later allow ES shaders to be emitted.

Review-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 09:28:46 +10:00
Robert Foss 0b63f47030 radv: Prevent Coverity warning
Prevent Coverity seeing potential errors when src is
no initialized in the switch case.

Coverity-Id: 1396397
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-30 23:59:22 +01:00
Dave Airlie a1c1ba7d56 radv: handle transfer_write as a dst flag.
It appears we can get image barriers like:
    srcStageMask:                   VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
    dstStageMask:                   VkPipelineStageFlags = 4096 (VK_PIPELINE_STAGE_TRANSFER_BIT)
    dependencyFlags:                VkDependencyFlags = 0
    memoryBarrierCount:             uint32_t = 0
    pMemoryBarriers:                const VkMemoryBarrier* = NULL
    bufferMemoryBarrierCount:       uint32_t = 0
    pBufferMemoryBarriers:          const VkBufferMemoryBarrier* = NULL
    imageMemoryBarrierCount:        uint32_t = 1
    pImageMemoryBarriers:           const VkImageMemoryBarrier* = 0x7ffc882367b0
        pImageMemoryBarriers[0]:        const VkImageMemoryBarrier = 0x7ffc882367b0:
            sType:                          VkStructureType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER (45)
            pNext:                          const void* = NULL
            srcAccessMask:                  VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
            dstAccessMask:                  VkAccessFlags = 4096 (VK_ACCESS_TRANSFER_WRITE_BIT)
            oldLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL (7)
            newLayout:                      VkImageLayout = VK_IMAGE_LAYOUT_GENERAL (1)
            srcQueueFamilyIndex:            uint32_t = 4294967295
            dstQueueFamilyIndex:            uint32_t = 4294967295
            image:                          VkImage = 0x2df55e0
            subresourceRange:               VkImageSubresourceRange = 0x7ffc882367e0:
                aspectMask:                     VkImageAspectFlags = 1 (VK_IMAGE_ASPECT_COLOR_BIT)
                baseMipLevel:                   uint32_t = 0
                levelCount:                     uint32_t = 1
                baseArrayLayer:                 uint32_t = 0
                layerCount:                     uint32_t = 1

This fixes all the CTS dEQP-VK.memory.pipeline_barrier.transfer_dst tests here,
not sure if this is a too large hammer.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-31 04:42:21 +10:00
Andres Rodriguez e199a993b2 radv: vkAllocateCommandBuffers should NULL all output handles
This is part of the spec and fixes CTS tests:
dEQP-VK.api.object_management.alloc_callback_fail_multiple.command_buffer_*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-30 08:38:13 +01:00
Andres Rodriguez ec0f5c005c radv: add trim command pool stub
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2017-01-30 08:37:54 +01:00
Bas Nieuwenhuizen ccff93e138 radv: Track scratch usage across pipelines & command buffers.
Based on code written by Dave Airlie.

Signed-off-by: Bas Nieuwenhuizen <basni@oogle.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-01-30 02:07:16 +01:00
Dave Airlie 2ab2be092d radv: program a default point size.
Along the lines of what
3b804819 anv: Default PointSize to 1.0 if not written by the shader
does for anv, program a default point size in the hw of 1.0.

This preempt fixes a bunch of geom shader tests.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "17.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-25 09:58:38 +10:00
Dave Airlie aac562f112 radv: disable vertex reuse when writing viewport index
This fixes some issues we'd hit later if using viewport
indexes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-18 08:04:11 +10:00
Dave Airlie 6b635bbe16 radv: add support for writing layer/viewport index (v2)
This just adds the infrastructure to allow writing layer
and viewport index. It's just a first patch out of the geom
shader tree, and doesn't do much on its own.

v2: add missing if statement change (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-01-18 06:20:44 +10:00
Bas Nieuwenhuizen 8406f79d6a radv: Get physical device from radv_device instead of the instance.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-01-16 22:15:22 +01:00
Bas Nieuwenhuizen 97dfff5410 radv: Dump command buffer on hang.
v2:
  - Now use the filename specified by RADV_TRACE_FILE env var.
  - Use the same var to enable tracing.

I thought we could as well always set the filename explicitly
instead of having some arbitrary defaults, and at that point
we don't need a separate feature enable.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2017-01-09 21:44:03 +01:00
Bas Nieuwenhuizen 059af2515a radv: Also skip DCC clear flushes for compute.
(airlied: fixes DOOM hang with compute queue enabled)
Reviewed-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
2016-12-27 03:27:13 +00:00
Dave Airlie 9d23b8a18e radv: flush smem for uniform buffer bit.
(cc'ing stable as I'd like to backport the ubo speedup as well)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-21 22:31:14 +00:00
Bas Nieuwenhuizen accc5fc026 radv: Don't enable CMASK on compute queues.
We can't fast clear on compute queues.

Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-12-18 20:52:41 +01:00
Bas Nieuwenhuizen 9b0efc98ba radv: Implement indirect dispatch for the MEC.
Signed-off-by: Bas Nieuwenhuizen <basni@google.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-12-18 20:52:33 +01:00
Dave Airlie d0e6fb0574 radv: init compute queue and avoid initing transfer queues
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-12-18 20:52:26 +01:00
Dave Airlie 94a7434bbc radv: Store queue family in command buffers.
v2: Added helper (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-12-18 20:52:15 +01:00
Grazvydas Ignotas 9bff2c9884 radv: fix release build unused variable warnings
Just mark with MAYBE_UNUSED.

Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2016-12-10 21:46:56 +01:00
Dave Airlie ae61ddabe8 radv: move userdata sgpr ownership to compiler side.
This isn't fully what we want yet, but is a good step on the way.

This allows the compiler to create the information structures
for the state setting side, however the state setting still expects
things to be pretty much in 2 sgpr wide register sets, and can't
handle the indirect setting yet.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-07 23:25:49 +00:00
Dave Airlie 221ab77956 radv: refactor out the constant setting user sgpr code.
This just refactors out some common code to make future changes
easier to understand.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-07 23:25:45 +00:00
Dave Airlie 11208f0049 radv: refactor out the descriptor user sgpr setting.
This just splits some common code into a utility function.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-07 23:25:42 +00:00
Dave Airlie a74a4edc90 radv: only bind descriptor sets to stages that need them
This copies the push constant code and only binds descriptor
sets to the stages that need them. It also now has to dirty
descriptors on pipeline binds.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-07 23:25:38 +00:00
Dave Airlie 85118a1e4d radv: move descriptor set userdata emission to draw flush time.
This is another step towards having the compiler decide the
user sgpr layout.

This still emits the descriptors sets for all shader types, but
we will fix this later.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-07 23:25:28 +00:00
Dave Airlie a5d10844ee radv: refactor descriptor set userdata emission out.
This just moves this into a separate function.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-07 23:25:18 +00:00
Dave Airlie f847676990 radv: pass pipeline to constant flush function
I'll need this later rather than just the layout.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-07 23:25:15 +00:00
Dave Airlie eb2ba5c8df radv: consolidate compute pipeline flushing (v1.1)
This just moves some common code into a utility function
to avoid having to change multiple places later.

v1.1: rename function to better reflect what it does. (Bas)

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-12-07 23:24:53 +00:00
Dave Airlie 048143b9d9 radv: set spi_baryc_cntl.pos_float_location to 0
This fixes:
dEQP-VK.pipeline.multisample_interpolation.offset_interpolate_at_sample_position.*

This should probably be 2 when sample shading is enabled, but I'm
not sure.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-11-29 22:48:23 +00:00
Dave Airlie f3a3fea973 radv: force persample shading when required.
We need to force persample shading when
a) shader uses sample_id
b) shader uses sample_position
c) shader uses sample qualifier.

Also since ps_iter_samples can now change independently of the
rasterizer samples we need to move setting the regs more often.

This fixes:
dEQP-VK.pipeline.multisample_interpolation.centroid_interpolate_at_consistency.*
dEQP-VK.pipeline.multisample_interpolation.centroid_qualifier_inside_primitive.137_191_1.*
dEQP-VK.pipeline.multisample_interpolation.sample_interpolate_at_distinct_values.*
dEQP-VK.pipeline.multisample_interpolation.sample_qualifier_distinct_values.128_128_1.*

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-11-29 22:48:03 +00:00
Fredrik Höglund 28c781b574 radv: add support for VK_AMD_draw_indirect_count
Reviewed-by: Dave Airlie <airlied@redhat.com>
2016-11-24 08:19:27 +10:00
Dave Airlie ea417f5335 radv: move pipeline barrier image transitions after src flushing
This seems like it would conform better with the spec.

noticed while digging into fast clears.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-11-23 10:16:34 +10:00
Dave Airlie 51a44c0021 radv: make sure to flush input attachments correctly.
This fixes 9 of the
dEQP-VK.renderpass.attachment_allocation.input_output.*
tests.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-11-21 08:57:31 +10:00
Dave Airlie a969548f59 radv: allow cmask transitions without fast clear
This fixes
dEQP-VK.pipeline.multisample.sampled_image*

These all render to multisampled image, and then
sample from it, so we must transition it correctly,
since we have a cmask and fmask this will cause
the correct transition.

Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-24 11:03:09 +10:00
Dave Airlie 86c4575a81 radv: decompress fmask before reading using texture unit
Before we can read the fmask using the compute shader, we need
to decompress the fmask in place.

This fixes a bunch of remaining failure and hopefully multisampling
in Talos.
2016-10-19 17:39:47 +10:00
Dave Airlie b0e11a153c radv: start using defines for the user sgpr offsets
This adds some comments and adds defines for the user sgprs,
so that we can move them around easier later and not have
to change/revalidate every one of these.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-19 10:17:48 +10:00
Dave Airlie 4450f40519 radv: move to using shared vk_alloc inlines.
This moves to the shared vk_alloc inlines for vulkan
memory allocations.

Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-19 09:05:26 +10:00
Dave Airlie f4e499ec79 radv: add initial non-conformant radv vulkan driver
This squashes all the radv development up until now into
one for merging.

History can be found:
https://github.com/airlied/mesa/tree/semi-interesting

This requires llvm 3.9 and is in no way considered
a conformant vulkan implementation. It can run a number
of vulkan applications, and supports all GPUs using
the amdgpu kernel driver.

Thanks to Intel for providing anv and spirv->nir,
and Emil Velikov for reviewing build integration.

Parts of this are:
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Acked-by: Edward O'Callaghan <funfunctor@folklore1984.net>

Authors: Bas Nieuwenhuizen and Dave Airlie
Signed-off-by: Dave Airlie <airlied@redhat.com>
2016-10-07 09:16:09 +10:00