Commit Graph

564 Commits

Author SHA1 Message Date
Samuel Pitoiset 24db276d11 radv/sqtt: describe pipeline and wait events barriers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
2020-03-10 10:05:40 +01:00
Samuel Pitoiset ac0d5b6b11 radv/sqtt: describe render pass color/depthstencil clears
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
2020-03-10 10:05:40 +01:00
Samuel Pitoiset b829fbb7f0 radv/sqtt: describe draw/dispatch and emit event markers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
2020-03-10 10:05:40 +01:00
Samuel Pitoiset dcfc08f5b8 radv/sqtt: describe begin/end command buffers with user markers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
2020-03-10 09:58:02 +01:00
Samuel Pitoiset be700775dc radv/sqtt: add a helper that emits thread trace userdata markers
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4031>
2020-03-10 09:57:56 +01:00
Samuel Pitoiset d555794f30 radv: update entrypoints generation from ANV
It's a massive rework loosely based on ANV. This introduces separate
dispatch tables for the instance, physical device and device objects.

This will help for implementing internal driver layers for SQTT.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3930>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3930>
2020-03-02 11:51:43 +00:00
Samuel Pitoiset 79d4d2807f radv/sqtt: add support for GFX10
All SQTT registers were moved to privileged space on GFX10, to emit
them we need a workaround with COPY_DATA.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4018>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4018>
2020-03-02 12:23:43 +01:00
Samuel Pitoiset bf16ff3172 radv: allow to capture SQTT traces with RADV_THREAD_TRACE=<start_frame>
This is pretty basic (and a bit crappy at the moment). I think we
might want some sort of overlay in the future and also be able to
trigger captures with F12 or whatever.

To record a capture, set RADV_THREAD_TRACE to something greater than
zero (eg. RADV_THREAD_TRACE=100 will capture frame #100). If the
driver didn't crash (or the GPU didn't hang), the capture file
should be stored in /tmp.

To open that capture, use Radeon GPU Profiler and enjoy your
profiling times with RADV! \o/

Note that thread trace support is quite experimental, only GFX9 is
supported at the moment, and a bunch of useful stuff are still missing
(shader ISA, pipelines info, etc). More is comming soon.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
2020-02-28 08:11:11 +01:00
Samuel Pitoiset ed0c852243 radv: add initial SQTT files generation support
SQTT is also a file format (.rgp extension) that can be consumed
by Radeon GPU Profiler for profiling purposes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
2020-02-28 08:11:07 +01:00
Samuel Pitoiset 768d4f0551 radv: add initial SQ Thread Trace support for GFX9
SQTT is a hardware block that collects thread trace data (like
wave occupancy, timings, etc) for every draw/dispatch calls.

It's only supported on GFX9 at the moment but I will add other
generations support soon.

This is the first step towards profiling with RADV!

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
2020-02-28 08:10:55 +01:00
Samuel Pitoiset 94099ee642 radv: add a small helper that allows to submit internal CS
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3900>
2020-02-28 08:10:53 +01:00
Samuel Pitoiset b9e0947a9e radv: remove unused RADV_HASH_SHADER_IS_GEOM_COPY_SHADER
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3789>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3789>
2020-02-13 08:09:13 +00:00
Samuel Pitoiset b2531370c9 radv: remove RADV_DEBUG=nosisched and RADV_PERFTEST=sisched
They are no longer useful.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3789>
2020-02-13 08:09:13 +00:00
Samuel Pitoiset 556c940149 radv: implement VK_EXT_line_rasterization
Only Bresenham lines are supported. GFX9 is currently disabled
because there is some CTS failures for some weird reasons.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2982>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/2982>
2020-02-13 08:14:01 +01:00
Bas Nieuwenhuizen 5b335e1599 radv: Do not redundantly set the RB+ regs on pipeline switch.
No significant perf changes seen on Bayonetta. (Changes are in the
noise on my Raven Laptop)

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3735>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3735>
2020-02-11 04:39:42 +00:00
Samuel Pitoiset e4752dafed radv/gfx10: implement NGG GS queries
The number of generated primitives is only counted by the hardware
if GS uses the legacy path. For NGG GS, we need to accumulate that
value in the NGG GS itself. To achieve that, we use a plain GDS
atomic operation.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
2020-01-29 17:40:48 +01:00
Samuel Pitoiset 3c1f657f35 radv/gfx10: add a separate flag for creating a GDS OA buffer
For implementing NGG GS queries, we decided to use GDS but GDS OA
is only required for NGG streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3380>
2020-01-29 17:40:46 +01:00
Samuel Pitoiset b537be4368 radv: update VK_KHR_depth_stencil_resolve for Vulkan 1.2
Promoted to Vulkan 1.2 with the KHR suffix omitted.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-15 08:42:25 -06:00
Samuel Pitoiset 0758f645d0 radv/gfx10: determine if a pipeline is eligible for NGG passthrough
It can't be enabled for geometry shaders, for NGG streamout and
for vertex shaders that export the primitive ID. NGG passthrough
requires that LDS isn't used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2020-01-13 08:14:40 +01:00
Bas Nieuwenhuizen 7cc0702bbb radv: Emit a BATCH_BREAK when changing pixel shaders or CB_TARGET_MASK.
Fixes a hang on Raven with Resident Evil 2.

I did not find anything more restricted to fix it:

- Setting persistent_states_per_bin to 1 fixes it too,
  but likely does an internal break on any descriptor set changes
  too.
- Only breaking the batch when cb_target_mask changes does not fix
  it (and looking at AMDVLK comments, I suspect the code in radeonsi
  should really be doing a FLUSH_DFSM).
- Always doing a FLUSH_DFSM on shader switch helps, but that is more
  often than this and I don't think we should be doing that when DFSM
  is disabled.
- Also emitting the existing break on framebuffer change when DFSM is
  disabled does not fix the issue.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2315
CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2020-01-07 22:44:31 +01:00
Samuel Pitoiset 7bbf497b68 radv: record number of color/depth samples for each subpass
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3018>
2020-01-03 12:31:53 +00:00
Eric Engestrom 51569e525a amd: fix empty-body issues
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-12-27 22:09:00 +00:00
Samuel Pitoiset e4c8491bdf radv: implement VK_KHR_separate_depth_stencil_layouts
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-12-10 13:16:17 +01:00
Samuel Pitoiset 905c005561 radv: create decompress pipelines for separate depth/stencil layouts
No functional changes as the driver still uses the depth+stencil
pipeline.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-25 16:29:21 +01:00
Connor Abbott 66c703b3e8 radv: Move argument declaration out of nir_to_llvm
Now it's executed for ACO too.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-25 14:17:51 +01:00
Timothy Arceri ef54f15da9 radv: add some infrastructure for fresh forks for each secure compile
In the following commits we want to be able to fork an existing lightweight
fork created at device creation time. In order for the user facing process
to communicate with this new fresh fork we create some members here to hold
FIFO file descriptors and a unique id.

Here we also add a new fork enum that we use to tell the lightweight
process to create a fresh fork.

For more information on why we create a fresh fork see the following
commits.
2019-11-25 10:10:14 +11:00
Bas Nieuwenhuizen 4eb2a1dc6f radv: Do not change scratch settings while shaders are active.
When the scratch ringbuffer settings are changed, the shader unit has
to be idle or we will have shaders using old and new settings.

That combination is not supported on the HW (likely the offset is
ringbuffer idx * WAVESIZE * 1024).

CC: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-11-20 01:18:36 +00:00
Samuel Pitoiset 1ebd9459e7 radv: implement VK_AMD_device_coherent_memory
This extension adds the device coherent and device uncached memory
types. It's known to be slower than non-device coherent memory but
it might be useful for debugging.

This is only exposed for chips that support L2 uncached.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-18 08:20:19 +00:00
Samuel Pitoiset 519d9b30de radv: remove useless RADV_DEBUG=unsafemath debug option
This option is useless and shouldn't be used at all.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-15 09:07:34 +01:00
Samuel Pitoiset fb07fd4e6c radv: implement VK_EXT_subgroup_size_control
This extension allows to control the subgroup size by allowing a
varying subgroup size and also specifying a required subgroup size.

This implementation only allows to specify a required subgroup
size for compute shaders because there is some caveats with
other shader stages (eg. NGG with geometry shader). This
basically allows apps to use Wave32 for compute shaders.

This extension is enabled for all chips but only GFX10 supports
Wave32. ACO doesn't support it.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-11-06 09:20:39 +01:00
Bas Nieuwenhuizen 4aa75bb3bd radv: Add wait-before-submit support for timelines.
This is actually a non-threaded implementation. I'd summarize this
as event-based submission.

When submit happens we walk a tree of submissions that depend on
the syncobj signal operations to be submitted and if those submission
we no other dependencies we start to execute them immediately.

Or, well I still use a list to avoid issues with long chains and
the stacksize when using recursion.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen 88d41367b8 radv: Add timelines with a VK_KHR_timeline_semaphore impl.
This does not fully do wait-before-submit, to be done in a follow
up patch.

For kernels without support for timeline syncobjs, this adds an
implementation of non-shareable timelines using legacy syncobjs.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Bas Nieuwenhuizen c3eae659e7 radv: Split semaphore into two parts as enum+union.
This is in preparation to adding more types.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-30 11:57:07 +01:00
Timothy Arceri 28fff3efbc radv: add radv_sc_read() helper
This is a function with timeout support for reading from the pipe
between processes used for secure compile.

Initially we hardcode the timeout to 5 seconds. We can adjust the
timeout limit in future if needed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-30 04:49:58 +00:00
Eric Engestrom c2430f3edc radv: fix empty-body instruction
Fixes: 8d43e2b2de ("meson: add -Werror=empty-body to disallow `if(x);`")
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-10-27 22:10:31 +00:00
Timothy Arceri 5d25aee005 radv: add radv_device_use_secure_compile() helper
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri d33f2165c9 radv: add some new members to radv device and instance for secure compile
These will be used by the following commits to hold information about
the forked secure compile processes.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Timothy Arceri e8cb13d499 radv: add radv_secure_compile_type enum
This will be used to identify information being passed between the
parent and secure process during a secure compile.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-26 13:04:12 +11:00
Samuel Pitoiset f11ea22666 radv: fix a performance regression with graphics depth/stencil clears
I recently changed the slow depth/stencil clear path to make sure
depth values are explicitly exported by the fragment shader. This
is actually only useful when VK_EXT_depth_range_unrestricted is
enabled.

While this path is correct, it introduced a performance regression
with Heroes of the Storm, Shadow of Mordor (Vulkan beta) and
probably more titles. This is because it prevents the hardware
to do some optimizations like discarding fragments.

This commit re-introduces the previous (a bit faster) slow
depth/stencil clear path and it selects the unrestricted path
only if VK_EXT_depth_range_unrestricted is enabled.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/863
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 10:23:47 +02:00
Samuel Pitoiset f4ab58c1a0 radv: do not create meta pipelines with 16 samples
The driver only supports up to 8 samples, so it's useless to
create more pipelines than needed.

This fixes a conditional jump reported by Valgrind on GFX10:

==194282== Conditional jump or move depends on uninitialised value(s)
==194282==    at 0xDBF925A: radv_gfx10_compute_bin_size (radv_pipeline.c:3242)
==194282==    by 0xDBF95A6: radv_pipeline_generate_binning_state (radv_pipeline.c:3334)
==194282==    by 0xDBFC1A0: radv_pipeline_generate_pm4 (radv_pipeline.c:4440)
==194282==    by 0xDBFD15E: radv_pipeline_init (radv_pipeline.c:4764)
==194282==    by 0xDBFD23E: radv_graphics_pipeline_create (radv_pipeline.c:4788)
==194282==    by 0xDBB95A3: create_pipeline (radv_meta_clear.c:114)
==194282==    by 0xDBB9AC5: create_color_pipeline (radv_meta_clear.c:297)
==194282==    by 0xDBBCF05: radv_device_init_meta_clear_state (radv_meta_clear.c:1277)
==194282==    by 0xDB9ACD9: radv_device_init_meta (radv_meta.c:363)
==194282==    by 0xDB7FE3A: radv_CreateDevice (radv_device.c:2080

This is caused by an out of bound access of 'fmask_array' (ie. index
is 4 as for 16 samples).

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-23 08:33:08 +02:00
Samuel Pitoiset ea92273cea radv: fix DCC fast clear code for intensity formats
This fixes a rendering issue with DiRT 4 on GFX10. Only GFX10 was
affected because intensity formats are different.

Cc: 19.2 <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1923
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-14 08:36:14 +02:00
Bas Nieuwenhuizen dad047a56a radv: Expose image handle compat types for Android handles.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 1b0ceba925 radv: Allow Android image binding.
Using delayed layout of images.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 83a012b603 radv/android: Add android hardware buffer import/export.
Support does not include images yet.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen adad61239c radv: Deal with Android external formats.
To abstract things a bit, this adds a helper function in radv_android.c.
However, this means we have to link in radv_android.c on non-android as
well, which means some scaffolding changes.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 041fc7beb8 radv: Derive android usage from create flags.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen a34e4dd0d2 radv/android: Add android hardware buffer field to device memory.
You cannot go from BO to Android hardware buffer, so for export we
have to remember it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 9ea72b5337 radv: Add VK_ANDROID_external_memory_android_hardware_buffer.
Still disabled but now we can add entrypoints.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Samuel Pitoiset 9d17d97ee4 radv: use a compute shader for copying timestamp query results
When the timestamp is not ready (ie. UINT64_MAX), the availabily bit
should be zero. The previous code used to copy the timestamp value
as the availabily bit and that's completely wrong.

Because it's not that simple to emit a conditional with the CP, the
driver now uses a compute shader for copying timestamp query results.

Fixes dEQP-VK.pipeline.timestamp.misc_tests.reset_query_before_copy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-10 13:23:22 +02:00
Samuel Pitoiset e19d1ee2d1 radv/gfx10: fix NGG streamout with triangle strips for VS
The number of vertices has to be adjusted with the output primitive
type.

This fixes dEQP-VK.transform_feedback.simple.triangle_strip_*.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:09:35 +02:00
Samuel Pitoiset 683c5e27c7 radv/gfx10: add radv_device::use_ngg
Trivial.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-02 18:06:01 +02:00
Daniel Schürmann a70a998718 radv/aco: Setup alternate path in RADV to support the experimental ACO compiler
LLVM remains default and ACO can be enabled with RADV_PERFTEST=aco.

Co-authored-by: Daniel Schürmann <daniel@schuermann.dev>
Co-authored-by: Rhys Perry <pendingchaos02@gmail.com>

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-19 12:10:00 +02:00
Samuel Pitoiset e1dc3ab753 radv/gfx10: allocate GDS/OA buffer objects for NGG streamout
This allocates two BOs for GFX10 NGG streamout.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Samuel Pitoiset a15b3bcf1a radv/gfx10: add an option to switch from legacy to NGG streamout
This internal option is turned off by default because NGG streamout
still hangs. It seems like it's related to GDS as RadeonSI.

That option will be turned on once all issues are resolved.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-16 12:08:22 +02:00
Lionel Landwerlin 6d5f11ab34 radv: store engine name
We'll use this later for a new driconfig matching parameter.

v2: Avoid leak in device creation error case (Bas)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Cc: 19.2 <mesa-stable@lists.freedesktop.org>
2019-09-15 15:37:02 +03:00
Samuel Pitoiset 8cf297c7b1 radv: do not pass all compiler options to the shader info pass
Only the pipeline layout and the shader keys are needed.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-10 09:26:42 +02:00
Samuel Pitoiset 83499ac765 radv: merge radv_shader_variant_info into radv_shader_info
Having two different structs is useless.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-09-06 15:52:03 +02:00
Samuel Pitoiset 021feb1bf6 ac: add rbplus_allowed to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:41 +02:00
Samuel Pitoiset 20c5db02b5 ac: add has_tc_compat_zrange_bug to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:36 +02:00
Samuel Pitoiset b55919cf2a ac: add has_gfx9_scissor_bug to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:32 +02:00
Samuel Pitoiset 2b9c371575 ac: add cpdma_prefetch_writes_memory to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:29 +02:00
Samuel Pitoiset b027ad66d7 ac: add has_out_of_order_rast to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:26 +02:00
Samuel Pitoiset ed720af46d ac: add has_load_ctx_reg_pkt to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:22 +02:00
Samuel Pitoiset 63c0b89b8f ac: add has_rbplus to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:19 +02:00
Samuel Pitoiset 44a46c09de ac: add has_dcc_constant_encode to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:16 +02:00
Samuel Pitoiset c08401f035 ac: add has_distributed_tess to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:11 +02:00
Samuel Pitoiset d62d2840c4 ac: add has_clear_state to ac_gpu_info
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-08-27 08:04:05 +02:00
Samuel Pitoiset 218ce34962 radv: add mipmap support for the clear depth/stencil values
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:59 +02:00
Samuel Pitoiset e36e260c42 radv: add mipmap support for the TC-compat zrange bug
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-26 15:56:55 +02:00
Bas Nieuwenhuizen d062bec48d radv: Hash Wave32 settings in shader key.
Can result in different shaders.

Fixes: 8a86908e9a "radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders"
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-12 13:32:18 +00:00
Bas Nieuwenhuizen 3a5950f501 radv: Add device argument for dcc compression check.
Because it is about to be generation dependent.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen 8c63ffe54d radv: Disable compression for compute DCC decompress store.
Previously we relied on stores not using DCC but that is going to
change, so disable compression explicitly.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen 216a9d8871 radv: Add extra struct to image view creation.
For extra args. Unlike image creation, I'm not embedding the vk
struct in there, so all the inline structs can be kept.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen 66131ceb8b radv: Pass through render loop detection to internal layout decisions.
And do nothing with it yet.

Everything outside a renderpass has no render loop.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen a171a6663d radv: Add render loop detection in renderpass.
VK spec 7.3:

"Applications must ensure that all accesses to memory that backs
image subresources used as attachments in a given renderpass instance
either happen-before the load operations for those attachments, or
happen-after the store operations for those attachments."

So the only renderloops we can have is with input attachments. Detect
these.

Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-08-07 02:13:07 +02:00
Bas Nieuwenhuizen a7041f3b4e radv: Store image view also outside framebuffer.
So we can use it with imageless framebuffers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:19:16 +02:00
Bas Nieuwenhuizen 49e6c2fb78 radv: Store color/depth surface info in attachment info instead of framebuffer.
That way we can use it for imageless framebuffers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 22:18:51 +02:00
Bas Nieuwenhuizen 72e7b7a00b ac/nir,radv: Optimize bounds check for 64 bit CAS.
When the application does not ask for robust buffer access.

Only implemented the check in radv.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-08-02 21:21:55 +02:00
Samuel Pitoiset e8110e51c6 radv: fix image_has_{cmask,fmask}() helpers
The driver should now rely on cmask_offset because CMASK can be
disabled by the driver for some reasons (eg. mipmaps). Apply the
same change for FMASK, although it should be useless.

Fixes: ad1bc8621d ("radv: remove radv_get_image_fmask_info()")
Fixes: 10d08da52c ("radv/gfx10: add missing dcc_tile_swizzle tweak")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 14:00:50 +02:00
Samuel Pitoiset ad1bc8621d radv: remove radv_get_image_fmask_info()
It's unnecessary to duplicate fields in another struct.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 13:34:46 +02:00
Samuel Pitoiset 9c9745e8dd radv: remove radv_get_image_cmask_info()
It's unnecessary to duplicate fields in another struct.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 13:34:41 +02:00
Samuel Pitoiset 8a86908e9a radv/gfx10: add Wave32 support for vertex, tessellation and geometry shaders
It can be enabled with RADV_PERFTEST=gewave32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 09:37:36 +02:00
Samuel Pitoiset 953bbacc23 radv/gfx10: add Wave32 support for fragment shaders
It can be enabled with RADV_PERFTEST=pswave32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-08-02 09:37:34 +02:00
Samuel Pitoiset ea38565011 radv/gfx10: add Wave32 support for compute shaders
It can be enabled with RADV_PERFTEST=cswave32.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-31 09:35:04 +02:00
Daniel Schürmann 45638e14fb radv: Don't include radv_private.h from radv_shader.h
This patch decouples radv_shader.h from any LLVM dependency.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-30 10:29:11 +02:00
Bas Nieuwenhuizen 4058b354c5 radv: Set FLUSH_ON_BINNING_TRANSITION.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-23 21:26:59 +02:00
Dave Airlie 2ac2b98780 radv: fix crash in shader tracing.
Enabling tracing, and then having a vmfault, can leads to a segfault
before we print out the traces, as if a meta shader is executing
and we don't have the NIR for it.

Just pass the stage and give back a default.

Fixes: 9b9ccee4d6 ("radv: take LDS into account for compute shader occupancy stats")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 11:00:25 +10:00
Samuel Pitoiset 24b1b1f574 radv: add an option for disabling NGG on GFX10
Will be useful for testing the legacy path.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-17 15:43:36 +02:00
Samuel Pitoiset ed53d2c4be radv/gfx10: disable the TC compat zrange workaround
Unnecessary.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-17 08:32:36 +02:00
Samuel Pitoiset ae4b1fc095 radv/gfx10: always build the GS copy shader but uses it on-demand
It should be possible to build it on-demand too but it requires
more work. On GFX10, the GS copy shader is required when tess
is enabled with extreme geometry.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-17 08:32:30 +02:00
Samuel Pitoiset 4dcdc4cdc5 radv: allow to select DST_SEL with RELEASE_MEM
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2019-07-16 11:16:57 +02:00
Samuel Pitoiset 5bbcb3f5bc radv/gfx10: implement support for GS as NGG
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-11 15:45:53 +02:00
Samuel Pitoiset ee21bd7440 radv/gfx10: implement NGG support (VS only)
This needs to be cleaned up a bit, and it probably contains
missing stuff and/or bugs.

This doesn't fix the "half of the triangles" issue.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:51:32 +02:00
Bas Nieuwenhuizen d0978427cb radv/gfx10: Use new uconfig reg index packet for GFX10+.
Otherwise the hardware/firmware seems to not set the registers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-07 17:51:32 +02:00
Samuel Pitoiset 2481ac81d3 radv/gfx10: implement radv_initialise_ds_surface()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:03:38 +02:00
Samuel Pitoiset e80f189de0 radv/gfx10: implement radv_initialise_color_surface()
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:03:38 +02:00
Samuel Pitoiset 3dc5ec5d16 radv/gfx10: generate gfx10_format_table.h
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-07 17:03:38 +02:00
Bas Nieuwenhuizen 726a31df70 radv: Add the concept of radv shader binaries.
This simplifies a bunch of stuff by
(1) Keeping all the things in a single allocation, making things easier
 for the cache.
(2) creating a shader_variant creation helper.

This is immediately put to use by creating rtld shader binaries. This
is the main reason for the binaries, as we need to do the linking at
upload time, i.e. post caching. We do not enable rtld yet.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-04 10:52:26 +00:00
Samuel Pitoiset 8ea7ee1536 radv: rename and re-document cache flush flags
SMEM and VMEM caches are L0 on gfx10. Ported from RadeonSI.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-25 18:38:37 +02:00
Samuel Pitoiset 5411f47056 radv: set DISABLE_CONSTANT_ENCODE_REG to 1 for Raven2
Ported from RadeonSI, will be emitted for GFX10 too.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-06-25 16:45:15 +02:00