Commit Graph

116131 Commits

Author SHA1 Message Date
Bas Nieuwenhuizen 041fc7beb8 radv: Derive android usage from create flags.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 53b1372571 radv: Disallow sparse shared images.
Since we really cannot share them ever.

Also remove an unused switch.

Fixes: b70829708a "radv: Implement VK_KHR_external_memory"
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen f36b52740a radv/android: Add android hardware buffer queries.
Derived from the Intel code.

For the internal format we just use the internal Vulkan format,
as we have Vulkan formats for all android formats we care about.

For the ycbcr properties we just do something. I do not have a real
clue what would be recommended.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen a34e4dd0d2 radv/android: Add android hardware buffer field to device memory.
You cannot go from BO to Android hardware buffer, so for export we
have to remember it.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 9ea72b5337 radv: Add VK_ANDROID_external_memory_android_hardware_buffer.
Still disabled but now we can add entrypoints.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 4a495e1a85 radv: Unset vk_info in radv_image_create_layout.
For better test coverage of this corner case.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 64768111c3 radv: Handle slightly different image dimensions.
The minigbm comment really says it all. We should
fix minigbm as well, but for now this is the more
robust solution.

Note that this only changes width and height for
the surface creation, not for the image and hence
also not for the sampler, where it would wreak
havoc due to the normalized coords.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 852c64ca65 radv: Delay patching for imported images until layout time.
We want this flexibility because in GFX10 we lose any stride fields,
so we have to make sure our width/height are in alignment with
the external image we import.

Furthermore, we need the ability to inject tiling modifiers on import
time which is strictly after create time for Android. So, with the
layout & patch functions being fully independent of pCreateInfo, we
can delay it until import/bind time.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 2ab4d418f9 radv: Split out layout code from image creation.
So we can delay the layout until later in some import cases.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen 825ddfee59 radv: Handle device memory alloc failure with normal free.
Less duplication/complexity.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Bas Nieuwenhuizen e1469c02cf radv: Cleanup buffer_from_fd.
Unused stride/offset args.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 17:02:34 +00:00
Tomeu Vizoso 6397dff6d7 gitlab-ci/lava: Test Lima driver with dEQP
Run dEQP on boards with Mali 400 and 450 in Baylibre's lab.

There's lots of skipped tests because of crashes and undetermined
behavior. May be a good idea to run the tests with valgrind and fix any
issues found.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
2019-10-10 14:50:14 +00:00
Tomeu Vizoso 8a168683d0 gitlab-ci/lava: Use files to list tests to skip
As the non-LAVA runner script does, have per-GPU version files listing
the tests that are to be skipped, due to being very slow, unstable, etc.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Neil Armstrong <narmstrong@baylibre.com>
2019-10-10 14:50:14 +00:00
Rafael Antognolli 01122a78b3 intel/tools: Support multiple contexts in intel_dump_gpu.
Create basic aub_context on GEM_CONTEXT_CREATE.

Set it up and submit a context + ring + pphwsp during execbuf
submission, if it has not been initialized yet.

v2: Write the HWSP only once per engine (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-10 14:08:50 +00:00
Rafael Antognolli 12feafc28e intel/tools: Add basic aub_context code and helpers.
v2:
 - Only dump context if there were no erros (Lionel).
 - Store counter for context handles in aub_file (Lionel).
v3:
 - Add a comment about aub_context -> GEM context (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-10 14:08:50 +00:00
Rafael Antognolli 472de61187 intel/tools: Use common code for GGTT address allocation.
We want to be able to create contexts on demand, and increase the GGTT
as needed for that. Use the aub_map_ggtt() function for that.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-10 14:08:50 +00:00
Rafael Antognolli 9968316ed0 intel/tools: Factor out GGTT allocation.
We want to reuse it in execlists_setup().

v2: Rename it to write_ggtt_ptes() (Lionel).
v3: Rename it to aub_map_ggtt() (Lionel).

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
2019-10-10 14:08:50 +00:00
Bas Nieuwenhuizen a9687c4e05 radv: Implement & enable VK_EXT_texel_buffer_alignment.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-10-10 13:24:16 +00:00
Samuel Pitoiset 9d17d97ee4 radv: use a compute shader for copying timestamp query results
When the timestamp is not ready (ie. UINT64_MAX), the availabily bit
should be zero. The previous code used to copy the timestamp value
as the availabily bit and that's completely wrong.

Because it's not that simple to emit a conditional with the CP, the
driver now uses a compute shader for copying timestamp query results.

Fixes dEQP-VK.pipeline.timestamp.misc_tests.reset_query_before_copy.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-10 13:23:22 +02:00
Samuel Pitoiset dad80eadb2 radv: sync before resetting query pools if timestamps have been written
Otherwise, the GPU might write timestamp queries after the reset
operation. This is similar to other query operations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-10 13:23:20 +02:00
Timur Kristóf aa75be05af aco: Clean up usages of PhysReg::reg from aco_assembler.
These are not needed anymore, since PhyReg has an implicit
conversion operator that can convert it to unsigned int,
which is equivalent to accessing this field.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf d729d8f1dc aco: Add extra assertion for number of FS input VGPRs.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf a89153d038 aco: Fix s_dcache_wb on GFX10.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry 68c9554732 aco: Have s_waitcnt_vscnt write to NULL.
Not sure if this instruction actually writes anything, but LLVM
disassembles a destination and sets it to NULL.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry 619f0a71cc aco: Use the VOP3-only add/sub GFX10 instructions if needed.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry 6a6bef59b0 aco: Initial work to avoid GFX10 hazards.
Currently just breaks up SMEM groups and fixes
FeatureVMEMtoScalarWriteHazard (name from LLVM).

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry d63c175897 aco: pad code with s_code_end on GFX10
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry 83993f535e aco: workaround GFX10 0x3f branch bug
According to LLVM, branches with an offset of 0x3f are buggy.

v2: (by Timur Kristóf)
- extract the GFX10 specific part to its own function

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf 0be1dd8564 aco: Fix VS input VGPRs on GFX10.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry c24cd97515 aco: Assemble opsel in VOP3 instructions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Rhys Perry 818bdab796 aco: Allow literals on VOP3 instructions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2019-10-10 09:57:53 +02:00
Timur Kristóf 7cf1dcf22d aco: Support subvector loops in aco_assembler.
These are currently not used, but could be useful later.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf 21f1953383 aco: Set GFX10 dimensionality on the instructions that need it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf eaa2a7cdf6 aco: Use ac_get_sampler_dim, delete duplicate code.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf 1de9ef9c96 aco: Set GFX10 DLC bit properly.
The DLC bit is now set to 1 for all loads when GLC is also set,
but cleared to 0 for all stores (otherwise it causes issues),
and also cleared to 0 for atomics.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf 89b074be86 aco: Support GFX10 VOP3 and VOP1 as VOP3 in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf d3a48c272f aco: Support GFX10 EXP in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf e6330d71b5 aco: Fix GFX9 FLAT, SCRATCH, GLOBAL instructions, add GFX10 support.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf 64d74ca816 aco: Support GFX10 MIMG and GFX9 D16 in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf c0df15e645 aco: Support GFX10 MTBUF in aco_assembler.
Also remove img_format from aco_ir, since it can be calculated
from dfmt and nfmt. So only the assember needs to deal with it.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:53 +02:00
Timur Kristóf e96124bd65 aco: Link ACO with amd/common.
We'd like to use some functions, for example some
ac_shader_util functions in ACO, so we need to link
ACO to AC.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf c57503b932 amd/common: Add extern "C" to some headers that were missing it.
We'd like to include some of these in C++ code later.
Specifically, ACO is written in C++ and we would like to use
some of this code in ACO in order to avoid code duplication.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf 9e27816252 aco: Support GFX10 MUBUF in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf 6106d4bce9 aco: Support GFX10 DS in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf bbe87eb6c3 aco: Support GFX10 VINTRP in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf b6235651b9 aco: Support GFX10 SMEM in aco_assembler.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf fd1d947457 aco: Add missing GFX10 specific fields and some README notes.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Timur Kristóf a01d796de4 aco: Set +wavefrontsize64 for LLVM disassembler in GFX10 wave64 mode.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
2019-10-10 09:57:52 +02:00
Alejandro Piñeiro fa41a51891 v3d: take into account prim_counts_offset
Specifically when reading the primitive counters.

This fixed ~700 CTS tests using this pattern:
dEQP-GLES3.functional.transform_feedback.*

when run after tests like
dEQP-GLES3.functional.prerequisite.read_pixels on the same
caselist. When run individually those tests were passing because
prim_counts_offset was zero.

Fixes: 0f2d1dfe65 ("v3d: use the GPU to
       record primitives written to transform feedback")

Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2019-10-10 09:51:50 +02:00
Samuel Pitoiset 42b2d1119a radv: get the device name from radeon_info::name
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-10-10 08:15:41 +02:00