Commit Graph

2106 Commits

Author SHA1 Message Date
Jason Ekstrand 32527f3ccc v3dv: Destroy the device mutex on the teardown path
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>
2022-04-13 17:22:14 +00:00
Jason Ekstrand 30191fd9df v3dv: Don't use pthread functions on c11 mutexes
This only works because c11/threads.h is typedeffing the c11 stuff to
ptrheads.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>
2022-04-13 17:22:14 +00:00
Jason Ekstrand 25441b5e5c v3dv: Put indirect compute CSD jobs in the job list
Instead of having the CPU job execute the CSD job, put both jobs on the
list with the CPU job first which modifies the GPU job which gets kicked
off next.  This gives the queue code more visibility into what types of
jobs are actually in the list.  In particular, if an indirect compute
job is the last job in a batch buffer, it currently appears as if the
batch ends with CPU work which isn't true because it kicks off GPU work.
In that case, the last job on the list is now a GPU job, which better
matches reality.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>
2022-04-13 17:22:14 +00:00
Jason Ekstrand 0208bb2d58 v3dv: Stop directly setting vk_device::alloc
vk_device_init() will do this.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>
2022-04-13 17:22:14 +00:00
Timothy Arceri 20ab7046c0 glsl/st: use nir pass to lower indirect rather than GLSL IR
Will allow us to drop more GLSL IR code in future once we switch
all drivers to NIR. Also stops the need for all drivers to call
this pass to remove indirect temps that may have been added during
the NIR varying linking lowering/optimisations.

This patch fixes some tests on i915, d3d12, lima and vc4.

Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15871>
2022-04-12 06:51:20 +00:00
Juan A. Suarez Romero 3ac7383843 ci: enable v3dv arm64 jobs
This reverts commit f567a832ee.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15856>
2022-04-11 15:26:58 +00:00
Mike Blumenkrantz f567a832ee ci: disable v3dv arm64 jobs
these have been broken for almost 48 hours

Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15841>
2022-04-10 12:25:39 -04:00
Iago Toral Quiroga 40f0c616e8 v3dv: fix bogus VkDrmFormatModifierProperties2EXT usage
The array is allocated for VkDrmFormatModifierPropertiesEXT, so
writring entried with type VkDrmFormatModifierProperties2EXT is
bogus.

It seems this was a mistake added with a change intended to get
rid of VK_OUTARRAY_MAKE, that changed the type of the write by
mistake.

Fixes: 56a2ccf058 ('v3dv: Stop using VK_OUTARRAY_MAKE()')

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15819>
2022-04-08 14:43:25 +00:00
Iago Toral Quiroga cf4b3cb563 broadcom/compiler: prefer reconstruction over TMU spills when possible
We have been reconstructing/rematerializing uniforms for a while, but we
can do this in more scenarios, namely instructions which result is
immutable along the execution of a shader across all channels.

By doing this we gain the capacity to eliminate TMU spills which not
only are slower, but can also make us drop to a fallback compilation
strategy.

Shader-db results show a small increase in instruction counts caused
by us now being able to choose preferential compiler strategies that
are intended to reduce TMU latency. In some cases, we are now also
able to avoid dropping thread counts:

total instructions in shared programs: 12658092 -> 12659245 (<.01%)
instructions in affected programs: 75812 -> 76965 (1.52%)
helped: 55
HURT: 107

total threads in shared programs: 416286 -> 416412 (0.03%)
threads in affected programs: 126 -> 252 (100.00%)
helped: 63
HURT: 0

total uniforms in shared programs: 3716916 -> 3716396 (-0.01%)
uniforms in affected programs: 19327 -> 18807 (-2.69%)
helped: 94
HURT: 50

total max-temps in shared programs: 2161796 -> 2161578 (-0.01%)
max-temps in affected programs: 3961 -> 3743 (-5.50%)
helped: 80
HURT: 24

total spills in shared programs: 3274 -> 3266 (-0.24%)
spills in affected programs: 98 -> 90 (-8.16%)
helped: 6
HURT: 0

total fills in shared programs: 4657 -> 4642 (-0.32%)
fills in affected programs: 130 -> 115 (-11.54%)
helped: 6
HURT: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15710>
2022-04-08 05:37:28 +00:00
Jason Ekstrand 292ceb297c v3dv: Enable VK_EXT_debug_utils
It's implemented in common code as long as you use vk_command_buffer.

Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15560>
2022-04-06 01:18:23 +00:00
Omar Akkila 4208895175 ci: bump VK-GL-CTS to 1.3.1.1
Signed-off-by: Omar Akkila <omar.akkila@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15668>
2022-04-04 23:04:33 +00:00
Iago Toral Quiroga 827ef5fba9 v3dv: fix limits for inline uniform blocks
We don't support 'Update After Bind', however, the limits for this
model also include the ones without it. See the with or without remark
in the spec below:

"maxPerStageDescriptorUpdateAfterBindInlineUniformBlocks is similar to
 maxPerStageDescriptorInlineUniformBlocks but counts descriptor bindings
 from descriptor sets created with or without the
 VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT bit set."

Fixes:
dEQP-VK.api.info.vulkan1p2_limits_validation.ext_inline_uniform_block

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15732>
2022-04-04 09:28:55 +00:00
Iago Toral Quiroga 597560e27c broadcom/compiler: always enable per-quad on spill operations
This ensures that any channels used for helper invocations are
also spilled/filled correctly.

Alternatively, we could recursively track all temps that get
involved in computing values that are then used in explicit
(dfdx,dfdy) or implicit (texture coordinates for mipmap or
anisotropic filtering, etc) derivatives, and only enable
per-quad on these (or disable spilling of any of these
values).

Fixes:
dEQP-VK.graphicsfuzz.cov-dfdx-dfdy-after-nested-loops

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15705>
2022-04-01 08:53:50 +00:00
Jason Ekstrand 688d478045 v3dv/queue: Rework multisync_free
Thix fixes two bugs.  First, we stop leaking in/out fences with
multisync.  Because the in_syncs and out_syncs parameters to
set_multisync were arrays and not pointers to arrays, the caller's
in_syncs and out_syncs pointers never got set and remained NULL so
multisync_free() always sees to NULL pointers and does nothing, leaking
both arrays.  Not sure how this isn't showing up in the dEQP leak check
tests.

Second, the struct drm_v3d_multi_sync was in the scope of the then
clause of the `if (device->pdevice->caps.multisync)` so it goes out of
scope before the ioctl.  This is, effectively, a use-after-free and,
depending on stack allocation details, may result in the multisync
extension struct getting stompped before the ioctl.

Fixes: ff8586c345 ("v3dv: enable multiple semaphores on cl submission")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15512>
2022-03-29 14:38:41 +00:00
Iago Toral Quiroga 7f6ecb8667 v3dv: add reference counting for descriptor set layouts
The spec states that descriptor set layouts can be destroyed almost
at any time:

   "VkDescriptorSetLayout objects may be accessed by commands that
    operate on descriptor sets allocated using that layout, and those
    descriptor sets must not be updated with vkUpdateDescriptorSets
    after the descriptor set layout has been destroyed. Otherwise,
    descriptor set layouts can be destroyed any time they are not in
    use by an API command."

Based on a similar fix for RADV.

Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5893
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15634>
2022-03-29 11:28:39 +00:00
Iago Toral Quiroga ca861bd6f4 v3dv: drop unnecessary memset
We are already zeroing when we allocate the descriptor set layout
memory with vk_object_zalloc.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15634>
2022-03-29 11:28:39 +00:00
Iago Toral Quiroga 591eed30b2 v3dv: fix sampler array addressing in v3dv_descriptor_set_layout
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15634>
2022-03-29 11:28:39 +00:00
Alejandro Piñeiro 81039feda4 broadcom: update language on V3D_DEBUG options
Some typos, and bad grammar.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15593>
2022-03-28 19:21:48 +00:00
Iago Toral Quiroga ce849032a4 broadcom/compiler: allow ldunifa with indirect uniform loads
We handle uniforms by copying them into the uniform stream to be
consumed with ldunif when they have a constant offset. Otherwise
we fallback to general TMU access, which has more latency.

However, just like we did for UBOs and read-only SSBOs, we can
also try to use the unifa mechanism to handle indirect accesses
in certain cases instead of the TMU fallback.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15575>
2022-03-28 10:44:13 +00:00
Iago Toral Quiroga ea3223e7a4 v3dv: implement VK_EXT_inline_uniform_block
Inline uniform blocks store their contents in pool memory rather
than a separate buffer, and are intended to provide a way in which
some platforms may provide more efficient access to the uniform
data, similar to push constants but with more flexible size
constraints.

We implement these in a similar way as push constants: for constant
access we copy the data in the uniform stream (using the new
QUNIFORM_UNIFORM_UBO_*) enums to identify the inline buffer from
which we need to copy and for indirect access we fallback to
regular UBO access.

Because at NIR level there is no distinction between inline and
regular UBOs and the compiler isn't aware of Vulkan descriptor
sets, we use the UBO index on UBO load intrinsics to identify
inline UBOs, just like we do for push constants. Particularly,
we reserve indices 1..MAX_INLINE_UNIFORM_BUFFERS for this,
however, unlike push constants, inline buffers are accessed
through descriptor sets, and therefore we need to make sure
they are located in the first slots of the UBO descriptor map.
This means we store them in the first MAX_INLINE_UNIFORM_BUFFERS
slots of the map, with regular UBOs always coming after these
slots.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15575>
2022-03-28 10:44:13 +00:00
Boris Brezillon 56a2ccf058 v3dv: Stop using VK_OUTARRAY_MAKE()
We're trying to replace VK_OUTARRAY_MAKE() by VK_OUTARRAY_MAKE_TYPED()
so people don't get tempted to use it and make things incompatible with
MSVC (which doesn't support typeof()).

Suggested-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15522>
2022-03-25 11:00:02 +00:00
Jason Ekstrand 19f56e3fc4 v3dv: Drop GetPhysicalDeviceQueueFamilyProperties
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15459>
2022-03-18 11:19:14 -05:00
Iago Toral Quiroga 4f284254e4 v3dv: support importing external semaphores
This was waiting for multisync support in our kernel interface so
we can wait on the actual imported payload of a semaphore rather
than the last job we submitted.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>
2022-03-18 13:17:58 +00:00
Iago Toral Quiroga fa1b10f36d v3dv: lock around noop job submits
Any thread we create may end up creating/submitting at least a
noop job, which is a shared object. Before multisync, this was
an issue only for the creation of the job itself, but with
multisync we can also modify parameters of the noop job
every time it is used (for signaling and serialization
configuration).

This change adds a noop mutex that all threads (main, wait and
master) take before submitting a noop job to ensure concurrent
access is not an issue.

Fixes flakyness observed with multisync with the following test:
dEQP-VK.api.command_buffers.secondary_execute_twice

Reviewed-by: Melissa Wen <mwen@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>
2022-03-18 13:17:58 +00:00
Iago Toral Quiroga daa865fb2c v3dv: fix semaphore wait from CPU job
If a CPU job comes first in a command buffer with a semaphore wait operation
we need to wait on the CPU for the semaphore to be signaled before we process
the job.

We have been doing this with a WaitForIdle operation, but that only works
if the semaphore has been submitted for signaling from the same instance
of the driver. If we have an imported payload from another instance in our
semaphore however, waitForIdle may return too early since the submission
to signal the semaphore may have been submitted by a different instance
of the driver as well, and our wait for idle checks only know about this
instance submissions.

To fix this, we always submit a noop job from our instance that waits on
the semaphores on the GPU and follow up with WaitForIdle to wait for that
to complete.

Fixes test failures and/or assert crashes in:
dEQP-VK.synchronization.cross_instance.*
(when enabling support for semaphore imports)

Reviewed-by: Melissa Wen <mwen@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>
2022-03-18 13:17:58 +00:00
Iago Toral Quiroga 3b8ab8a9ce v3dv: don't signal semaphores/fences from a wait thread
When we have a wait thread we can't ensure that the last job in the last
command buffer will be the one to signal semaphores because in this case
there is no gurantee that jobs from command buffers in the batch will be
submitted to the GPU in order, as those put in a wait thread will be
submitted later when the event wait operation is completed.

Instead, we need to wait for all outstanding wait threads to complete
and only then we should signal any semaphores or fences.

This also fixes a bug where the wait for events was the last job in
the command buffer. In this case, once the event wait is completed
we have no additional jobs to submit and thus would never try to
signal semaphores or fences.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>
2022-03-18 13:17:58 +00:00
Iago Toral Quiroga 03840bfcd1 v3dv: fix temporary imports of semaphores and fences with multisync
This is preparatory work to expose support for importing semaphores, which
was waiting on kernel multisync support.

When we implemented user-space multisync support we didn't handle
temporary fence/semaphore payload imports at all, so we fix that here.

Also, we add a has_temp boolean flag to identify the case where we have
a temporary payload in a fence/sempahore instead of just checking if
temp_sync is not 0. This is necessary to support semaphore imports
(for which we are not exposing support yet) because these need to drop
the temporary payload when they are used as wait semaphores in a submit,
but we can't destroy the underlying temp_sync at that point because it
needs to survive at least until the submit is finished, so instead
we use a flag to tell if we have an active temporary payload or not,
and we simply destroy any temp_sync on a semaphore destroy or any new
import on the same semaphore. We only strictly need this flag for
semaphores because fences drop the temporary payload when they are
reset, which happens in the CPU and can only be done if the GPU is not
using the fence, but we add the same flag for the fence for consistency.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>
2022-03-18 13:17:58 +00:00
Iago Toral Quiroga 5a11a2fb6c v3dv: don't expose image load/store features for linear images
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>
2022-03-18 13:17:58 +00:00
Iago Toral Quiroga 0590ce1362 v3dv: return early on image to buffer blit copies if image is linear
This path uses a shader blit to implement the copy which is only
supported for tiled images (except 1D). While blit_shader() already
checks for this, this path does a lot of heavy lifting to prepare for
the blit_shader call so we rather avoid that if possible when we know
blit_shader won't be able to implement the blit.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>
2022-03-18 13:17:58 +00:00
Iago Toral Quiroga 397f4963ed v3dv: TFU destination must be UIF
We had some code that considered the possibility that the destination
might be linear when configuring TFU jobs, but we never actually allow
for this to happen since we avoid hitting these paths in that case, as
the TFU always produces UIF results. Instead, add an assert when
producing the TFU packet to ensure we are expecting a UIF result.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>
2022-03-18 13:17:58 +00:00
Alejandro Piñeiro e3d905ec39 v3dv/pipeline: use new helper vk_shader_module_to_nir
In addition to use the helper, we also remove some of the lowering we
had at preprocess_nir, as they are called now by the helper.

As we are here we also move the call to nir_lower_sysvals_to_varyings,
that for some reason we were calling it before preprocess_nir.

It is worth to note that with this change we lose the ability to debug
the NIR just after spirv_to_nir using V3D_DEBUG, as now this is done
on vk_spirv_to_nir, and as mentioned that includes several lowerings
now. The workaround to that is to use NIR_DEBUG.

We also needed to change how to check the entrypoint on the broadcom
compiler, checking just if it is an entrypoint, instead of assuming
that the name will be "main".

v2: tweak comment, squash v3dv and compiler change (Iago)

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15449>
2022-03-18 11:05:11 +00:00
Juan A. Suarez Romero 730a294b90 v3dv: implement VK_EXT_line_rasterization
Allow to choose the line rasterization algorithm. It supports
rectangular and Bresenham-style line rasterization.

v2 (Iago):
 - Update documentation.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15407>
2022-03-18 09:38:38 +00:00
Juan A. Suarez Romero 22759e9174 v3dv: add subpixel precision definition
Move number of bits for subpixel precision in rasterizer to a define.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15407>
2022-03-18 09:38:38 +00:00
Juan A. Suarez Romero b53dda6da8 broadcom: add line rasterization mode to packet definition
Add the supported line rasterization modes as enums in the XML packet
definition.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15407>
2022-03-18 09:38:38 +00:00
Juan A. Suarez Romero 102ae4bdc8 broadcom: add on-disk cache debug option
Add support for`V3D_DEBUG=cache`, which prints on-disk cache events.

v2:
 - Use same debug format for v3d and v3dv (Alejandro)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15380>
2022-03-18 08:58:01 +00:00
Iago Toral Quiroga 5c1302f47c v3dv: expose VK_EXT_image_drm_format_modifier
This has been implemented for a while but we could not expose it on
Vulkan 1.0 because the extension declares a dependency on
VK_KHR_sampler_ycbcr_conversion, which we don't implement, and
CTS would complain.

On Vulkan 1.1 however, VK_KHR_sampler_ycbcr_conversion was promoted
to core as an optional feature, and this is enough for the the
dependency to be satisfied, even if the feature is not supported,
meaning that we can now expose the extension.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15426>
2022-03-18 06:42:06 +00:00
Juan A. Suarez Romero c432bfe74b broadcom/ci: Update flake list
Some of the tests marked as flake didn't show up as flakes for a long
time (more than 3 months). So likely they are already fixed.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15411>
2022-03-17 13:56:41 +00:00
Juan A. Suarez Romero dfb6438392 v3dv: change MESA_GLSL_CACHE envvar reference
This was renamed to MESA_SHADER_CACHE.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15390>
2022-03-17 11:16:45 +01:00
Juan A. Suarez Romero 000b935c50 v3dv/ci: add test to skip list
Add test that it is a timeout in the CI, but otherwise it passes.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15374>
2022-03-14 18:55:13 +00:00
Tapani Pälli adea096029 ci: update various ci result files
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12936>
2022-03-11 09:58:28 +00:00
Iago Toral Quiroga 49b5431197 broadcom/compiler: remove unused functions
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15302>
2022-03-10 07:25:37 +00:00
Iago Toral Quiroga 44feff93c2 broadcom/compiler: don't always assign r5 if available
Instead, only favor assigning r5 if we have first decided to
assign an accumulator. This helps with assining r5 to short
lived uniforms, favoring accumulator rotation to facilitate
QPU merges.

total instructions in shared programs: 12656164 -> 12628339 (-0.22%)
instructions in affected programs: 5368373 -> 5340548 (-0.52%)
helped: 17420
HURT: 9996

total uniforms in shared programs: 3704776 -> 3704863 (<.01%)
uniforms in affected programs: 12247 -> 12334 (0.71%)
helped: 23
HURT: 78

total max-temps in shared programs: 2153505 -> 2152684 (-0.04%)
max-temps in affected programs: 26468 -> 25647 (-3.10%)
helped: 569
HURT: 328

total fills in shared programs: 4656 -> 4657 (0.02%)
fills in affected programs: 43 -> 44 (2.33%)
helped: 0
HURT: 1

total sfu-stalls in shared programs: 34728 -> 34403 (-0.94%)
sfu-stalls in affected programs: 3411 -> 3086 (-9.53%)
helped: 842
HURT: 534

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga 77f58b46d9 broadcom/compiler: add comment on why we don't use r5 with ldunifa
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga 5b140428b0 broadcom/compiler: adjust register threshold for 2-thread compiles
We have twice the registers in this case so it makes sense to double
this as well. While this causes slight regressions in shader-db
stats (due to additional register pressure), it helps us hide latency
of memory reads better on 2-thread compiles, where the thread switch
mechanism will be less effective. This shows a ~3% performance
improvement on the UE4 SunTemple demo.

total instructions in shared programs: 12642413 -> 12656164 (0.11%)
instructions in affected programs: 2272652 -> 2286403 (0.61%)
helped: 2924
HURT: 3389

total uniforms in shared programs: 3703861 -> 3704776 (0.02%)
uniforms in affected programs: 213729 -> 214644 (0.43%)
helped: 823
HURT: 1272

total max-temps in shared programs: 2150686 -> 2153505 (0.13%)
max-temps in affected programs: 191332 -> 194151 (1.47%)
helped: 1900
HURT: 1891

total spills in shared programs: 3255 -> 3274 (0.58%)
spills in affected programs: 166 -> 185 (11.45%)
helped: 3
HURT: 6

total fills in shared programs: 4630 -> 4656 (0.56%)
fills in affected programs: 367 -> 393 (7.08%)
helped: 7
HURT: 15

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga a35b47a0b1 broadcom/compiler: add a strategy to disable scheduling of general TMU reads
This can add quite a bit of register pressure so it makes sense to disable it
to prevent us from dropping to 2 threads or increase spills:

total instructions in shared programs: 12672813 -> 12642413 (-0.24%)
instructions in affected programs: 256721 -> 226321 (-11.84%)
helped: 719
HURT: 77

total threads in shared programs: 415534 -> 416322 (0.19%)
threads in affected programs: 788 -> 1576 (100.00%)
helped: 394
HURT: 0

total uniforms in shared programs: 3711370 -> 3703861 (-0.20%)
uniforms in affected programs: 28859 -> 21350 (-26.02%)
helped: 204
HURT: 455

total max-temps in shared programs: 2159439 -> 2150686 (-0.41%)
max-temps in affected programs: 32945 -> 24192 (-26.57%)
helped: 585
HURT: 47

total spills in shared programs: 5966 -> 3255 (-45.44%)
spills in affected programs: 2933 -> 222 (-92.43%)
helped: 192
HURT: 4

total fills in shared programs: 9328 -> 4630 (-50.36%)
fills in affected programs: 5184 -> 486 (-90.62%)
helped: 196
HURT: 0

Compared to the stats before adding scheduling of non-filtered
memory reads we see we that we have now gotten back all that was
lost and then some:

total instructions in shared programs: 12663186 -> 12642413 (-0.16%)
instructions in affected programs: 2051803 -> 2031030 (-1.01%)
helped: 4885
HURT: 3338

total threads in shared programs: 415870 -> 416322 (0.11%)
threads in affected programs: 896 -> 1348 (50.45%)
helped: 300
HURT: 74

total uniforms in shared programs: 3711629 -> 3703861 (-0.21%)
uniforms in affected programs: 158766 -> 150998 (-4.89%)
helped: 1973
HURT: 499

total max-temps in shared programs: 2138857 -> 2150686 (0.55%)
max-temps in affected programs: 177920 -> 189749 (6.65%)
helped: 2666
HURT: 2035

total spills in shared programs: 3860 -> 3255 (-15.67%)
spills in affected programs: 2653 -> 2048 (-22.80%)
helped: 77
HURT: 21

total fills in shared programs: 5573 -> 4630 (-16.92%)
fills in affected programs: 3839 -> 2896 (-24.56%)
helped: 81
HURT: 15

total sfu-stalls in shared programs: 39583 -> 38154 (-3.61%)
sfu-stalls in affected programs: 8993 -> 7564 (-15.89%)
helped: 1808
HURT: 1038

total nops in shared programs: 324894 -> 323685 (-0.37%)
nops in affected programs: 30362 -> 29153 (-3.98%)
helped: 2513
HURT: 2077

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga f783bd0d2a broadcom/compiler: define v3d-specific delays for NIR instructions
We do a few changes over NIR's defaults:

1. Lower delay for texture reads. Empirically, we don't observe any
   benefits with delays over 50 and since this delay value is still
   used by the scheduler in the "favor register pressure" case it is
   benefitial to avoid overestimating it too much.

2. Adjust delay for non-filtered TMU reads to the delay selected for
   texture reads.

3. In our case, UBO reads from dynamically uniform addresses don't
   use the TMU and have a latency of 1 instruction in the best case
   scenario or 4 at worse, so we go with 1 so we don't try to move
   this early.

This helps us get back some of what we lost when updating the
default scheduler configuration to add a delay for non-filtered
memory reads:

total instructions in shared programs: 13126587 -> 12671765 (-3.46%)
instructions in affected programs: 3764097 -> 3309275 (-12.08%)
helped: 14664
HURT: 4244

total threads in shared programs: 407208 -> 415522 (2.04%)
threads in affected programs: 8716 -> 17030 (95.39%)
helped: 4224
HURT: 67

total uniforms in shared programs: 3812698 -> 3711224 (-2.66%)
uniforms in affected programs: 335170 -> 233696 (-30.28%)
helped: 2816
HURT: 3551

total max-temps in shared programs: 2318430 -> 2159345 (-6.86%)
max-temps in affected programs: 539991 -> 380906 (-29.46%)
helped: 13173
HURT: 1440

total spills in shared programs: 49086 -> 5966 (-87.85%)
spills in affected programs: 48306 -> 5186 (-89.26%)
helped: 1655
HURT: 28

total fills in shared programs: 55810 -> 9328 (-83.29%)
fills in affected programs: 54821 -> 8339 (-84.79%)
helped: 1659
HURT: 22

LOST:   0
GAINED: 3

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga e7a4e97076 nir/schedule: use larger delay for non-filtered memory reads
This has been pending for a long time. It is not very consistent to
add a significant delay for textures and not do it for UBOs, etc
The reason we have not been doing this so far is the accumulated effect
on register pressure for V3D as shown by shader-db results below, but
from the point of view of a generic scheduler it makes sense to do this.

Later patches will address V3D specific issues with register pressure
derived from this by letting the driver control its instruction delay
settings.

total instructions in shared programs: 12662138 -> 13126587 (3.67%)
instructions in affected programs: 1813091 -> 2277540 (25.62%)
helped: 2410
HURT: 10499

total threads in shared programs: 415858 -> 407208 (-2.08%)
threads in affected programs: 17348 -> 8698 (-49.86%)
helped: 8
HURT: 4333

total uniforms in shared programs: 3711483 -> 3812698 (2.73%)
uniforms in affected programs: 128012 -> 229227 (79.07%)
helped: 3474
HURT: 2143

total max-temps in shared programs: 2138763 -> 2318430 (8.40%)
max-temps in affected programs: 318780 -> 498447 (56.36%)
helped: 588
HURT: 11997

total spills in shared programs: 3860 -> 49086 (1171.66%)
spills in affected programs: 709 -> 45935 (6378.84%)
helped: 23
HURT: 1595

total fills in shared programs: 5573 -> 55810 (901.44%)
fills in affected programs: 1067 -> 51304 (4708.25%)
helped: 23
HURT: 1595

LOST:   3
GAINED: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:04 +00:00
Iago Toral Quiroga 9ef499b315 broadcom/compiler: stop moving UBO loads before NIR scheduling
This doesn't have any significant impact shader-db stats and would
reduce our capacity to hide latency from the loads, so it is probably
undesirable:

total instructions in shared programs: 12663189 -> 12663186 (<.01%)
instructions in affected programs: 4222 -> 4219 (-0.07%)
helped: 9
HURT: 4

total uniforms in shared programs: 3711624 -> 3711629 (<.01%)
uniforms in affected programs: 186 -> 191 (2.69%)
helped: 0
HURT: 2

total max-temps in shared programs: 2138822 -> 2138857 (<.01%)
max-temps in affected programs: 569 -> 604 (6.15%)
helped: 1
HURT: 9

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>
2022-03-09 15:53:03 +00:00
Timur Kristóf 64acec0ef9 nir: Fix lowering terminology of compute system values: "from"->"to".
This is to match other NIR terminology.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>
2022-03-08 17:36:31 +00:00
Iago Toral Quiroga f761f8fd9e broadcom/compiler: simplify node/temp translation during register allocation
Now that we don't sort our nodes we can arrange them so we can
easily translate between nodes and temps without a mapping table,
just applying an offset.

To do this we have a single array of nodes where twe put first the nodes
for accumulators and then the nodes for temps. With this setup we can
ensure that for any given temp T, its node is always T + ACC_COUNT.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga 871b0a7f6a broadcom/compiler: don't sort nodes for register allocation
Nodes are allocated in order to registers so initially sorting
was used to ensure that nodes with smaller life ranges would
be assigned first and therefore be more likely to get
accumulators.

However, since d81a6e5f1d now we don't rely on order to make
decisions about accumulators and instead we make policy decisions
based on actual liveness, so sorting is no longer strictly
relevant to this decision.

Furthermore, we are not re-sorting nodes after each spill either,
since that would probably require that we rebuild the interference
graph after each spill (the graph identifies nodes by their index).

Shader-db results show a significant improvement in instruction
counts, due to more optimal accumulator assignments. The reason for
this is that we use a round-robin policy for choosing the next
accumulator to assign. The idea behind this is preventing nearby
temps to be assigned to the same accumulator so that QPU scheduling
is more flexible, but if we  sort our nodes, we are basically not
assigning temps in program order any more and the round-robin policy
becomes less effective:

total instructions in shared programs: 13000420 -> 12663189 (-2.59%)
instructions in affected programs: 11791267 -> 11454036 (-2.86%)
helped: 62890
HURT: 19987

total threads in shared programs: 415874 -> 415870 (<.01%)
threads in affected programs: 20 -> 16 (-20.00%)
helped: 2
HURT: 4

total uniforms in shared programs: 3711652 -> 3711624 (<.01%)
uniforms in affected programs: 43430 -> 43402 (-0.06%)
helped: 134
HURT: 173

total max-temps in shared programs: 2144876 -> 2138822 (-0.28%)
max-temps in affected programs: 123334 -> 117280 (-4.91%)
helped: 4112
HURT: 1195

total spills in shared programs: 3870 -> 3860 (-0.26%)
spills in affected programs: 1013 -> 1003 (-0.99%)
helped: 14
HURT: 12

total fills in shared programs: 5560 -> 5573 (0.23%)
fills in affected programs: 1765 -> 1778 (0.74%)
helped: 14
HURT: 17

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga 4483cd24af broadcom/compiler: sink uniform loads
total instructions in shared programs: 13014428 -> 13000420 (-0.11%)
instructions in affected programs: 743624 -> 729616 (-1.88%)
helped: 1392
HURT: 611

total threads in shared programs: 415858 -> 415874 (<.01%)
threads in affected programs: 16 -> 32 (100.00%)
helped: 8
HURT: 0

total uniforms in shared programs: 3720410 -> 3711652 (-0.24%)
uniforms in affected programs: 113442 -> 104684 (-7.72%)
helped: 635
HURT: 29

total max-temps in shared programs: 2154268 -> 2144876 (-0.44%)
max-temps in affected programs: 61279 -> 51887 (-15.33%)
helped: 1124
HURT: 187

total spills in shared programs: 4002 -> 3870 (-3.30%)
spills in affected programs: 265 -> 133 (-49.81%)
helped: 6
HURT: 0

total fills in shared programs: 5788 -> 5560 (-3.94%)
fills in affected programs: 603 -> 375 (-37.81%)
helped: 6
HURT: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga e228642cf5 broadcom/compiler: move constants before their first user
For us they are basically uniforms too so we want to make their
lifespans short to facilitate allocating them to accumulators.

total instructions in shared programs: 13043585 -> 13015385 (-0.22%)
instructions in affected programs: 8326040 -> 8297840 (-0.34%)
helped: 24939
HURT: 19894

total threads in shared programs: 415860 -> 415858 (<.01%)
threads in affected programs: 4 -> 2 (-50.00%)
helped: 0
HURT: 1

total uniforms in shared programs: 3721953 -> 3720451 (-0.04%)
uniforms in affected programs: 96134 -> 94632 (-1.56%)
helped: 744
HURT: 435

total max-temps in shared programs: 2173431 -> 2154260 (-0.88%)
max-temps in affected programs: 264598 -> 245427 (-7.25%)
helped: 10858
HURT: 841

total spills in shared programs: 4005 -> 4010 (0.12%)
spills in affected programs: 700 -> 705 (0.71%)
helped: 5
HURT: 10

total fills in shared programs: 5801 -> 5817 (0.28%)
fills in affected programs: 1346 -> 1362 (1.19%)
helped: 6
HURT: 11

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga a1998a9f43 broadcom/compiler: disallow TMU spills if max tmu spills is 0
If we are compiling with a strategy that does not allow TMU spills
we should not allow spilling anything that is not a uniform.
Otherwise the RA cost/benefit algorithm may choose to spill a
temp that is not uniform and that will cause us to immediately
fail the strategy and fallback to the next one, even if we
could've instead chosen to spill more uniforms to compile the
program successfully with that strategy.

Some relevant shader-db stats:

total instructions in shared programs: 13040711 -> 13043585 (0.02%)
instructions in affected programs: 234238 -> 237112 (1.23%)
helped: 73
HURT: 172

total threads in shared programs: 415664 -> 415860 (0.05%)
threads in affected programs: 196 -> 392 (100.00%)
helped: 98
HURT: 0

total uniforms in shared programs: 3717266 -> 3721953 (0.13%)
uniforms in affected programs: 12831 -> 17518 (36.53%)
helped: 6
HURT: 100

total max-temps in shared programs: 2174177 -> 2173431 (-0.03%)
max-temps in affected programs: 4597 -> 3851 (-16.23%)
helped: 79
HURT: 21

total spills in shared programs: 4010 -> 4005 (-0.12%)
spills in affected programs: 55 -> 50 (-9.09%)
helped: 5
HURT: 0

total fills in shared programs: 5820 -> 5801 (-0.33%)
fills in affected programs: 186 -> 167 (-10.22%)
helped: 5
HURT: 0

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Iago Toral Quiroga cbb4d0dded broadcom/compiler: increase cost of TMU spills to 10
Our cost was 5 which matches the number of instructions we have to
add for a TMU spill (a fill is 4 instructions).

Uniform spills on the other hand add an extra instruction for each
fill and remove one instruction for the spill itself. These have
a cost of 1.

Therefore, if we have a single spill+fill, we end up with +9
instructions if it is a TMU spill and +0 instructions with a uniform
spill, so making the former only 5 times more costly is probably
not a good idea, and this is without even considering the added
latency of the TMU accesses.

Relevant shader-db changes show this causes as a marginal instruction
count increase in a few shaders but better thread counts and lower
TMU spilling overall:

total instructions in shared programs: 13037315 -> 13040711 (0.03%)
instructions in affected programs: 370106 -> 373502 (0.92%)
helped: 187
HURT: 321

total threads in shared programs: 415090 -> 415664 (0.14%)
threads in affected programs: 574 -> 1148 (100.00%)
helped: 287
HURT: 0

total uniforms in shared programs: 3706674 -> 3717266 (0.29%)
uniforms in affected programs: 63075 -> 73667 (16.79%)
helped: 40
HURT: 395

total max-temps in shared programs: 2176080 -> 2174177 (-0.09%)
max-temps in affected programs: 15838 -> 13935 (-12.02%)
helped: 316
HURT: 34

total spills in shared programs: 4247 -> 4010 (-5.58%)
spills in affected programs: 2599 -> 2362 (-9.12%)
helped: 107
HURT: 14

total fills in shared programs: 6121 -> 5820 (-4.92%)
fills in affected programs: 3622 -> 3321 (-8.31%)
helped: 108
HURT: 13

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>
2022-03-02 08:09:11 +00:00
Guilherme Gallo d1c6185b5a ci: skqp: Add Vulkan support for a630_skqp job
This commit adds support for Vulkan backend on a630_skqp job.

= Needed changes
- Needed to install libvulkan-dev package on system
- Refactored the way the available skqp reports are printed
  tested in development builds with skia tools

Piglit expectations had to be updated in various drivers due to !14750 not
having bumped the tags when it tried to uprev.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14686>
2022-02-25 05:50:06 +00:00
Iago Toral Quiroga cf99584f51 broadcom/compiler: move uniforms right before their first use after scheduling
On V3D the quality of the code we generate is significantly affected by
how we decide to assign accumulators during register allocation, which
is determined by liveness, favoring short-lived temps.

There are many shaders that end up doing a whole lot of uniform loads
first, and using them later, which is very inconvenient for our register
allocation process because this increases uniform liveness and causes
us to use accumulators less efficientely, leading to significant churn.

To fix this, we move uniforms right before their first use in the same
block, but we need to do this after NIR scheduling, which means we are
doing it in non-SSA form, since the scheduler has a tendency to undo
this optimization and it is not easy to modify it to avoid it, since it
works in more abstract terms, using instruction dependencies, estimated
register pressure and instruction delay information to do its work,
which are very different concepts.

total instructions in shared programs: 13316738 -> 13033613 (-2.13%)
instructions in affected programs: 10389172 -> 10106047 (-2.73%)
helped: 55442
HURT: 16144

total threads in shared programs: 413722 -> 415048 (0.32%)
threads in affected programs: 1428 -> 2754 (92.86%)
helped: 680
HURT: 17

total loops in shared programs: 1716 -> 1690 (-1.52%)
loops in affected programs: 26 -> 0
helped: 26
HURT: 0

total uniforms in shared programs: 3704313 -> 3705181 (0.02%)
uniforms in affected programs: 687730 -> 688598 (0.13%)
helped: 2920
HURT: 7384

total max-temps in shared programs: 2364785 -> 2175190 (-8.02%)
max-temps in affected programs: 1215387 -> 1025792 (-15.60%)
helped: 49667
HURT: 1556

total spills in shared programs: 4241 -> 4248 (0.17%)
spills in affected programs: 642 -> 649 (1.09%)
helped: 11
HURT: 19

total fills in shared programs: 6115 -> 6125 (0.16%)
fills in affected programs: 1276 -> 1286 (0.78%)
helped: 11
HURT: 21

total sfu-stalls in shared programs: 34381 -> 36578 (6.39%)
sfu-stalls in affected programs: 16055 -> 18252 (13.68%)
helped: 3647
HURT: 5206

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>
2022-02-24 11:36:00 +00:00
Iago Toral Quiroga c4a78a2d2a broadcom/compiler: fix register class patching for postponed spills
If we have a postponed spill, the temp we create at ip is no longer
the spilled temp and therefore is affected by the thrsw injection.

Fixes corruption in the additive blending animation demo from
Three.js.

Fixes: f3c3228522 ('broadcom/compiler: do not rebuild the interference graph after each spill')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15112>
2022-02-22 11:17:10 +00:00
Erik Faye-Lund 25a37fabb7 vulkan/wsi: untangle buffer-images from prime
Not all Vulkan implementations allows rendering to linear images, so in
order to support scanning out from these on Windows we might have to copy
through a buffer like we do in the PRIME path.

To avoid reimplementing the same, let's instead generalize the code a
bit so it doesn't have to specfy any PRIME-specific details.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12210>
2022-02-22 10:04:34 +00:00
Iago Toral Quiroga a4b164b57b broadcom/compiler: only patch temps that existed before the current spill
When we spill we add new temps. We should be careful not to access
liveness for these until we have re-computed it after all spills and
fill for that the spilled temp have been processed so as to avoid
out-of-bounds accesses to the c->temp_start and c->temp_end arrays.

This fixes a crash in a Three.js demo when we try to patch register
classes after a TMU spill that was caused because we would incorrectly
try to patch the same temps we had just added for the spill itself,
which is not only unnecessary but also incorrect since we these temps
would not have liveness information available yet and thus would
cause out of bounds accesses.

Fixes: f3c3228522 ('broadcom/compiler: do not rebuild the interference graph after each spill')
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15107>
2022-02-22 06:41:51 +00:00
Jose Maria Casanova Crespo 90f966e05f v3dv/v3d: Fix copyright holder to Raspberry Pi Ltd
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15057>
2022-02-18 11:50:07 +01:00
Juan A. Suarez Romero bfdb1064c5 vc4/ci: make piglit test mandatory
Make piglit test jobs to run always, as piglit testsuite offers more
coverage for the VC4 driver.

On the other hand, make the EGL testing manually, as we don't have
enough devices to execute all the tests fast enough.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15045>
2022-02-18 09:02:55 +00:00
Iago Toral Quiroga 750eeecf4e broadcom/compiler: document that spill_base is used for spills and scratch
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>
2022-02-18 08:38:19 +00:00
Iago Toral Quiroga 8883975209 broadcom/compiler: drop spill_count and add spilling boolean
We added spill_count to handle uniform batch spills, which we no longer do.
What we want now is a way to know if we are spilling registers.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>
2022-02-18 08:38:19 +00:00
Iago Toral Quiroga f3c3228522 broadcom/compiler: do not rebuild the interference graph after each spill
Instead, we only recompute liveness and we add new nodes and
interferences to the graph manually (we also need to patch
register classes in some cases).

To assist in this process, we also add an ip counter to our
instructions that we also recompute after each spill, which we use
to identify registers that cross thrsw boundries introduced with
TMU spills and fills and adjust their register classes accordingly
(removing their capacity to use accumulators).

This significantly reduces the CPU cost of spills. Using
shaders/closed/gputest/piano/7.shader_test as reference:

Compile time up to the first successful compile strategy in main is
~24s and with this change it is ~11s. With this speed up, we can now
try all 2-thread compile strategies (including the fallback scheduler)
in only ~15s.

A full shader-db run results in:
Total CPU time (seconds): 9904.67 -> 9087.98 (-8.25%)

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>
2022-02-18 08:38:19 +00:00
Iago Toral Quiroga 59caaa7fb3 broadcom/compiler: reset spill/fill counts after lowering thread count.
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>
2022-02-18 08:38:19 +00:00
Iago Toral Quiroga 92d819aaa0 broadcom/compiler: fix end of TMU sequence check
We may be pipelining TMU writes and reads, in which case we can
see both TMUWT and LDTMU at the end of a TMU sequence, so we should
not assume that a TMUWT always terminates a sequence.

Also, we had a bug where we were using inst instead of scan_inst
to check if we find another TMUWT after the curent instruction.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>
2022-02-18 08:38:19 +00:00
Iago Toral Quiroga 40e091267d broadcom/compiler: define max number of tmu spills for compile strategies
Instead of whether they are allowed to spill or not. This is more flexible.
Also, while we are not currently enabling spilling on any 4-thread strategies,
should we do that in the future, always prefer a 4-thread compile.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>
2022-02-18 08:38:19 +00:00
Iago Toral Quiroga 919aedbfec broadcom/compiler: choose compile strategy with lowest spilling
Until now we would only allow spilling as a last resort in the
last 2 strategies, however, it is possible that in some cases
earlier strategies may produce less spills if we allowed spilling
on them.

Likewise, the fallback scheduler can sometimes produce less spills
than 2 threads with optimizations disabled.

With this change, we start allowing all our 2-thread strategies to
spill, and instead of choosing the first strategy that is successful,
we choose the one that doesn't spill or the one with the least amount
of spilling.

It should be noted that this may incur in a significant increase
of compile times. We will address this in a follow-up patch.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>
2022-02-18 08:38:19 +00:00
Jason Ekstrand 05e9e7767d vulkan: Rename vk_image_view::format to view_format
When I originally added vk_image_view, I was overly clever when it came
to the format field.  I decided to make it only contain the bits of the
format contained in the selected aspects.  However, this is confusing
(not generally a good thing) and it's also not always what you want.
The Vulkan 1.3.204 spec says:

    "When using an image view of a depth/stencil image to populate a
    descriptor set (e.g. for sampling in the shader, or for use as an
    input attachment), the aspectMask must only include one bit, which
    selects whether the image view is used for depth reads (i.e. using a
    floating-point sampler or input attachment in the shader) or stencil
    reads (i.e. using an unsigned integer sampler or input attachment in
    the shader). When an image view of a depth/stencil image is used as
    a depth/stencil framebuffer attachment, the aspectMask is ignored
    and both depth and stencil image subresources are used."

So, while the restricted format makes sense for texturing, it doesn't
for when the image is being used as an attachment.  What we probably
actually want is both versions of the format.  We'll call the one given
by the VkImageViewCreateInfo vk_image_view::format and the restricted
one vk_image_view::view_format.

This is just the first commit which switches format to view_format so
the compiler will make sure we get them all.  The next commit will
re-add vk_image_view::format but this time unmodified.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15007>
2022-02-16 00:14:50 +00:00
Juan A. Suarez Romero 801d33b2d9 vc4/ci: update failing piglit tests
See https://gitlab.freedesktop.org/mesa/mesa/-/issues/6038.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15012>
2022-02-15 09:21:23 +01:00
Jason Ekstrand 29393a40ee v3dv: Use the common command pool implementation
The only interesting information stored in v3dv_cmd_pool is the list of
command buffers and that's already tracked by vk_command_pool.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>
2022-02-11 08:06:25 +00:00
Jason Ekstrand fcad979b72 v3dv: Don't use vk_alloc/free2 for command buffers
The pool will always have a valid allocator.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>
2022-02-11 08:06:25 +00:00
Jason Ekstrand bda4c4f6b6 vulkan: Take a vk_command_pool in vk_command_buffer_init()
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>
2022-02-11 08:06:25 +00:00
Jason Ekstrand 6fb9e4e7ff v3dv: Use vk_command_pool
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>
2022-02-11 08:06:24 +00:00
Louis-Francis Ratté-Boulianne 5e263cc324 vulkan/runtime: Add a level field to vk_command_buffer
Looks like 3 implementations already have that field in their private
command_buffer struct, and having it at the vk_command_buffer opens the
door for generic (but suboptimal) secondary command buffer support.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>
2022-02-11 08:06:24 +00:00
Juan A. Suarez Romero 7955df28a6 v3dv/ci: Update failure list
Add more failing tests to the expected failures.

These are obtained after executing the full Vulkan CTS.

v2:
 - Add comments in the tests (Alejandro)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14948>
2022-02-09 15:46:23 +00:00
Melissa Wen 70a219d4a3 broadcom/simulator: enable multisync in the simulator
Use drmSyncobjSignal to signal out_syncobjs when a GPU job submission
ends in the simulator. With this, we can enable multisync support in the
simulator and keep the multisync approach to process fence by submitting
a serialized no-op job that adds the fence to the array of out syncobjs,
i.e.  syncobjs to be signaled in the kernel when a job completes (job
post deps).

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14768>
2022-02-09 07:22:42 +00:00
Emma Anholt ef112db311 ci: Bump VK-GL-CTS to 1.3.1.0.
The main thing is VK 1.3 testing, but also includes test bugfixes.  The
1.3 CTS required an uprev of deqp-runner to handle a new style of test
output, and that deqp-runner brings in some neat new features, too (piglit
in your deqp-runner suite, and extension list checking).

A bunch of VK tests got renamed, so I replaced panvk's custom test list
with simple include filters on the main test list.

Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (panvk)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14920>
2022-02-08 22:16:36 +00:00
Emma Anholt 3f34251495 ci/broadcom: Remove unused v3dv xfails file.
It's actually in broadcom-rpi4-fails.txt.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14920>
2022-02-08 22:16:35 +00:00
Dylan Baker 2f916f2be6 meson: add support for `meson devenv` with vulkan
Meson devenv is a feature added in meson 0.58 (thus the features is
version guarded) that allows creating a shell environment with
environment variables automatically setup for running the project inside
the build dir. Some variables (such as LD_LIBRARY_PATH and PATH) are set
automatically, others must be added by the project.

For vulkan is is relativley simple, we create a new, uninstalled, icd
file for each driver and set the VK_ICD_FILENAMES variable
appropriately. This can be used with:

```sh
meson devenv -C $builddir
```

then, vulkan applications will automatically use the uninstall vulkan
driver, no need to install.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14826>
2022-02-04 09:08:47 -08:00
Emma Anholt a177f0de8f ci: Uprev vulkan-cts to 1.2.8.0
This brings in some interesting new vulkan tests and fixes for the
spurious KHR-GL TF failures.  Also, reduces the runtime of
dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.36 so that it
should stop timing out.

Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13779>
2022-02-03 22:41:23 +00:00
Alejandro Piñeiro 5d8c659678 v3d/drm-shim: remove drm-shim driver
After starting to use a new version of the simulator, it got
outdated.

We made some initial effort to update it, but it was not
working. Taking into account that no one is using it, it is better to
just remove it.

We keep the noop drm drivers, as they could have some value for
developers that doesn't have access to the v3dv3 simulator.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14682>
2022-02-03 09:53:29 +00:00
Iago Toral Quiroga 7561ea8fa1 broadcom/compiler: allow ldunifa with read-only SSBOs
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14830>
2022-02-03 07:35:07 +00:00
Iago Toral Quiroga 0a8449b07c broadcom/compiler: fix offset alignment for ldunifa when skipping
The intention was to align the address to 4 bytes (32-bit), not
16 bytes.

Fixes: bdb6201ea1 ("broadcom/compiler: use ldunifa with unaligned constant offset")

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14830>
2022-02-03 07:35:07 +00:00
Iago Toral Quiroga ce99b1a746 v3dv: don't submit noop job if there is nothing to wait on or signal
Also, do not unconditionally flag signaling for submits without any
command buffers.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14802>
2022-02-01 07:28:46 +00:00
Melissa Wen db9098f2ef v3dv: move sems_info from event_wait job to wait_thread info
Semaphores info was stored as an info of event_wait cpu jobs and this
leads to mem leak when the same event_wait job in the same cmd buffer
batch was submitted more than once. As a result,
`dEQP-VK.api.command_buffers.record_simul_use_primary` fails due to a
double-free of sems_info.

In this patch, we no longer use v3dv_event_wait_cpu_job_info to store
semaphores from a submit info, since semaphores is related to a queue
submission and not to the event_wait job type. If we spawn a wait_thread,
we copy semaphores to an auxiliary struct (v3dv_wait_thread_info) that
will be used in wait_thread to get job and semaphores information. When
the spawned thread finishes, it releases the related
v3dv_wait_thread_info and the semaphores copy as well.

Fixes: d5bd18fb ("v3dv: store wait semaphores in event_wait_cpu_job_info")

Suggested-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Melissa Wen <mwen@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14736>
2022-01-31 23:01:54 +00:00
Thomas H.P. Andersen 430b1157a1 broadcom: drop unused functions
Fixes a clang warning about unused static inlined functions

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14790>
2022-01-31 16:10:31 +00:00
Iago Toral Quiroga 5974949c0d v3dv: expose VK_KHR_depth_stencil_resolve
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14752>
2022-01-28 12:25:43 +00:00
Iago Toral Quiroga 668653f830 v3dv: fallback to blit resolve if render area is not aligned to tile boundaries
Just as with all other TLB operations, we can only use the TLB if the render
area is aligned to tile boundaries. If it is not, then the operation would
overwrite pixels outside the render area, which is not allowed.

In this case, we can't even emit a previous TLB load to fix this because the
TLB has the multisampled attachment, not the resolve attachment, which is
just a destination buffer for the tile store.

Because the condition for tile alignment has to be determined for each
subpass, we handle this by storing this information in the attachment
state of the command buffer with the start of each subpass. We store
whether the attachment is to be resolved and whether it can use the
TLB (considering tile alignment restrictions).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14752>
2022-01-28 12:25:43 +00:00
Iago Toral Quiroga 7f87a1256e v3dv: support resolving depth/stencil attachments
This implements the bulk of VK_KHR_depth_stencil_resolve

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14752>
2022-01-28 12:25:43 +00:00
Vinson Lee a97ec3eb13 v3dv: Add missing unlocks on errors.
Fix defects reported by Coverity Scan.

Missing unlock (LOCK)
missing_unlock: Returning without unlocking.

Fixes: a7052dcf2c ("v3dv: enable multiple semaphores for csd job")
Fixes: ad09e50129 ("v3dv: enable multiple semaphores for tfu job")
Fixes: ff8586c345 ("v3dv: enable multiple semaphores on cl submission")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14663>
2022-01-28 04:15:24 +00:00
Iago Toral Quiroga 764c8867b0 v3dv: document why we don't expose VK_EXT_scalar_block_layout
And since this is an optional feature in Vulkan 1.2, fill in the
corresponding feature query while we are at it.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14731>
2022-01-27 07:34:19 +00:00
Iago Toral Quiroga 06220a28e7 v3dv: rework Vulkan 1.2 feature queries
Fill them into a VkPhysicalDeviceVulkan12Features struct like we
do for Vulkan 1.1, and then read them from there.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14731>
2022-01-27 07:34:19 +00:00
Iago Toral Quiroga 692e0dfe27 v3dv: implement VK_KHR_imageless_framebuffer
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14704>
2022-01-27 07:11:20 +00:00
Iago Toral Quiroga 2ee9487ad7 v3dv: drop signature of undefined function
This is a left over from when we added multi-version support in the
driver, where we turned this helper into a versioned scheme.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14704>
2022-01-27 07:11:20 +00:00
Emma Anholt d041630a37 ci/llvmpipe,softpipe: Switch piglit testing to piglit-runner.
The new runner reduces the runtime by about 1/3 thanks to using rust
instead of python, and includes automatic flake handling so you don't just
have to skip flaky tests.  The wrapper script also includes IRC flake
reporting (so one can track and update the flakes list to improve CI
reliability), always uploading results to CI for review (so you can
diagnose flakes and look at timings), has a prettier regressions report
and a helpful timing report, and is the same as what's used by all the HW
runners as well.

The downside is that by dropping the massive list of skips, you no longer
get flagged if Mesa refactors end up accidentally disabling extensions and
thus making tests skip.  For that, I've started on
https://gitlab.freedesktop.org/anholt/deqp-runner/-/merge_requests/33 so
that hardware drivers get extension checking coverage too.

Thanks to the perf improvement, we get to drop one of the jobs for
llvmpipe.

xfail lists were mostly sed-jobs from the prior expectations lists.  The
exceptions to that you'll find in the form of whitespace around the
affected test group (usually changes of capitalization or
special-characters), or an explanation for the more interesting changes
(which thankfully we can now record in the xfails lists!).

Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14604>
2022-01-27 04:37:16 +00:00
Iago Toral Quiroga f666f70935 v3dv: support VK_KHR_8bit_storage
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 5cec893384 broadcom/compiler: update comment on load_uniform fast-path
The comment for 16-bit applies to 8-bit uniforms as well.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 296fde31aa broadcom/compiler: allow vectorization to larger scalar type
Allow to vectorize operations from a smaller bit-size into
scalar operations of a larger bit-size. This allows us to
turn 2x8-bit into a equivalent scalar 16-bit load/store.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga a248ff0b5b broadcom/compiler: support 8-bit loads via ldunifa
This generalizes the support we added for 16-bit to also handle
8-bit loads via ldunifa. The story is the same: we align the address
to 32-bit downwards and we skip any bytes that are not of interest.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 4630f5f016 broadcom/compiler: handle to/from 8-bit integer conversions
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 1b530d948d broadcom/compiler: support 8-bit general store access
Just like with 16-bit, this mode only supports scalar access, but
we are already lowering all non 32-bit accesses to scalar.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 84adf89d33 v3dv: expose storagePushConstant16 feature from VK_KHR_16bit_storage
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga f7ff462421 broadcom/compiler: support 16-bit uniforms
Since ldunif is a 32-bit instruction we need to demote these to
UBO loads, like we do for indirect indexing, with the exception
of scalar 16bit uniforms with an offset that is 32-bit aligned.

For the exception where we can use lfdunif we read a 32-bit slot
from memory where the uniform data is in the lower 16-bit and we
will read garbage in the upper 16-bit which we won't use anyway.

It should be noted that by using ldunif, we are consuming
32-bit from the uniform stream, but this is fine because
if there is valid uniform data in the upper 16-bit (i.e.
we had a ivec2 uniform aligned to a 32-bit address), since
we scalarize 16-bit loads, we would see another load uniform
with an unaligned offset for the second component, which we
will demote to UBO.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 4f26f50ae4 v3dv: support VK_KHR_16_bit_storage
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 49a8fa152c broadcom/compiler: support f32 to f16 RTZ and RTE rounding modes
These are required by VK_KHR_16bit_storage. Our hardware, however,
doesn't provide any mechanism to decide on the rounding mode of
the conversion and it seems to be using RTE, so we implement
RTZ in software.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 1f639d5310 broadcom/compiler: implement 32-bit/16-bit conversion opcodes
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga bdb6201ea1 broadcom/compiler: use ldunifa with unaligned constant offset
If we know we have a load with a constant offset, then even if it
is not aligned to 32-bit we can still produce an aligned offset
and then skip over the bytes we don't need.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 2eb6910d96 broadcom/compiler: support ldunifa with some 16-bit loads
Even though ldunifa is strictly 32-bit we may be able to use it
to load 16-bit values that sit at 32-bit aligned addresses.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 2a420bdf92 broadcom/compiler: lower packing after vectorization
The vectorization pass can inject 32_2x16 (un)packing opcodes
upon successful vectorization of 16-bit operations into 32-bit
counterparts, so make sure we lower these to something our
backend can handle.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 4b24373137 broadcom/compiler: implement TMU general 16-bit load/store
This allows us to implement 16-bit access on uniform and
storage buffers.

Notice that V3D hardware can only do general access on scalar
16-bit elements, which we currently enforce by running a lowering
pass during shader compile.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 2443e45e76 broadcom/compiler: better document vectorization implications
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga 765d9feb46 broadcom/compiler: add lowering pass to scalarize non 32-bit general load/store
V3D hardware doesn't support vector access for general TMU load/store
operations like the ones we use for UBO and SSBO, so we need to split
these to scalar operations.

It should be noted that we also have a vectorization pass (which runs
later, during optimization), that may reconstruct some of these into
32-bit operations when possible (i.e. when the resulting operation
is 32-bit aligned).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>
2022-01-25 09:08:26 +00:00
Iago Toral Quiroga a6aa35a091 v3dv: implement VK_KHR_driver_properties
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14680>
2022-01-24 13:56:39 +00:00
Iago Toral Quiroga be11948a95 broadcom/simulator: handle DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT
Without this the simulator wrapper will abort upon seeing this
query, rendering the driver unusable in that context.

Also, it seems the simulator environment doesn't quite work with
multisync at present, so do not enable it until we figure out what
the issue is.

Reviewed-by: Melissa Wen <mwen@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14678>
2022-01-24 11:34:00 +00:00
Alejandro Piñeiro 275a18322d v3dv: check correct format when load/storing on a depth/stencil buffer
When we create a image view with D24S8 format we made a reformatting
to RGBA8UI if the aspect selected is just STENCIL. But when
configuring the stores we select the aspects based on the attachment
format. Quoting from cmd_buffer_render_pass_emit_stores:

      /* From the Vulkan spec, VkImageSubresourceRange:
       *
       *   "When an image view of a depth/stencil image is used as a
       *   depth/stencil framebuffer attachment, the aspectMask is ignored
       *   and both depth and stencil image subresources are used."
       *
       * So we ignore the aspects from the subresource range of the image
       * view for the depth/stencil attachment, but we still need to restrict
       * the to aspects compatible with the render pass and the image.
       */
      const VkImageAspectFlags aspects =
         vk_format_aspects(ds_attachment->desc.format);

So we could ending trying to store on a Z+Stencil buffer, using a
RGBA8UI format.

So far this only affected some tests when using the simulator
(assert). Those were working on the real hw, but probably would fail
on other situations, so lets use the original image format on that
case.

v2 (Iago)
   * Improve comment grammar
   * Do the same on load too (not just store)

v3 (Iago)
    * Re-word comments.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14635>
2022-01-21 13:24:18 +00:00
Alejandro Piñeiro 5d04b76c09 v3dv: remove unused v3dv_descriptor_map_get_texture_format
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14635>
2022-01-21 13:24:18 +00:00
Melissa Wen 9319ffb53d v3dv: signal fence when all submitted jobs complete execution
We track last submitted jobs by queue type. After all cmd buffer
batches have been submitted, we emit a noop job that waits all jobs
submitted to each GPU queue complete and signals the fence.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen bce77e758a v3dv: process signal semaphores in the very last job
With multiple semaphores support, we can use a GPU job to handle
multiple signal semaphores in the end of a cmd buffer batch. It
means, the last job in the last cmd buffer will be in change of
signalling semaphores as long as it meets some conditions:
1 - A GPU-job signals semaphores whenever we only have submitted
jobs for the same queue (there is no syncobj created for any
other type). Otherwise, we emit a noop job that waits on the
completion of all jobs submitted and then signals semaphores.
2 - A CPU-job is never in charge of signalling semaphores. We
process it first and emit a noop job that depends on all jobs
previously submitted to signal semaphores.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen 0ab98612ef v3dv: handle wait semaphores in the first job by queue
With multiple semaphore support, we can improve the way we handle
wait semaphores considering different job types and cmd buffer
batch scenarios, that means:

- A GPU job depends on wait semaphores whenever it is the first job
submitted to a queue in this command buffer batch (the `first` flag
for the job's queue type is set).
- For the first CPU job, if there are wait semaphores, we should
wait for the CPU and GPU being idle to process the job.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen 03a6a82740 v3dv: track submitted jobs by GPU queue type
The order in which a GPU job is scheduled is guaranteed within the
same queue type (CL, TFU, CSD), but the order of completion of jobs
from different queues cannot be guaranteed. Since we have multiple
semaphores support now, we can track the completion of the last job
submitted to each queue and therefore better determine when gpu is
idle. We do it using an array of syncobj (last_job_syncs) for each
GPU queue (CL, TFU, CSD). With this, job serialization also become
more accurate. We also keep tracking the very last job submitted
(last_job_sync became an element of the last_job_syncs array as
V3DV_QUEUE_ANY) for the case we don't have multisync support.
To help in handling wait semaphores, we set a flag per queue to
indicate we are starting a new cmd buffer batch and a job submitted
to this queue will be the first.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen fd973218a6 v3dv: enable GPU jobs to signal multiple semaphores
In addition to keep a copy of wait semaphores, extend
v3dv_submit_info_semaphores to hold a copy of signal semaphores too.
With a copy of wait and signal semaphores, we can enable GPU jobs to
handle more than one wait and signal semaphores.

By now, we don't change the way as we signalling semaphores when all
jobs complete, i.e., we still use the master thread to signal
semaphores. In this context, no GPU job is actually in charge of
signalling, but the support for multiple signal semaphores is done
here.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen a7052dcf2c v3dv: enable multiple semaphores for csd job
Whenever v3d kernel-driver supports multisync extension, use it to
allow add multiple semaphores as csd job dependency.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen ad09e50129 v3dv: enable multiple semaphores for tfu job
Whenever v3d kernel-driver supports multisync extension, use it to
enable more than one semaphores in a tfu job.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen ff8586c345 v3dv: enable multiple semaphores on cl submission
Whenever v3d kernel-driver supports multisync extension, use it to
enable more than one semaphores in cl submission. In CL, we have two
kind of job (bin and render), therefore, we need also to determine
the stage to sync, that means to add job dependencies/wait
semaphores.

Also, as we currently process all signal semaphores of a cmd buffer
batch together in the submit master thread (when the last wait
thread completes), there isn't now a situation in which GPU jobs
need to handle signal VkSemaphores.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen 85c49db10d v3dv: check multiple semaphores capability
Check if kernel-driver supports multisync extension

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen d5bd18fbb3 v3dv: store wait semaphores in event_wait_cpu_job_info
Instead of a boolean (sem_wait) in v3dv_event_wait_cpu_job_info,
that is used to determine wait condition for jobs put in a wait
thread, copy the wait semaphores data and store it as struct
v3dv_submit_info_semaphores. In the following patches we enable
multiple semaphores in GPU jobs, and therefore we need this data
to add wait semaphores as job dependencies for pending jobs
submitted from a wait thread.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen d148379edf v3dv: wrap wait semaphores info in v3dv_submit_info_semaphores
Instead of pass pSubmit to queue_submit_cmd_buffer, create a struct
v3dv_submit_info_semaphores to wrap semaphores data from VkSubmitInfo.
In the next commit, this struct will help to handle wait condition for
jobs submitted in a wait event context, since we need to hold this
data when handle wait events and pass it to queue_submit_job() called
from wait threads. The main goal is to allow multiple wait semaphores
in a job submission. Later, this struct will be extended to include a
copy of signal semaphores too.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Melissa Wen 09991fc47b v3dv: drop unused variable on handle_set_event_cpu_job
is_wait_thread is passed, but not actually used; and cpu_queue_handle_idle
is in charge to handle wait threads spawned before this one.

Signed-off-by: Melissa Wen <mwen@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>
2022-01-21 10:59:17 +00:00
Charles Giessen 6ea7a61d7a v3dv: Update LoaderICDInterfaceVersion to v5
With the proper version checking in the common vulkan instance code
(commit 88b9b68) it is now possible to bring the reported interface
version up to v5.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14563>
2022-01-20 07:25:07 +00:00
Dave Airlie ccbf700d6c nir: remove gl.h include from nir headers.
This saves a lot of pointless gl.h includes across the board,
it moves the one place that needs GLenum into a separate file
only used in those passes that require it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
2022-01-19 21:54:58 +00:00
Dave Airlie 1352e0ba0c mesa/*: add a shader primitive type to get away from GL types.
This creates an internal shader_prim enum, I've fixed up most
users to use it instead of GL types.

don't store the enum in shader_info as it changes size, and confuses
other things.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
2022-01-19 21:54:58 +00:00
Chia-I Wu 37fa59fa6c anv,lavapipe,v3dv: use wsi_common_get_image
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (anv)
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v3dv)
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> (lavapipe)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14544>
2022-01-14 17:41:42 +00:00
Danylo Piliaiev fa75b2a027 vulkan/wsi: create a common function to compare drm devices
Effectively moves most of v3dv_wsi_can_present_on_device to the
common code to be used in other drivers.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11091>
2022-01-14 12:19:57 +00:00
Iago Toral Quiroga b9f9474577 v3dv: implement double-buffer mode
Double buffer mode splits the tile buffer size in half so we can
start processing the next tile while the current one is being
stored to memory. This mode is available only if MSAA is not enabled
and can, in theory, improve performance by reducing tile store
overhead, however, it comes at the cost of reducing the tile size,
which also causes some overhead of its own.

Testing shows that this helps some cases (i.e the Vulkan Quake
ports) but hurts others (i.e. Unreal Engine 4), so for the time
being we don't enable this by default but we allow to enable it
selectively by using V3D_DEBUG.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14551>
2022-01-14 10:57:26 +00:00
Iago Toral Quiroga fbe4d7ccf4 v3dv: implement VK_EXT_4444_formats
Because these formats are introduced trough an extension, their
enum values are exceedingly large and we cannot use them to index
directly into the format table we had for core formats. Instead,
we put these in a separate table and we always use the
VK_ENUM_OFFSET helper to index into these tables.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14533>
2022-01-14 10:10:10 +00:00
Iago Toral Quiroga 25c46c465d v3dv: handle formats with reverse flag
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14533>
2022-01-14 10:10:10 +00:00
Iago Toral Quiroga 872f08815b v3dv: add swizzle helpers to identify formats wit R/B swap and reverse flags
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14533>
2022-01-14 10:10:10 +00:00
Charles Giessen 5a37cc1186 v3dv: Update LoaderICDInterfaceVersion to v4
vk_icdNegotiateLoaderICDInterfaceVersion now correctly identifies the
driver as supporting v4. Before, the driver did support the
functionality but didn't report supporting it through the negotiate
function.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14299>
2022-01-14 10:25:50 +01:00
Alejandro Piñeiro d7cbe17760 v3dv: simplify v3dv_debug_ignored_stype
mesa_logd already handles having or not DEBUG defined, and also has a
better empty option.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14512>
2022-01-13 13:18:05 +00:00
Christian Gmeiner 41179b665b broadcom/ci: use .test-manual-mr
Allow the jobs to be available for MRs.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14361>
2022-01-12 23:19:22 +00:00
Christian Gmeiner 6e08d8fc3d ci: Uprev piglit to af1785f31
Brings in these changes:

af1785f31 occlusion_query_conform: skip GetQueryCounterBits test if needed
dad078717 occlusion_query_conform: convert to pilgit subtests
b52c1c761 glsl-1.30: test nested preprocessor concat
6c4da153b texture-storage: Fix subtest result handling of skips.
4343f19db fbo-integer: Remove the invalid DrawPixels test.
e3842f2fe arb_dsa: exclude stencil8 textures from test sets.
ce8649be7 spec/ext_external_objects: Fix build on Debian systems
4e553838f glsl: add basic tests for desktop GLSL invariant qualifier linking
7e61e5199 Tests for variable in and out of loop scope
f855ad1c8 fbo-mrt-alphatest: Only require GLSL 1.20
9be2fe999 glx: add glx-multi-display-single-pbuffer test
bfe290725 glx: add glx-swap-pbuffer test
efa64335e framework: Fix build on Windows when using waffle

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14468>
2022-01-10 21:52:42 +00:00
Jason Ekstrand ca1d0333db v3dv: Use the common QueueSignalReleaseImageANDROID from RADV
This is an actual functional change as we now plumb through the sync FD
instead of doing a vkQueueSubmit and trusting in implicit sync.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14372>
2022-01-05 16:36:10 +00:00
Jason Ekstrand dfb1e1777c anv,radv,v3dv: Move AcquireImageANDROID to common code
All three implementations are identical.

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Tested-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14372>
2022-01-05 16:36:10 +00:00
Iago Toral Quiroga 44fa8304d4 v3dv: add a refcount mechanism to BOs
Until now we have lived without a refcount mechanism in the driver
because in Vulkan the user is responsible for handling the life
span of memory allocations for all Vulkan objects, however,
imported BOs are tricky because the kernel doesn't refcount
so user-space needs to make sure that:

1. When importing a BO into the same device used to create it
   (self-importing) it does not double free the same BO.
2. Frees imported BOs that were not allocated through the same
   device.

Our initial implementation always freed BOs when requested,
so we handled 2) correctly but not 1) and we would double-free
self-imported BOs. We tried to fix that in commit d809d9f3
but that broke 2) and we started to leak BOs for some imports.

This fixes the problem for good by adding refcounts to BOs
so that self-imported BOs have a refcnt > 1 and are only freed
when all references are freed.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5769
Tested-by: Roman Stratiienko <r.stratiienko@gmail.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14392>
2022-01-05 12:22:45 +00:00
Thomas H.P. Andersen c32c9014f5 broadcom/compiler: fix compile warning -Wabsolute-value
fixes a compile warning with clang

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14302>
2022-01-03 20:20:37 +00:00
Roman Stratiienko 2686c5419d v3dv: add Android support
Acknowledgements to android-rpi team and lineage-rpi maintainer (KonstaT)
for creating/testing initial vulkan support. Their experience was used as
a baseline for this work.

Most of the code is a copy of turnip and anv.
Improved by cleaning dEQP failures:

 - Improved gralloc support (use allocation time stride, size, modifier).
 - Fixed some dEQP crashes due to memory allocation issues.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14016>
2021-12-21 09:24:43 +00:00
Juan A. Suarez Romero bc11dc7187 broadcom/ci: restructure expected results
Sort/rename the files so expected tests are classified by device.

No need to split the tests by driver (e.g., V3D vs V3DV).

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13983>
2021-12-17 09:15:34 +00:00
Pierre-Eric Pelloux-Prayer 1cb5c1775b glx: fix querying GLX_FBCONFIG_ID for Window
This commit fixes apps using the following sequence:
1. XCreateWindow(dpy) -> win
2. glXCreateContextAttribsARB(dpy, ...) -> ctx
3. glXMakeCurrent(dpy, win, ctx)
4. glXQueryDrawable(dpy, win, GLX_FBCONFIG_ID, ...)

glXQueryDrawable returned 0 (while correctly returning a valid
GLXFCONFIG_ID for other types of drawables).

This commit adds the same dance as driInferDrawableConfig to get
the GLX visual from the Window, and then the GLXFBCONFIG_ID of
this visual.

This fixes:
* piglit: glx-query-drawable --attr=GLX_FBCONFIG_ID --type=WINDOW
* Maya which uses the config ID from step 4 as an input to
glXChooseFBConfig.

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14174>
2021-12-16 01:21:36 +00:00
Alejandro Piñeiro 1c4f76672d broadcom/compiler: avoid unneeded sint/unorm clamping when lowering stores
They are being used on integer to integer stores. From Vulkan sec,
final paragraph of 16.4.4 "Texel Output Format Conversion":
    "Each component is converted based on its type and size (as
     defined in the Format Definition section for each
     VkFormat). ... Integer outputs are converted such that their value
     is preserved. The converted value of any integer that cannot be
     represented in the target format is undefined."

I didn't find a equivalent quote for OpenGL as all conversion entries
are forcused on float to integer, fixed-point to integer, etc, and not
on integer to integer. Didn't find any test failure with this change.

We didn't get any shader-db stats change with shaderdb (even
overriding to OpenGL 4.4 to get more shaders built), so as a reference
Vulkan shader-db stats with the pattern
dEQP-VK.image.*.with_format.*.*
   total instructions in shared programs: 37534 -> 36522 (-2.70%)
   instructions in affected programs: 12080 -> 11068 (-8.38%)
   helped: 241
   HURT: 0
   Instructions are helped.

   total uniforms in shared programs: 9100 -> 8550 (-6.04%)
   uniforms in affected programs: 3004 -> 2454 (-18.31%)
   helped: 229
   HURT: 0

   total max-temps in shared programs: 6110 -> 6014 (-1.57%)
   max-temps in affected programs: 402 -> 306 (-23.88%)
   helped: 43
   HURT: 0
   Max-temps are helped.

   total nops in shared programs: 1523 -> 1526 (0.20%)
   nops in affected programs: 21 -> 24 (14.29%)
   helped: 3
   HURT: 6
   Inconclusive result (value mean confidence interval includes 0).

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14194>
2021-12-15 11:53:20 +00:00
Roman Stratiienko 2cbbfd23ce v3dv: Hotfix: Rename remaining V3DV_HAS_SURFACE->V3DV_USE_WSI_PLATFORM
This was somehow missed by me and during review.

Fixes fcfc4ddfccd5: ("v3dv: Fix V3DV_HAS_SURFACE preprocessor condition")

Signed-off-by: Roman Stratiienko <roman.o.stratiienko@globallogic.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14190>
2021-12-14 10:33:28 +00:00
Iago Toral Quiroga 2630c8f546 broadcom/compiler: improve thrsw merge
Instead of stopping the merge process when we find an instruction
with an incompatible signal (such as an small immediate), keep
going and see if we can merge the thrsw in a previous instruction
that is compatible.

total instructions in shared programs: 13409835 -> 13356648 (-0.40%)
instructions in affected programs: 3556860 -> 3503673 (-1.50%)
helped: 17457
HURT: 18
Instructions are helped.

total max-temps in shared programs: 2353971 -> 2352956 (-0.04%)
max-temps in affected programs: 13960 -> 12945 (-7.27%)
helped: 703
HURT: 0
Max-temps are helped.

total spills in shared programs: 12301 -> 12301 (0.00%)
total sfu-stalls in shared programs: 32596 -> 32499 (-0.30%)
sfu-stalls in affected programs: 225 -> 128 (-43.11%)
helped: 79
HURT: 3
Sfu-stalls are helped.

total nops in shared programs: 347204 -> 325234 (-6.33%)
nops in affected programs: 99834 -> 77864 (-22.01%)
helped: 11515
HURT: 158
Nops are helped.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14172>
2021-12-14 09:50:17 +00:00
Roman Stratiienko fcfc4ddfcc v3dv: Fix V3DV_HAS_SURFACE preprocessor condition
Currently V3DV_HAS_SURFACE is always defined.
There is no WSI for Android in mesa3d, therefore WSI related extensions
should not be exposed.

1. Define V3DV_HAS_SURFACE only for platforms which has WSI implemented.
2. Rename V3DV_HAS_SURFACE -> V3DV_USE_WSI_PLATFORM to align naming
with other platforms.

Fixes dEQP-VK.wsi.android.surface#query_protected_capabilities

Fixes: 79e4451430 ("v3dv: move extensions table to v3dv_device")
Signed-off-by: Roman Stratiienko <roman.o.stratiienko@globallogic.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14144>
2021-12-13 07:11:20 +00:00
Khem Raj 249556dad8 v3dv: account for 64bit time_t on 32bit arches
This makes is a bit more portable, especially on 32bit architectures
with 64bit time_t defaults. Especially on musl its a must.

Fixes
../mesa-21.3.0/src/broadcom/vulkan/v3dv_bo.c:71:15: error: format specifies type 'long' but the argument has type 'time_t' (aka 'long long') [-Werror,-Wformat]
              time.tv_sec);
              ^~~~~~~~~~~

Also reported here [1]

[1] https://github.com/agherzan/meta-raspberrypi/issues/969

Signed-off-by: Khem Raj <raj.khem@gmail.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14118>
2021-12-10 22:26:56 +00:00
Roman Stratiienko 72db15913f v3dv: Fix dEQP-VK.info#instance_extensions test
When mesa3d is built without VK_USE_PLATFORM_DISPLAY_KHR definition,
dEQP test fails:

    dEQP    : Test case 'dEQP-VK.info.instance_extensions'..
    dEQP    :   Fail (Extension VK_KHR_get_display_properties2 is missing
                                                 dependency: VK_KHR_display)
    dEQP    : DONE!

Enable KHR_get_display_properties2 only if VK_USE_PLATFORM_DISPLAY_KHR
is enabled.

Fixes: f884c2e3be ("v3dv: implement VK_KHR_get_display_properties2")
Signed-off-by: Roman Stratiienko <roman.o.stratiienko@globallogic.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14047>
2021-12-08 21:34:47 +00:00
Juan A. Suarez Romero 11287475c8 v3d: enable ARB_texture_view
v2 (Iago):
 - Add comments to failing tests

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>
2021-12-03 15:32:36 +00:00
Alejandro Piñeiro 7f1525f086 v3d: enable ARB_texture_buffer_object and ARB_texture_buffer_range
Through their specific PIPE_CAP.

v2 (Iago)
 - Add comment in test failure

Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>
2021-12-03 15:32:36 +00:00
Juan A. Suarez Romero fd47c939f4 st/pbo: add the image format in the download FS
In the V3D driver there is a NIR lowering step for `image_store`
intrinsic, where the image store format is required for doing the proper
lowering.

Thus, let's define it for the download FS instead of
keeping it as NONE.

v2 (Illia)
 - Use format only for drivers not supporting format-less writing.

v4 (Illia):
 - Use PIPE_CAP_IMAGE_STORE_FORMATTED to reduce combinations.

v5 (Ilia):
 - Use indirect array for download FS in not formatless-store support
   drivers.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>
2021-12-03 15:32:36 +00:00
Iago Toral Quiroga cc7db1fc53 broadcom/compiler: improve documentation for Z writes
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>
2021-12-03 10:39:08 +00:00
Iago Toral Quiroga d7b79f3531 v3d,v3dv: don't disable EZ for passthrough Z writes
The early-Z test uses Z values produced from FEP, so when
we write Z from a shader we need to disable EZ. However, there
are some instances where want to write the FEP-Z from the shader,
in which case we would not need to disable EZ.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>
2021-12-03 10:39:08 +00:00
Iago Toral Quiroga a65c605365 broadcom/compiler: track passthrough Z writes
In some cases we need to make the shaders write the Z value produced
from rasterization (FEP). Track these instances because they are relevant
to early EZ setup.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>
2021-12-03 10:39:08 +00:00
Iago Toral Quiroga 6d4a645c90 broadcom/compiler: emit passthrough Z write if shader reads Z
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>
2021-12-03 10:39:08 +00:00
Qiang Yu fcc062235c ci: remove egl-copy-buffers from fail list
egl-copy-buffers test has been fixed for dri3. So remove
it from broadcom and freedreno ci fail list to prevent the
gitlab ci test fail:

  spec@egl 1.4@egl-copy-buffers,UnexpectedPass

Also remove it from radeonsi ci fail list since I verified
on radeonsi.

Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13868>
2021-11-30 01:58:42 +00:00
Ilia Mirkin e31d08d307 ci: move windowoverlap exclusion to all-skips
The test is just plain not built by our containers. Skip it everywhere.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13919>
2021-11-29 18:08:49 -05:00
Iago Toral Quiroga 996f147fef broadcom/compiler: relax restriction on VPM inst in last thread end slot
According to the documentation, only vpmwt is disallowed in the last delay
slot of the thread end.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13975>
2021-11-29 14:06:43 +00:00
Iago Toral Quiroga 6923dd687c broadcom/compiler: allow color TLB writes in last instruction
Only Z writes are disallowed.

total instructions in shared programs: 11578449 -> 11577369 (<.01%)
instructions in affected programs: 38132 -> 37052 (-2.83%)
helped: 1080
HURT: 0
Instructions are helped.

total max-temps in shared programs: 2334416 -> 2334395 (<.01%)
max-temps in affected programs: 218 -> 197 (-9.63%)
helped: 21
HURT: 0
Max-temps are helped.

total inst-and-stalls in shared programs: 11607890 -> 11606810 (<.01%)
inst-and-stalls in affected programs: 38265 -> 37185 (-2.82%)
helped: 1080
HURT: 0
Inst-and-stalls are helped.

total nops in shared programs: 338316 -> 337236 (-0.32%)
nops in affected programs: 2625 -> 1545 (-41.14%)
helped: 1080
HURT: 0
Nops are helped.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13964>
2021-11-29 06:44:07 +00:00
Alejandro Piñeiro a9b4aef0f2 broadcom/compiler: make shaderdb debug output compatible with shaderdb's report tool
Even although the option is called shaderdb, it is not really used by
shaderdb (for V3D shaderdb uses the debug option "precompile"). And in
fact, right now the output format is not compatible with shaderdb.

This commit tries to fix that, and as we are here, also try to make
the option more useful for the Vulkan case, as that debug option also
works with v3dv.

We can't really fully imitate shaderdb use with OpenGL (run with a set
of glsl shader tests), but we can at least assign a unique name (the
pipeline sha1 in text format) so we can compare executions of the same
vulkan application. For that remember to disable the on-disk cache.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13938>
2021-11-24 13:02:08 +00:00
Iago Toral Quiroga 79dee14cc2 broadcom/compiler: don't move ldvary earlier if current instruction has ldunif
If we did, we would have the instruction coming right after ldvary write
to the same implicit destination as ldvary at the same time. We prevent
this when merging instructions, but we should make sure we prevent this
when we move ldvary around for pipelining too.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13921>
2021-11-23 10:52:24 +00:00
Iago Toral Quiroga 7fec4f4135 broadcom/compiler: fix scoreboard locking checks
According to the spec the hardware locks the scoreboard on the first
or last thread switch (selected via shader state) and any TLB accesses
executed before this are not synchronized by hardware.

This change updates the logic to ensure we respect this requirement
and that we don't assume that the lock is acquired automatically
on the first TLB access, which is not valid at least since V3D 4.1+.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13910>
2021-11-22 12:53:43 +00:00
Iago Toral Quiroga bd7584c16b broadcom/compiler: don't allow RF writes from signals after thrend
Writes to physical registers are not allowed after thread end. We
were checking this for ALU writes, but we need to check it for
signal writes too.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13910>
2021-11-22 12:53:43 +00:00
Juan A. Suarez Romero 457dbb81f5 broadcom/compiler: apply constant folding on early GS lowering
This solves a case where a NIR geometry shader was storing the output in
a non-constant:

  vec4 32 ssa_1 = load_const (0xc0800000 /* -4.000000 */, 0xc1100000 /* -9.000000 */, 0x40400000 /* 3.000000 */, 0x40e00000 /* 7.000000 */)
  vec1 32 ssa_7 = load_const (0x00000000 /* 0.000000 */)
  vec1 32 ssa_8 = load_const (0x00000001 /* 0.000000 */)
  vec1 32 ssa_9 = iadd ssa_7, ssa_8
  vec1 32 ssa_19 = mov ssa_1.x
  intrinsic store_output (ssa_19, ssa_9) (1, 1, 0, 160, 288) /* base=1 */ /* wrmask=x */ /* component=0 */ /* src_type=float32 */ /* location=32 slots=2 gs_streams(x=0 y=0 z=0 w=0) */

When lowering the VPM output we check if the destination (ssa_9 in this
case) is a constant to add to the VPM offset. We run a constant folding
optimization in an earlier VS lowering, and we should do the same for
GS.

This fixes multiple dEQP-VK.pipeline.interface_matching.* failures.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13884>
2021-11-22 09:32:50 +00:00
Juan A. Suarez Romero 7b21635057 broadcom/compiler: handle array of structs in GS/FS inputs
While fragment and geometry shader were handling structs as inputs, they
weren't doing for it arrays of structures.

This fixes multiple dEQP-VK.pipeline.interface_matching.* failures and
assertions.

v2:
 - Fix style (Iago).

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13884>
2021-11-22 09:32:50 +00:00
Alejandro Piñeiro ff89dc3523 vulkan: move common format helpers to vk_format
v3dv, radv, and turnip are using several C&P format helpers (most of
them wrappers over util_format_description based helpers).  methods.

This commit moves the common helpers to the already existing common
vk_format.h. For the case of v3dv we were able to remove the vk_format
header. For turnip and radv, a local vk_format.h header remains, with
methods that are only used for those drivers.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13858>
2021-11-19 12:23:19 +01:00
Iago Toral Quiroga 5e536c97a9 broadcom/compiler: fix early fragment tests setup
When early fragment tests are mandated by the shader, we must use
the Z value produced by the FEP even if there are elements that
would typically require late fragment tests (such as discards,
sample to coverage, etc).

This change means we also need to be a bit more careful when
we promote shaders to use early fragment tests so we don't
promote anything with discards for example.

Fixes:
dEQP-VK.fragment_operations.early_fragment.discard_early_fragment_tests_depth
dEQP-VK.fragment_operations.early_fragment.discard_early_fragment_tests_stencil

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13837>
2021-11-18 07:39:32 +00:00
Connor Abbott 508f917d8c util/dag: Make edge data a uintptr_t
Nobody was actually using it as a pointer, and I'm going to introduce a
shared function which relies on it not being a pointer so let's fix this
once and for all.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>
2021-11-17 13:41:47 +00:00
Alejandro Piñeiro cbf0d83eac v3d,v3dv: move TFU register definition to a common header
We are using the same definitions for both OpenGL and Vulkan, so let's
move it to common.

As we are here we are also adding versioning on the TFU register
definition. Those are basically register bit places, so really likely
to change between versions.

Adding 33 as it is the first version they got defined.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13832>
2021-11-17 11:04:31 +01:00
Iago Toral Quiroga 836a4b5836 v3dv: fix internal bpp of D/S formats
Depth/stencil formats can, at worse (d32/d24s8), be exactly 32bpp,
which is the minimum we can program for the internal format.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13816>
2021-11-17 06:57:48 +00:00
Iago Toral Quiroga f384c763fc v3d,v3dv: move tile size calculation to a common helper
We had this code replicated in 3 places across both drivers.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13790>
2021-11-15 11:40:39 +00:00
Iago Toral Quiroga 7490bcad37 v3dv: don't use a global constant for default pipeline dynamic state
Some of these may change across V3D versions, so it is not practical.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13775>
2021-11-12 11:04:07 +00:00
Iago Toral Quiroga 4b3931ee6c v3dv: account for multisampling when computing subpass granularity
The granularity is defined by the tile size, which is also determined
by multisampling.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13775>
2021-11-12 11:04:07 +00:00
Iago Toral Quiroga 0cb58f80d2 v3d: use V3D_MAX_DRAW_BUFFERS instead of hardcoded constant
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13775>
2021-11-12 11:04:07 +00:00
Christian Gmeiner a0634a3c85 ci/bare-metal: switch to common .baremetal-test-arm64
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13751>
2021-11-12 08:22:29 +00:00
Christian Gmeiner 8bc284fe5b ci/bare-metal: armhf: move BM_ROOTFS to generic place
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13751>
2021-11-12 08:22:29 +00:00
Emma Anholt a68a0c9e1c mesa/st: Disable NV_copy_depth_to_color on non-doubles-capable HW.
The previous doubles check
(https://gitlab.freedesktop.org/mesa/mesa/-/issues/3459) checked that you
didn't have full doubles emulation turned on, but we also need to check
that you have doubles at all (emulated or not) or non-GL4 drivers will
fail.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13743>
2021-11-11 16:38:58 +00:00
Neil Roberts bdaf185889 v3d: Update prim_counts when prims generated query in flight without TF
In order to implement GL_PRIMITIVES_GENERATED, v3d allocates a small
resource and adds a command to the job to store the prim counts to it.
However it was only doing this when TF was enabled which meant that if
the query was used with a geometry shader but no TF then the query would
always be zero. This patch makes the driver keep track of how many
PRIMITIVES_GENERATED queries are in flight and then enable writing the
prim count if its more than zero.

Fix dEQP-GLES31.functional.geometry_shading.query.primitives_generated_*

v2: Update CI expectations and references to fixed tests in commit log.
v3: - Add comment that GL_PRIMITIVES_GENERATED query is included because
      OES_geometry_shader, but it is not part of OpenGL ES 3.1. (Iago)
    - Update Fixes to commit introducing geometry shaders. (Iago)

Fixes: a1b7c084 ("v3d: fix primitive queries for geometry shaders")
Signed-off-by: Neil Roberts <nroberts@igalia.com>
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13712>
2021-11-11 08:02:04 +00:00
Iago Toral Quiroga 3a95e25e84 v3dv,v3d: don't store swizzle pointer in shader/pipeline keys
We had been storing pointers to a driver owned swizzle table
rather than storing the actual swizzle value in various shader
and pipeline keys on both GL and Vulkan drivers.

This doesn't look very robust, particularly since we also
compute sha1 hashes from these values and we may store these
hashes to disk (for the disk cache).

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13738>
2021-11-10 11:24:26 +00:00
Iago Toral Quiroga aa5a0e1dad broadcom/compiler: copy packing when converting add to mul
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13675>
2021-11-04 13:57:39 +00:00
Iago Toral Quiroga a794bdf953 broadcom/compiler: check that sig packing is valid when pipelining ldvary
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13641>
2021-11-03 10:49:06 +00:00
Emma Anholt 4e28962800 ci: Uprev VK-GL-CTS to 1.2.7.2, and pull in piglit while I'm here.
The VK-GL-CTS fixes some issues for freedreno, and almost all of LVP's
xfails.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13622>
2021-11-02 20:29:31 +00:00
Iago Toral Quiroga 6b9bd3f038 broadcom/compiler: make opt passes set current block
Typically, optimization passes go through all the blocks in a shader
and make adjustments on the fly, so we always want them to update
the current block or the current block pointer will become outdated.

Also, we don't need to keep track of the previous current block
pointer to restore it, since optimization passes run after we have
completed conversion to VIR, and therefore, anything that comes after
that should always set the current block before emitting code.

Fixes debug assert crashes when running shader-db:
vir.c:1888: try_opt_ldunif: Assertion `found || &c->cur_block->instructions == c->cursor.link' failed

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13625>
2021-11-02 11:17:01 +00:00
Ella Stanforth 3c86292321 v3dv: Implement VK_KHR_create_renderpass2
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13575>
2021-11-02 07:43:31 +00:00
Jason Ekstrand 4108fda426 vulkan: Move all the common object code to runtime/
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13156>
2021-10-29 23:12:32 +00:00
Jason Ekstrand 5ccba1576d v3dv: Use vk_instance_get_proc_addr_unchecked for WSI
It exists precisely to handle this case without the driver looking up
trampolines itself.  This is nearly identical to what ANV does.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13156>
2021-10-29 23:12:32 +00:00
Iago Toral Quiroga b42f4b8809 broadcom/compiler: padding fixes to QPU assembly dumps
When there are dst/src modifiers it is pretty common that instructions
take too much space and lead to alignment issues that make code a lot
harder to read, so align the MUL and SIG columns a bit wider to avoid
this:

Before:

0x380021828003faa8 fmax  rf2, rf42.abs, rf40.abs; nop
0x3800f186c503f0f0 fcmp.pushc  -, rf3, rf48; nop
0x380c038b85b83282 fmax  rf11, rf10, rf2; mov.ifa  rf14, rf46
0x3800219ab503f359 and  rf26, rf13, rf25; nop
0x3820f186c503f2f0 fcmp.pushc  -, rf11, rf48; nop           ; thrsw
0x382c013fb5b8368e and  rf63, rf26, rf14; mov.ifa  rf4, rf46; thrsw
0x38002185b503ffc4 and  rf5, rf63, rf4  ; nop
0x38002186b503f141 and  rf6, rf5, rf1   ; nop
0x382031873503f186 vfpack  tlb, rf6, rf6; nop               ; thrsw
0x380031873503f18f vfpack  tlb, rf6, rf15; nop
0x38003186bb03f000 nop                  ; nop

After:

0x380021828003faa8 fmax rf2, rf42.abs, rf40.abs  ; nop
0x3800f186c503f0f0 fcmp.pushc -, rf3, rf48       ; nop
0x380c038b85b83282 fmax rf11, rf10, rf2          ; mov.ifa rf14, rf46
0x3800219ab503f359 and rf26, rf13, rf25          ; nop
0x3820f186c503f2f0 fcmp.pushc -, rf11, rf48      ; nop                         ; thrsw
0x382c013fb5b8368e and rf63, rf26, rf14          ; mov.ifa rf4, rf46           ; thrsw
0x38002185b503ffc4 and rf5, rf63, rf4            ; nop
0x38002186b503f141 and rf6, rf5, rf1             ; nop
0x382031873503f186 vfpack tlb, rf6, rf6          ; nop                         ; thrsw
0x380031873503f18f vfpack tlb, rf6, rf15         ; nop
0x38003186bb03f000 nop                           ; nop

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13545>
2021-10-28 08:12:14 +00:00
Emma Anholt bfbc41a9fa ci/piglit-runner: Merge piglit-driver-*.txt files into driver-*.txt.
The test names are definitely unique (deqp has specific prefixes, piglit
uses '@' as a separator instead of '.'), so we can just have a single file
regardless of test type.  Merges the two groups of xfails together so you
can't mix up which file to edit (I certainly have), and so that we don't
need to introduce yet another set of files when we add gtest for libva.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13517>
2021-10-27 20:54:11 +00:00
Emma Anholt 38dff02bfb ci/deqp-runner: Rename the deqp-drivername-*.txt files to drivername-*.txt
We have two testsuites with the same format for fails/flakes/skips files,
and test names that are definitely unique.  As I'm about to add a third
testsuite (gtest for libva-utils), so let's have just one file each for
fails/flakes/skips instead of one per type of testsuite.  This starts the
move with just the bulk rename of deqp.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13517>
2021-10-27 20:54:11 +00:00
Iago Toral Quiroga 0a277fabce broadcom/compiler: fix condition encoding bug
When both AC and MC are set, AC is encoded in bits 0..1 not 0..3.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13527>
2021-10-27 06:03:12 +00:00
Iago Toral Quiroga 3fbd6662b7 broadcom/compiler: rework simultaneous peripheral access checks
This was not quite correct in that our checks for the allowed cases
were not checking that there were no other peripheral access other
than the ones allowed.

For example, we allowed  wrtmuc signal and TMU write other than
TMUC, and we also allowed TMU read and VPM read/write. But we
cannot allow wrtmuc with TMU write other than TMUC and at the
same time a VPM write for example, so we can't just check if we
have a combination of allowed peripherals, we still need to check
that those are the only ones in use by the combined instructions.

Another example is that even if we allow a TMU write (other than TMUC)
with a wrtmuc signal, the resulting instruction must still have just
one TMU write other than TMUC, but we were allowing the merge if one
instruction signaled wrtmuc and the other wrote to tmu other than tmuc
without testing if the combined result would have 2 tmu writes.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13527>
2021-10-27 06:03:12 +00:00
Iago Toral Quiroga ceaf56920c v3dv: refactor TFU jobs
We had an implementation for image copies and another for buffer to
image copies. Refactor the code so we have a single implementation
of this.

Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13481>
2021-10-22 11:05:33 +00:00