KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Jason Ekstrand	32527f3ccc	v3dv: Destroy the device mutex on the teardown path Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>	2022-04-13 17:22:14 +00:00
Jason Ekstrand	30191fd9df	v3dv: Don't use pthread functions on c11 mutexes This only works because c11/threads.h is typedeffing the c11 stuff to pthreads. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>	2022-04-13 17:22:14 +00:00
Jason Ekstrand	25441b5e5c	v3dv: Put indirect compute CSD jobs in the job list Instead of having the CPU job execute the CSD job, put both jobs on the list with the CPU job first which modifies the GPU job which gets kicked off next. This gives the queue code more visibility into what types of jobs are actually in the list. In particular, if an indirect compute job is the last job in a batch buffer, it currently appears as if the batch ends with CPU work which isn't true because it kicks off GPU work. In that case, the last job on the list is now a GPU job, which better matches reality. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>	2022-04-13 17:22:14 +00:00
Jason Ekstrand	0208bb2d58	v3dv: Stop directly setting vk_device::alloc vk_device_init() will do this. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15704>	2022-04-13 17:22:14 +00:00
Timothy Arceri	20ab7046c0	glsl/st: use nir pass to lower indirect rather than GLSL IR Will allow us to drop more GLSL IR code in future once we switch all drivers to NIR. Also stops the need for all drivers to call this pass to remove indirect temps that may have been added during the NIR varying linking lowering/optimisations. This patch fixes some tests on i915, d3d12, lima and vc4. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15871>	2022-04-12 06:51:20 +00:00
Juan A. Suarez Romero	3ac7383843	ci: enable v3dv arm64 jobs This reverts commit `f567a832ee`. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15856>	2022-04-11 15:26:58 +00:00
Mike Blumenkrantz	f567a832ee	ci: disable v3dv arm64 jobs these have been broken for almost 48 hours Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15841>	2022-04-10 12:25:39 -04:00
Iago Toral Quiroga	40f0c616e8	v3dv: fix bogus VkDrmFormatModifierProperties2EXT usage The array is allocated for VkDrmFormatModifierPropertiesEXT, so writing entries with type VkDrmFormatModifierProperties2EXT is bogus. It seems this was a mistake added with a change intended to get rid of VK_OUTARRAY_MAKE, that changed the type of the write by mistake. Fixes: `56a2ccf058` ('v3dv: Stop using VK_OUTARRAY_MAKE()') Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15819>	2022-04-08 14:43:25 +00:00
Iago Toral Quiroga	cf4b3cb563	broadcom/compiler: prefer reconstruction over TMU spills when possible We have been reconstructing/rematerializing uniforms for a while, but we can do this in more scenarios, namely instructions whose result is immutable along the execution of a shader across all channels. By doing this we gain the capacity to eliminate TMU spills, which are not only slower but can also make us drop to a fallback compilation strategy. Shader-db results show a small increase in instruction counts caused by us now being able to choose preferential compiler strategies that are intended to reduce TMU latency. In some cases, we are now also able to avoid dropping thread counts: total instructions in shared programs: 12658092 -> 12659245 (<.01%) instructions in affected programs: 75812 -> 76965 (1.52%) helped: 55 HURT: 107 total threads in shared programs: 416286 -> 416412 (0.03%) threads in affected programs: 126 -> 252 (100.00%) helped: 63 HURT: 0 total uniforms in shared programs: 3716916 -> 3716396 (-0.01%) uniforms in affected programs: 19327 -> 18807 (-2.69%) helped: 94 HURT: 50 total max-temps in shared programs: 2161796 -> 2161578 (-0.01%) max-temps in affected programs: 3961 -> 3743 (-5.50%) helped: 80 HURT: 24 total spills in shared programs: 3274 -> 3266 (-0.24%) spills in affected programs: 98 -> 90 (-8.16%) helped: 6 HURT: 0 total fills in shared programs: 4657 -> 4642 (-0.32%) fills in affected programs: 130 -> 115 (-11.54%) helped: 6 HURT: 0 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15710>	2022-04-08 05:37:28 +00:00
Jason Ekstrand	292ceb297c	v3dv: Enable VK_EXT_debug_utils It's implemented in common code as long as you use vk_command_buffer. Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15560>	2022-04-06 01:18:23 +00:00
Omar Akkila	4208895175	ci: bump VK-GL-CTS to 1.3.1.1 Signed-off-by: Omar Akkila <omar.akkila@collabora.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15668>	2022-04-04 23:04:33 +00:00
Iago Toral Quiroga	827ef5fba9	v3dv: fix limits for inline uniform blocks We don't support 'Update After Bind', however, the limits for this model also include the ones without it. See the with or without remark in the spec below: "maxPerStageDescriptorUpdateAfterBindInlineUniformBlocks is similar to maxPerStageDescriptorInlineUniformBlocks but counts descriptor bindings from descriptor sets created with or without the VK_DESCRIPTOR_SET_LAYOUT_CREATE_UPDATE_AFTER_BIND_POOL_BIT bit set." Fixes: dEQP-VK.api.info.vulkan1p2_limits_validation.ext_inline_uniform_block Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15732>	2022-04-04 09:28:55 +00:00
Iago Toral Quiroga	597560e27c	broadcom/compiler: always enable per-quad on spill operations This ensures that any channels used for helper invocations are also spilled/filled correctly. Alternatively, we could recursively track all temps that get involved in computing values that are then used in explicit (dfdx,dfdy) or implicit (texture coordinates for mipmap or anisotropic filtering, etc) derivatives, and only enable per-quad on these (or disable spilling of any of these values). Fixes: dEQP-VK.graphicsfuzz.cov-dfdx-dfdy-after-nested-loops Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15705>	2022-04-01 08:53:50 +00:00
Jason Ekstrand	688d478045	v3dv/queue: Rework multisync_free This fixes two bugs. First, we stop leaking in/out fences with multisync. Because the in_syncs and out_syncs parameters to set_multisync were arrays and not pointers to arrays, the caller's in_syncs and out_syncs pointers never got set and remained NULL so multisync_free() always sees two NULL pointers and does nothing, leaking both arrays. Not sure how this isn't showing up in the dEQP leak check tests. Second, the struct drm_v3d_multi_sync was in the scope of the then clause of the `if (device->pdevice->caps.multisync)` so it goes out of scope before the ioctl. This is, effectively, a use-after-free and, depending on stack allocation details, may result in the multisync extension struct getting stomped before the ioctl. Fixes: `ff8586c345` ("v3dv: enable multiple semaphores on cl submission") Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15512>	2022-03-29 14:38:41 +00:00
Iago Toral Quiroga	7f6ecb8667	v3dv: add reference counting for descriptor set layouts The spec states that descriptor set layouts can be destroyed almost at any time: "VkDescriptorSetLayout objects may be accessed by commands that operate on descriptor sets allocated using that layout, and those descriptor sets must not be updated with vkUpdateDescriptorSets after the descriptor set layout has been destroyed. Otherwise, descriptor set layouts can be destroyed any time they are not in use by an API command." Based on a similar fix for RADV. Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5893 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15634>	2022-03-29 11:28:39 +00:00
Iago Toral Quiroga	ca861bd6f4	v3dv: drop unnecessary memset We are already zeroing when we allocate the descriptor set layout memory with vk_object_zalloc. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15634>	2022-03-29 11:28:39 +00:00
Iago Toral Quiroga	591eed30b2	v3dv: fix sampler array addressing in v3dv_descriptor_set_layout Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15634>	2022-03-29 11:28:39 +00:00
Alejandro Piñeiro	81039feda4	broadcom: update language on V3D_DEBUG options Some typos, and bad grammar. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15593>	2022-03-28 19:21:48 +00:00
Iago Toral Quiroga	ce849032a4	broadcom/compiler: allow ldunifa with indirect uniform loads We handle uniforms by copying them into the uniform stream to be consumed with ldunif when they have a constant offset. Otherwise we fallback to general TMU access, which has more latency. However, just like we did for UBOs and read-only SSBOs, we can also try to use the unifa mechanism to handle indirect accesses in certain cases instead of the TMU fallback. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15575>	2022-03-28 10:44:13 +00:00
Iago Toral Quiroga	ea3223e7a4	v3dv: implement VK_EXT_inline_uniform_block Inline uniform blocks store their contents in pool memory rather than a separate buffer, and are intended to provide a way in which some platforms may provide more efficient access to the uniform data, similar to push constants but with more flexible size constraints. We implement these in a similar way as push constants: for constant access we copy the data in the uniform stream (using the new QUNIFORM_UNIFORM_UBO_*) enums to identify the inline buffer from which we need to copy and for indirect access we fallback to regular UBO access. Because at NIR level there is no distinction between inline and regular UBOs and the compiler isn't aware of Vulkan descriptor sets, we use the UBO index on UBO load intrinsics to identify inline UBOs, just like we do for push constants. Particularly, we reserve indices 1..MAX_INLINE_UNIFORM_BUFFERS for this, however, unlike push constants, inline buffers are accessed through descriptor sets, and therefore we need to make sure they are located in the first slots of the UBO descriptor map. This means we store them in the first MAX_INLINE_UNIFORM_BUFFERS slots of the map, with regular UBOs always coming after these slots. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15575>	2022-03-28 10:44:13 +00:00
Boris Brezillon	56a2ccf058	v3dv: Stop using VK_OUTARRAY_MAKE() We're trying to replace VK_OUTARRAY_MAKE() by VK_OUTARRAY_MAKE_TYPED() so people don't get tempted to use it and make things incompatible with MSVC (which doesn't support typeof()). Suggested-by: Daniel Stone <daniels@collabora.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15522>	2022-03-25 11:00:02 +00:00
Jason Ekstrand	19f56e3fc4	v3dv: Drop GetPhysicalDeviceQueueFamilyProperties Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15459>	2022-03-18 11:19:14 -05:00
Iago Toral Quiroga	4f284254e4	v3dv: support importing external semaphores This was waiting for multisync support in our kernel interface so we can wait on the actual imported payload of a semaphore rather than the last job we submitted. Reviewed-by: Melissa Wen <mwen@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>	2022-03-18 13:17:58 +00:00
Iago Toral Quiroga	fa1b10f36d	v3dv: lock around noop job submits Any thread we create may end up creating/submitting at least a noop job, which is a shared object. Before multisync, this was an issue only for the creation of the job itself, but with multisync we can also modify parameters of the noop job every time it is used (for signaling and serialization configuration). This change adds a noop mutex that all threads (main, wait and master) take before submitting a noop job to ensure concurrent access is not an issue. Fixes flakiness observed with multisync with the following test: dEQP-VK.api.command_buffers.secondary_execute_twice Reviewed-by: Melissa Wen <mwen@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>	2022-03-18 13:17:58 +00:00
Iago Toral Quiroga	daa865fb2c	v3dv: fix semaphore wait from CPU job If a CPU job comes first in a command buffer with a semaphore wait operation we need to wait on the CPU for the semaphore to be signaled before we process the job. We have been doing this with a WaitForIdle operation, but that only works if the semaphore has been submitted for signaling from the same instance of the driver. If we have an imported payload from another instance in our semaphore however, WaitForIdle may return too early since the submission to signal the semaphore may have been submitted by a different instance of the driver as well, and our wait-for-idle checks only know about this instance's submissions. To fix this, we always submit a noop job from our instance that waits on the semaphores on the GPU and follow up with WaitForIdle to wait for that to complete. Fixes test failures and/or assert crashes in: dEQP-VK.synchronization.cross_instance.* (when enabling support for semaphore imports) Reviewed-by: Melissa Wen <mwen@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>	2022-03-18 13:17:58 +00:00
Iago Toral Quiroga	3b8ab8a9ce	v3dv: don't signal semaphores/fences from a wait thread When we have a wait thread we can't ensure that the last job in the last command buffer will be the one to signal semaphores because in this case there is no guarantee that jobs from command buffers in the batch will be submitted to the GPU in order, as those put in a wait thread will be submitted later when the event wait operation is completed. Instead, we need to wait for all outstanding wait threads to complete and only then we should signal any semaphores or fences. This also fixes a bug where the wait for events was the last job in the command buffer. In this case, once the event wait is completed we have no additional jobs to submit and thus would never try to signal semaphores or fences. Reviewed-by: Melissa Wen <mwen@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>	2022-03-18 13:17:58 +00:00
Iago Toral Quiroga	03840bfcd1	v3dv: fix temporary imports of semaphores and fences with multisync This is preparatory work to expose support for importing semaphores, which was waiting on kernel multisync support. When we implemented user-space multisync support we didn't handle temporary fence/semaphore payload imports at all, so we fix that here. Also, we add a has_temp boolean flag to identify the case where we have a temporary payload in a fence/semaphore instead of just checking if temp_sync is not 0. This is necessary to support semaphore imports (for which we are not exposing support yet) because these need to drop the temporary payload when they are used as wait semaphores in a submit, but we can't destroy the underlying temp_sync at that point because it needs to survive at least until the submit is finished, so instead we use a flag to tell if we have an active temporary payload or not, and we simply destroy any temp_sync on a semaphore destroy or any new import on the same semaphore. We only strictly need this flag for semaphores because fences drop the temporary payload when they are reset, which happens in the CPU and can only be done if the GPU is not using the fence, but we add the same flag for the fence for consistency. Reviewed-by: Melissa Wen <mwen@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>	2022-03-18 13:17:58 +00:00
Iago Toral Quiroga	5a11a2fb6c	v3dv: don't expose image load/store features for linear images Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>	2022-03-18 13:17:58 +00:00
Iago Toral Quiroga	0590ce1362	v3dv: return early on image to buffer blit copies if image is linear This path uses a shader blit to implement the copy which is only supported for tiled images (except 1D). While blit_shader() already checks for this, this path does a lot of heavy lifting to prepare for the blit_shader call so we rather avoid that if possible when we know blit_shader won't be able to implement the blit. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>	2022-03-18 13:17:58 +00:00
Iago Toral Quiroga	397f4963ed	v3dv: TFU destination must be UIF We had some code that considered the possibility that the destination might be linear when configuring TFU jobs, but we never actually allow for this to happen since we avoid hitting these paths in that case, as the TFU always produces UIF results. Instead, add an assert when producing the TFU packet to ensure we are expecting a UIF result. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15342>	2022-03-18 13:17:58 +00:00
Alejandro Piñeiro	e3d905ec39	v3dv/pipeline: use new helper vk_shader_module_to_nir In addition to using the helper, we also remove some of the lowering we had at preprocess_nir, as they are now called by the helper. While we are here we also move the call to nir_lower_sysvals_to_varyings, which for some reason we were calling before preprocess_nir. It is worth noting that with this change we lose the ability to debug the NIR just after spirv_to_nir using V3D_DEBUG, as now this is done on vk_spirv_to_nir, and as mentioned that includes several lowerings now. The workaround for that is to use NIR_DEBUG. We also needed to change how to check the entrypoint on the broadcom compiler, checking just if it is an entrypoint, instead of assuming that the name will be "main". v2: tweak comment, squash v3dv and compiler change (Iago) Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15449>	2022-03-18 11:05:11 +00:00
Juan A. Suarez Romero	730a294b90	v3dv: implement VK_EXT_line_rasterization Allow to choose the line rasterization algorithm. It supports rectangular and Bresenham-style line rasterization. v2 (Iago): - Update documentation. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15407>	2022-03-18 09:38:38 +00:00
Juan A. Suarez Romero	22759e9174	v3dv: add subpixel precision definition Move number of bits for subpixel precision in rasterizer to a define. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15407>	2022-03-18 09:38:38 +00:00
Juan A. Suarez Romero	b53dda6da8	broadcom: add line rasterization mode to packet definition Add the supported line rasterization modes as enums in the XML packet definition. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15407>	2022-03-18 09:38:38 +00:00
Juan A. Suarez Romero	102ae4bdc8	broadcom: add on-disk cache debug option Add support for `V3D_DEBUG=cache`, which prints on-disk cache events. v2: - Use same debug format for v3d and v3dv (Alejandro) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15380>	2022-03-18 08:58:01 +00:00
Iago Toral Quiroga	5c1302f47c	v3dv: expose VK_EXT_image_drm_format_modifier This has been implemented for a while but we could not expose it on Vulkan 1.0 because the extension declares a dependency on VK_KHR_sampler_ycbcr_conversion, which we don't implement, and CTS would complain. On Vulkan 1.1 however, VK_KHR_sampler_ycbcr_conversion was promoted to core as an optional feature, and this is enough for the dependency to be satisfied, even if the feature is not supported, meaning that we can now expose the extension. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15426>	2022-03-18 06:42:06 +00:00
Juan A. Suarez Romero	c432bfe74b	broadcom/ci: Update flake list Some of the tests marked as flake didn't show up as flakes for a long time (more than 3 months). So likely they are already fixed. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Acked-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15411>	2022-03-17 13:56:41 +00:00
Juan A. Suarez Romero	dfb6438392	v3dv: change MESA_GLSL_CACHE envvar reference This was renamed to MESA_SHADER_CACHE. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15390>	2022-03-17 11:16:45 +01:00
Juan A. Suarez Romero	000b935c50	v3dv/ci: add test to skip list Add a test that times out in the CI but otherwise passes. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15374>	2022-03-14 18:55:13 +00:00
Tapani Pälli	adea096029	ci: update various ci result files Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12936>	2022-03-11 09:58:28 +00:00
Iago Toral Quiroga	49b5431197	broadcom/compiler: remove unused functions Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15302>	2022-03-10 07:25:37 +00:00
Iago Toral Quiroga	44feff93c2	broadcom/compiler: don't always assign r5 if available Instead, only favor assigning r5 if we have first decided to assign an accumulator. This helps with assigning r5 to short lived uniforms, favoring accumulator rotation to facilitate QPU merges. total instructions in shared programs: 12656164 -> 12628339 (-0.22%) instructions in affected programs: 5368373 -> 5340548 (-0.52%) helped: 17420 HURT: 9996 total uniforms in shared programs: 3704776 -> 3704863 (<.01%) uniforms in affected programs: 12247 -> 12334 (0.71%) helped: 23 HURT: 78 total max-temps in shared programs: 2153505 -> 2152684 (-0.04%) max-temps in affected programs: 26468 -> 25647 (-3.10%) helped: 569 HURT: 328 total fills in shared programs: 4656 -> 4657 (0.02%) fills in affected programs: 43 -> 44 (2.33%) helped: 0 HURT: 1 total sfu-stalls in shared programs: 34728 -> 34403 (-0.94%) sfu-stalls in affected programs: 3411 -> 3086 (-9.53%) helped: 842 HURT: 534 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>	2022-03-09 15:53:04 +00:00
Iago Toral Quiroga	77f58b46d9	broadcom/compiler: add comment on why we don't use r5 with ldunifa Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>	2022-03-09 15:53:04 +00:00
Iago Toral Quiroga	5b140428b0	broadcom/compiler: adjust register threshold for 2-thread compiles We have twice the registers in this case so it makes sense to double this as well. While this causes slight regressions in shader-db stats (due to additional register pressure), it helps us hide latency of memory reads better on 2-thread compiles, where the thread switch mechanism will be less effective. This shows a ~3% performance improvement on the UE4 SunTemple demo. total instructions in shared programs: 12642413 -> 12656164 (0.11%) instructions in affected programs: 2272652 -> 2286403 (0.61%) helped: 2924 HURT: 3389 total uniforms in shared programs: 3703861 -> 3704776 (0.02%) uniforms in affected programs: 213729 -> 214644 (0.43%) helped: 823 HURT: 1272 total max-temps in shared programs: 2150686 -> 2153505 (0.13%) max-temps in affected programs: 191332 -> 194151 (1.47%) helped: 1900 HURT: 1891 total spills in shared programs: 3255 -> 3274 (0.58%) spills in affected programs: 166 -> 185 (11.45%) helped: 3 HURT: 6 total fills in shared programs: 4630 -> 4656 (0.56%) fills in affected programs: 367 -> 393 (7.08%) helped: 7 HURT: 15 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>	2022-03-09 15:53:04 +00:00
Iago Toral Quiroga	a35b47a0b1	broadcom/compiler: add a strategy to disable scheduling of general TMU reads This can add quite a bit of register pressure so it makes sense to disable it to prevent us from dropping to 2 threads or increase spills: total instructions in shared programs: 12672813 -> 12642413 (-0.24%) instructions in affected programs: 256721 -> 226321 (-11.84%) helped: 719 HURT: 77 total threads in shared programs: 415534 -> 416322 (0.19%) threads in affected programs: 788 -> 1576 (100.00%) helped: 394 HURT: 0 total uniforms in shared programs: 3711370 -> 3703861 (-0.20%) uniforms in affected programs: 28859 -> 21350 (-26.02%) helped: 204 HURT: 455 total max-temps in shared programs: 2159439 -> 2150686 (-0.41%) max-temps in affected programs: 32945 -> 24192 (-26.57%) helped: 585 HURT: 47 total spills in shared programs: 5966 -> 3255 (-45.44%) spills in affected programs: 2933 -> 222 (-92.43%) helped: 192 HURT: 4 total fills in shared programs: 9328 -> 4630 (-50.36%) fills in affected programs: 5184 -> 486 (-90.62%) helped: 196 HURT: 0 Compared to the stats before adding scheduling of non-filtered memory reads we see that we have now gotten back all that was lost and then some: total instructions in shared programs: 12663186 -> 12642413 (-0.16%) instructions in affected programs: 2051803 -> 2031030 (-1.01%) helped: 4885 HURT: 3338 total threads in shared programs: 415870 -> 416322 (0.11%) threads in affected programs: 896 -> 1348 (50.45%) helped: 300 HURT: 74 total uniforms in shared programs: 3711629 -> 3703861 (-0.21%) uniforms in affected programs: 158766 -> 150998 (-4.89%) helped: 1973 HURT: 499 total max-temps in shared programs: 2138857 -> 2150686 (0.55%) max-temps in affected programs: 177920 -> 189749 (6.65%) helped: 2666 HURT: 2035 total spills in shared programs: 3860 -> 3255 (-15.67%) spills in affected programs: 2653 -> 2048 (-22.80%) helped: 77 HURT: 21 total fills in shared programs: 5573 -> 4630 (-16.92%) fills in affected programs: 3839 -> 2896 (-24.56%) helped: 81 HURT: 15 total sfu-stalls in shared programs: 39583 -> 38154 (-3.61%) sfu-stalls in affected programs: 8993 -> 7564 (-15.89%) helped: 1808 HURT: 1038 total nops in shared programs: 324894 -> 323685 (-0.37%) nops in affected programs: 30362 -> 29153 (-3.98%) helped: 2513 HURT: 2077 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>	2022-03-09 15:53:04 +00:00
Iago Toral Quiroga	f783bd0d2a	broadcom/compiler: define v3d-specific delays for NIR instructions We do a few changes over NIR's defaults: 1. Lower delay for texture reads. Empirically, we don't observe any benefits with delays over 50 and since this delay value is still used by the scheduler in the "favor register pressure" case it is beneficial to avoid overestimating it too much. 2. Adjust delay for non-filtered TMU reads to the delay selected for texture reads. 3. In our case, UBO reads from dynamically uniform addresses don't use the TMU and have a latency of 1 instruction in the best case scenario or 4 at worst, so we go with 1 so we don't try to move this early. This helps us get back some of what we lost when updating the default scheduler configuration to add a delay for non-filtered memory reads: total instructions in shared programs: 13126587 -> 12671765 (-3.46%) instructions in affected programs: 3764097 -> 3309275 (-12.08%) helped: 14664 HURT: 4244 total threads in shared programs: 407208 -> 415522 (2.04%) threads in affected programs: 8716 -> 17030 (95.39%) helped: 4224 HURT: 67 total uniforms in shared programs: 3812698 -> 3711224 (-2.66%) uniforms in affected programs: 335170 -> 233696 (-30.28%) helped: 2816 HURT: 3551 total max-temps in shared programs: 2318430 -> 2159345 (-6.86%) max-temps in affected programs: 539991 -> 380906 (-29.46%) helped: 13173 HURT: 1440 total spills in shared programs: 49086 -> 5966 (-87.85%) spills in affected programs: 48306 -> 5186 (-89.26%) helped: 1655 HURT: 28 total fills in shared programs: 55810 -> 9328 (-83.29%) fills in affected programs: 54821 -> 8339 (-84.79%) helped: 1659 HURT: 22 LOST: 0 GAINED: 3 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>	2022-03-09 15:53:04 +00:00
Iago Toral Quiroga	e7a4e97076	nir/schedule: use larger delay for non-filtered memory reads This has been pending for a long time. It is not very consistent to add a significant delay for textures and not do it for UBOs, etc The reason we have not been doing this so far is the accumulated effect on register pressure for V3D as shown by shader-db results below, but from the point of view of a generic scheduler it makes sense to do this. Later patches will address V3D specific issues with register pressure derived from this by letting the driver control its instruction delay settings. total instructions in shared programs: 12662138 -> 13126587 (3.67%) instructions in affected programs: 1813091 -> 2277540 (25.62%) helped: 2410 HURT: 10499 total threads in shared programs: 415858 -> 407208 (-2.08%) threads in affected programs: 17348 -> 8698 (-49.86%) helped: 8 HURT: 4333 total uniforms in shared programs: 3711483 -> 3812698 (2.73%) uniforms in affected programs: 128012 -> 229227 (79.07%) helped: 3474 HURT: 2143 total max-temps in shared programs: 2138763 -> 2318430 (8.40%) max-temps in affected programs: 318780 -> 498447 (56.36%) helped: 588 HURT: 11997 total spills in shared programs: 3860 -> 49086 (1171.66%) spills in affected programs: 709 -> 45935 (6378.84%) helped: 23 HURT: 1595 total fills in shared programs: 5573 -> 55810 (901.44%) fills in affected programs: 1067 -> 51304 (4708.25%) helped: 23 HURT: 1595 LOST: 3 GAINED: 0 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>	2022-03-09 15:53:04 +00:00
Iago Toral Quiroga	9ef499b315	broadcom/compiler: stop moving UBO loads before NIR scheduling This doesn't have any significant impact on shader-db stats and would reduce our capacity to hide latency from the loads, so it is probably undesirable: total instructions in shared programs: 12663189 -> 12663186 (<.01%) instructions in affected programs: 4222 -> 4219 (-0.07%) helped: 9 HURT: 4 total uniforms in shared programs: 3711624 -> 3711629 (<.01%) uniforms in affected programs: 186 -> 191 (2.69%) helped: 0 HURT: 2 total max-temps in shared programs: 2138822 -> 2138857 (<.01%) max-temps in affected programs: 569 -> 604 (6.15%) helped: 1 HURT: 9 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15276>	2022-03-09 15:53:03 +00:00
Timur Kristóf	64acec0ef9	nir: Fix lowering terminology of compute system values: "from"->"to". This is to match other NIR terminology. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15103>	2022-03-08 17:36:31 +00:00
Iago Toral Quiroga	f761f8fd9e	broadcom/compiler: simplify node/temp translation during register allocation Now that we don't sort our nodes we can arrange them so we can easily translate between nodes and temps without a mapping table, just applying an offset. To do this we have a single array of nodes where we first put the nodes for accumulators and then the nodes for temps. With this setup we can ensure that for any given temp T, its node is always T + ACC_COUNT. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>	2022-03-02 08:09:11 +00:00
Iago Toral Quiroga	871b0a7f6a	broadcom/compiler: don't sort nodes for register allocation Nodes are allocated in order to registers so initially sorting was used to ensure that nodes with smaller life ranges would be assigned first and therefore be more likely to get accumulators. However, since `d81a6e5f1d` now we don't rely on order to make decisions about accumulators and instead we make policy decisions based on actual liveness, so sorting is no longer strictly relevant to this decision. Furthermore, we are not re-sorting nodes after each spill either, since that would probably require that we rebuild the interference graph after each spill (the graph identifies nodes by their index). Shader-db results show a significant improvement in instruction counts, due to more optimal accumulator assignments. The reason for this is that we use a round-robin policy for choosing the next accumulator to assign. The idea behind this is preventing nearby temps to be assigned to the same accumulator so that QPU scheduling is more flexible, but if we sort our nodes, we are basically not assigning temps in program order any more and the round-robin policy becomes less effective: total instructions in shared programs: 13000420 -> 12663189 (-2.59%) instructions in affected programs: 11791267 -> 11454036 (-2.86%) helped: 62890 HURT: 19987 total threads in shared programs: 415874 -> 415870 (<.01%) threads in affected programs: 20 -> 16 (-20.00%) helped: 2 HURT: 4 total uniforms in shared programs: 3711652 -> 3711624 (<.01%) uniforms in affected programs: 43430 -> 43402 (-0.06%) helped: 134 HURT: 173 total max-temps in shared programs: 2144876 -> 2138822 (-0.28%) max-temps in affected programs: 123334 -> 117280 (-4.91%) helped: 4112 HURT: 1195 total spills in shared programs: 3870 -> 3860 (-0.26%) spills in affected programs: 1013 -> 1003 (-0.99%) helped: 14 HURT: 12 total fills in shared programs: 5560 -> 5573 (0.23%) fills in affected programs: 1765 -> 1778 (0.74%) helped: 14 HURT: 17 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>	2022-03-02 08:09:11 +00:00
Iago Toral Quiroga	4483cd24af	broadcom/compiler: sink uniform loads total instructions in shared programs: 13014428 -> 13000420 (-0.11%) instructions in affected programs: 743624 -> 729616 (-1.88%) helped: 1392 HURT: 611 total threads in shared programs: 415858 -> 415874 (<.01%) threads in affected programs: 16 -> 32 (100.00%) helped: 8 HURT: 0 total uniforms in shared programs: 3720410 -> 3711652 (-0.24%) uniforms in affected programs: 113442 -> 104684 (-7.72%) helped: 635 HURT: 29 total max-temps in shared programs: 2154268 -> 2144876 (-0.44%) max-temps in affected programs: 61279 -> 51887 (-15.33%) helped: 1124 HURT: 187 total spills in shared programs: 4002 -> 3870 (-3.30%) spills in affected programs: 265 -> 133 (-49.81%) helped: 6 HURT: 0 total fills in shared programs: 5788 -> 5560 (-3.94%) fills in affected programs: 603 -> 375 (-37.81%) helped: 6 HURT: 0 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>	2022-03-02 08:09:11 +00:00
Iago Toral Quiroga	e228642cf5	broadcom/compiler: move constants before their first user For us they are basically uniforms too so we want to make their lifespans short to facilitate allocating them to accumulators. total instructions in shared programs: 13043585 -> 13015385 (-0.22%) instructions in affected programs: 8326040 -> 8297840 (-0.34%) helped: 24939 HURT: 19894 total threads in shared programs: 415860 -> 415858 (<.01%) threads in affected programs: 4 -> 2 (-50.00%) helped: 0 HURT: 1 total uniforms in shared programs: 3721953 -> 3720451 (-0.04%) uniforms in affected programs: 96134 -> 94632 (-1.56%) helped: 744 HURT: 435 total max-temps in shared programs: 2173431 -> 2154260 (-0.88%) max-temps in affected programs: 264598 -> 245427 (-7.25%) helped: 10858 HURT: 841 total spills in shared programs: 4005 -> 4010 (0.12%) spills in affected programs: 700 -> 705 (0.71%) helped: 5 HURT: 10 total fills in shared programs: 5801 -> 5817 (0.28%) fills in affected programs: 1346 -> 1362 (1.19%) helped: 6 HURT: 11 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>	2022-03-02 08:09:11 +00:00
Iago Toral Quiroga	a1998a9f43	broadcom/compiler: disallow TMU spills if max tmu spills is 0 If we are compiling with a strategy that does not allow TMU spills we should not allow spilling anything that is not a uniform. Otherwise the RA cost/benefit algorithm may choose to spill a temp that is not uniform and that will cause us to immediately fail the strategy and fallback to the next one, even if we could've instead chosen to spill more uniforms to compile the program successfully with that strategy. Some relevant shader-db stats: total instructions in shared programs: 13040711 -> 13043585 (0.02%) instructions in affected programs: 234238 -> 237112 (1.23%) helped: 73 HURT: 172 total threads in shared programs: 415664 -> 415860 (0.05%) threads in affected programs: 196 -> 392 (100.00%) helped: 98 HURT: 0 total uniforms in shared programs: 3717266 -> 3721953 (0.13%) uniforms in affected programs: 12831 -> 17518 (36.53%) helped: 6 HURT: 100 total max-temps in shared programs: 2174177 -> 2173431 (-0.03%) max-temps in affected programs: 4597 -> 3851 (-16.23%) helped: 79 HURT: 21 total spills in shared programs: 4010 -> 4005 (-0.12%) spills in affected programs: 55 -> 50 (-9.09%) helped: 5 HURT: 0 total fills in shared programs: 5820 -> 5801 (-0.33%) fills in affected programs: 186 -> 167 (-10.22%) helped: 5 HURT: 0 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>	2022-03-02 08:09:11 +00:00
Iago Toral Quiroga	cbb4d0dded	broadcom/compiler: increase cost of TMU spills to 10 Our cost was 5 which matches the number of instructions we have to add for a TMU spill (a fill is 4 instructions). Uniform spills on the other hand add an extra instruction for each fill and remove one instruction for the spill itself. These have a cost of 1. Therefore, if we have a single spill+fill, we end up with +9 instructions if it is a TMU spill and +0 instructions with a uniform spill, so making the former only 5 times more costly is probably not a good idea, and this is without even considering the added latency of the TMU accesses. Relevant shader-db changes show this causes a marginal instruction count increase in a few shaders but better thread counts and lower TMU spilling overall: total instructions in shared programs: 13037315 -> 13040711 (0.03%) instructions in affected programs: 370106 -> 373502 (0.92%) helped: 187 HURT: 321 total threads in shared programs: 415090 -> 415664 (0.14%) threads in affected programs: 574 -> 1148 (100.00%) helped: 287 HURT: 0 total uniforms in shared programs: 3706674 -> 3717266 (0.29%) uniforms in affected programs: 63075 -> 73667 (16.79%) helped: 40 HURT: 395 total max-temps in shared programs: 2176080 -> 2174177 (-0.09%) max-temps in affected programs: 15838 -> 13935 (-12.02%) helped: 316 HURT: 34 total spills in shared programs: 4247 -> 4010 (-5.58%) spills in affected programs: 2599 -> 2362 (-9.12%) helped: 107 HURT: 14 total fills in shared programs: 6121 -> 5820 (-4.92%) fills in affected programs: 3622 -> 3321 (-8.31%) helped: 108 HURT: 13 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15168>	2022-03-02 08:09:11 +00:00
Guilherme Gallo	d1c6185b5a	ci: skqp: Add Vulkan support for a630_skqp job This commit adds support for Vulkan backend on a630_skqp job. = Needed changes - Needed to install libvulkan-dev package on system - Refactored the way the available skqp reports are printed tested in development builds with skia tools Piglit expectations had to be updated in various drivers due to !14750 not having bumped the tags when it tried to uprev. Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14686>	2022-02-25 05:50:06 +00:00
Iago Toral Quiroga	cf99584f51	broadcom/compiler: move uniforms right before their first use after scheduling On V3D the quality of the code we generate is significantly affected by how we decide to assign accumulators during register allocation, which is determined by liveness, favoring short-lived temps. There are many shaders that end up doing a whole lot of uniform loads first, and using them later, which is very inconvenient for our register allocation process because this increases uniform liveness and causes us to use accumulators less efficiently, leading to significant churn. To fix this, we move uniforms right before their first use in the same block, but we need to do this after NIR scheduling, which means we are doing it in non-SSA form, since the scheduler has a tendency to undo this optimization and it is not easy to modify it to avoid it, since it works in more abstract terms, using instruction dependencies, estimated register pressure and instruction delay information to do its work, which are very different concepts.
total instructions in shared programs: 13316738 -> 13033613 (-2.13%) instructions in affected programs: 10389172 -> 10106047 (-2.73%) helped: 55442 HURT: 16144 total threads in shared programs: 413722 -> 415048 (0.32%) threads in affected programs: 1428 -> 2754 (92.86%) helped: 680 HURT: 17 total loops in shared programs: 1716 -> 1690 (-1.52%) loops in affected programs: 26 -> 0 helped: 26 HURT: 0 total uniforms in shared programs: 3704313 -> 3705181 (0.02%) uniforms in affected programs: 687730 -> 688598 (0.13%) helped: 2920 HURT: 7384 total max-temps in shared programs: 2364785 -> 2175190 (-8.02%) max-temps in affected programs: 1215387 -> 1025792 (-15.60%) helped: 49667 HURT: 1556 total spills in shared programs: 4241 -> 4248 (0.17%) spills in affected programs: 642 -> 649 (1.09%) helped: 11 HURT: 19 total fills in shared programs: 6115 -> 6125 (0.16%) fills in affected programs: 1276 -> 1286 (0.78%) helped: 11 HURT: 21 total sfu-stalls in shared programs: 34381 -> 36578 (6.39%) sfu-stalls in affected programs: 16055 -> 18252 (13.68%) helped: 3647 HURT: 5206 Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15056>	2022-02-24 11:36:00 +00:00
Iago Toral Quiroga	c4a78a2d2a	broadcom/compiler: fix register class patching for postponed spills If we have a postponed spill, the temp we create at ip is no longer the spilled temp and therefore is affected by the thrsw injection. Fixes corruption in the additive blending animation demo from Three.js. Fixes: `f3c3228522` ('broadcom/compiler: do not rebuild the interference graph after each spill') Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15112>	2022-02-22 11:17:10 +00:00
Erik Faye-Lund	25a37fabb7	vulkan/wsi: untangle buffer-images from prime Not all Vulkan implementations allow rendering to linear images, so in order to support scanning out from these on Windows we might have to copy through a buffer like we do in the PRIME path. To avoid reimplementing the same, let's instead generalize the code a bit so it doesn't have to specify any PRIME-specific details. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12210>	2022-02-22 10:04:34 +00:00
Iago Toral Quiroga	a4b164b57b	broadcom/compiler: only patch temps that existed before the current spill When we spill we add new temps. We should be careful not to access liveness for these until we have re-computed it after all spills and fills for the spilled temp have been processed so as to avoid out-of-bounds accesses to the c->temp_start and c->temp_end arrays. This fixes a crash in a Three.js demo when we try to patch register classes after a TMU spill that was caused because we would incorrectly try to patch the same temps we had just added for the spill itself, which is not only unnecessary but also incorrect since these temps would not have liveness information available yet and thus would cause out-of-bounds accesses. Fixes: `f3c3228522` ('broadcom/compiler: do not rebuild the interference graph after each spill') Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15107>	2022-02-22 06:41:51 +00:00
Jose Maria Casanova Crespo	90f966e05f	v3dv/v3d: Fix copyright holder to Raspberry Pi Ltd Acked-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15057>	2022-02-18 11:50:07 +01:00
Juan A. Suarez Romero	bfdb1064c5	vc4/ci: make piglit test mandatory Make piglit test jobs always run, as the piglit testsuite offers more coverage for the VC4 driver. On the other hand, make the EGL testing manual, as we don't have enough devices to execute all the tests fast enough. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Acked-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15045>	2022-02-18 09:02:55 +00:00
Iago Toral Quiroga	750eeecf4e	broadcom/compiler: document that spill_base is used for spills and scratch Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>	2022-02-18 08:38:19 +00:00
Iago Toral Quiroga	8883975209	broadcom/compiler: drop spill_count and add spilling boolean We added spill_count to handle uniform batch spills, which we no longer do. What we want now is a way to know if we are spilling registers. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>	2022-02-18 08:38:19 +00:00
Iago Toral Quiroga	f3c3228522	broadcom/compiler: do not rebuild the interference graph after each spill Instead, we only recompute liveness and we add new nodes and interferences to the graph manually (we also need to patch register classes in some cases). To assist in this process, we also add an ip counter to our instructions that we also recompute after each spill, which we use to identify registers that cross thrsw boundaries introduced with TMU spills and fills and adjust their register classes accordingly (removing their capacity to use accumulators). This significantly reduces the CPU cost of spills. Using shaders/closed/gputest/piano/7.shader_test as reference: Compile time up to the first successful compile strategy in main is ~24s and with this change it is ~11s. With this speed up, we can now try all 2-thread compile strategies (including the fallback scheduler) in only ~15s. A full shader-db run results in: Total CPU time (seconds): 9904.67 -> 9087.98 (-8.25%) Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>	2022-02-18 08:38:19 +00:00
Iago Toral Quiroga	59caaa7fb3	broadcom/compiler: reset spill/fill counts after lowering thread count. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>	2022-02-18 08:38:19 +00:00
Iago Toral Quiroga	92d819aaa0	broadcom/compiler: fix end of TMU sequence check We may be pipelining TMU writes and reads, in which case we can see both TMUWT and LDTMU at the end of a TMU sequence, so we should not assume that a TMUWT always terminates a sequence. Also, we had a bug where we were using inst instead of scan_inst to check if we find another TMUWT after the current instruction. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>	2022-02-18 08:38:19 +00:00
Iago Toral Quiroga	40e091267d	broadcom/compiler: define max number of tmu spills for compile strategies Instead of whether they are allowed to spill or not. This is more flexible. Also, while we are not currently enabling spilling on any 4-thread strategies, should we do that in the future, always prefer a 4-thread compile. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>	2022-02-18 08:38:19 +00:00
Iago Toral Quiroga	919aedbfec	broadcom/compiler: choose compile strategy with lowest spilling Until now we would only allow spilling as a last resort in the last 2 strategies, however, it is possible that in some cases earlier strategies may produce fewer spills if we allowed spilling on them. Likewise, the fallback scheduler can sometimes produce fewer spills than 2 threads with optimizations disabled. With this change, we start allowing all our 2-thread strategies to spill, and instead of choosing the first strategy that is successful, we choose the one that doesn't spill or the one with the least amount of spilling. It should be noted that this may incur a significant increase in compile times. We will address this in a follow-up patch. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15041>	2022-02-18 08:38:19 +00:00
Jason Ekstrand	05e9e7767d	vulkan: Rename vk_image_view::format to view_format When I originally added vk_image_view, I was overly clever when it came to the format field. I decided to make it only contain the bits of the format contained in the selected aspects. However, this is confusing (not generally a good thing) and it's also not always what you want. The Vulkan 1.3.204 spec says: "When using an image view of a depth/stencil image to populate a descriptor set (e.g. for sampling in the shader, or for use as an input attachment), the aspectMask must only include one bit, which selects whether the image view is used for depth reads (i.e. using a floating-point sampler or input attachment in the shader) or stencil reads (i.e. using an unsigned integer sampler or input attachment in the shader). When an image view of a depth/stencil image is used as a depth/stencil framebuffer attachment, the aspectMask is ignored and both depth and stencil image subresources are used." So, while the restricted format makes sense for texturing, it doesn't for when the image is being used as an attachment. What we probably actually want is both versions of the format. We'll call the one given by the VkImageViewCreateInfo vk_image_view::format and the restricted one vk_image_view::view_format. This is just the first commit which switches format to view_format so the compiler will make sure we get them all. The next commit will re-add vk_image_view::format but this time unmodified. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15007>	2022-02-16 00:14:50 +00:00
Juan A. Suarez Romero	801d33b2d9	vc4/ci: update failing piglit tests See https://gitlab.freedesktop.org/mesa/mesa/-/issues/6038. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15012>	2022-02-15 09:21:23 +01:00
Jason Ekstrand	29393a40ee	v3dv: Use the common command pool implementation The only interesting information stored in v3dv_cmd_pool is the list of command buffers and that's already tracked by vk_command_pool. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>	2022-02-11 08:06:25 +00:00
Jason Ekstrand	fcad979b72	v3dv: Don't use vk_alloc/free2 for command buffers The pool will always have a valid allocator. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>	2022-02-11 08:06:25 +00:00
Jason Ekstrand	bda4c4f6b6	vulkan: Take a vk_command_pool in vk_command_buffer_init() Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>	2022-02-11 08:06:25 +00:00
Jason Ekstrand	6fb9e4e7ff	v3dv: Use vk_command_pool Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>	2022-02-11 08:06:24 +00:00
Louis-Francis Ratté-Boulianne	5e263cc324	vulkan/runtime: Add a level field to vk_command_buffer Looks like 3 implementations already have that field in their private command_buffer struct, and having it at the vk_command_buffer opens the door for generic (but suboptimal) secondary command buffer support. Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14917>	2022-02-11 08:06:24 +00:00
Juan A. Suarez Romero	7955df28a6	v3dv/ci: Update failure list Add more failing tests to the expected failures. These are obtained after executing the full Vulkan CTS. v2: - Add comments in the tests (Alejandro) Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14948>	2022-02-09 15:46:23 +00:00
Melissa Wen	70a219d4a3	broadcom/simulator: enable multisync in the simulator Use drmSyncobjSignal to signal out_syncobjs when a GPU job submission ends in the simulator. With this, we can enable multisync support in the simulator and keep the multisync approach to process fence by submitting a serialized no-op job that adds the fence to the array of out syncobjs, i.e. syncobjs to be signaled in the kernel when a job completes (job post deps). Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14768>	2022-02-09 07:22:42 +00:00
Emma Anholt	ef112db311	ci: Bump VK-GL-CTS to 1.3.1.0. The main thing is VK 1.3 testing, but also includes test bugfixes. The 1.3 CTS required an uprev of deqp-runner to handle a new style of test output, and that deqp-runner brings in some neat new features, too (piglit in your deqp-runner suite, and extension list checking). A bunch of VK tests got renamed, so I replaced panvk's custom test list with simple include filters on the main test list. Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Acked-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> (panvk) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14920>	2022-02-08 22:16:36 +00:00
Emma Anholt	3f34251495	ci/broadcom: Remove unused v3dv xfails file. It's actually in broadcom-rpi4-fails.txt. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14920>	2022-02-08 22:16:35 +00:00
Dylan Baker	2f916f2be6	meson: add support for `meson devenv` with vulkan Meson devenv is a feature added in meson 0.58 (thus the feature is version guarded) that allows creating a shell environment with environment variables automatically set up for running the project inside the build dir. Some variables (such as LD_LIBRARY_PATH and PATH) are set automatically, others must be added by the project. For vulkan it is relatively simple, we create a new, uninstalled, icd file for each driver and set the VK_ICD_FILENAMES variable appropriately. This can be used with: ```sh meson devenv -C $builddir ``` then, vulkan applications will automatically use the uninstalled vulkan driver, no need to install. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14826>	2022-02-04 09:08:47 -08:00
Emma Anholt	a177f0de8f	ci: Uprev vulkan-cts to 1.2.8.0 This brings in some interesting new vulkan tests and fixes for the spurious KHR-GL TF failures. Also, reduces the runtime of dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.36 so that it should stop timing out. Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13779>	2022-02-03 22:41:23 +00:00
Alejandro Piñeiro	5d8c659678	v3d/drm-shim: remove drm-shim driver After starting to use a new version of the simulator, it got outdated. We made some initial effort to update it, but it was not working. Taking into account that no one is using it, it is better to just remove it. We keep the noop drm drivers, as they could have some value for developers that don't have access to the v3dv3 simulator. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14682>	2022-02-03 09:53:29 +00:00
Iago Toral Quiroga	7561ea8fa1	broadcom/compiler: allow ldunifa with read-only SSBOs Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14830>	2022-02-03 07:35:07 +00:00
Iago Toral Quiroga	0a8449b07c	broadcom/compiler: fix offset alignment for ldunifa when skipping The intention was to align the address to 4 bytes (32-bit), not 16 bytes. Fixes: `bdb6201ea1` ("broadcom/compiler: use ldunifa with unaligned constant offset") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14830>	2022-02-03 07:35:07 +00:00
Iago Toral Quiroga	ce99b1a746	v3dv: don't submit noop job if there is nothing to wait on or signal Also, do not unconditionally flag signaling for submits without any command buffers. Reviewed-by: Melissa Wen <mwen@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14802>	2022-02-01 07:28:46 +00:00
Melissa Wen	db9098f2ef	v3dv: move sems_info from event_wait job to wait_thread info Semaphores info was stored as an info of event_wait cpu jobs and this leads to a mem leak when the same event_wait job in the same cmd buffer batch is submitted more than once. As a result, `dEQP-VK.api.command_buffers.record_simul_use_primary` fails due to a double-free of sems_info. In this patch, we no longer use v3dv_event_wait_cpu_job_info to store semaphores from a submit info, since semaphores are related to a queue submission and not to the event_wait job type. If we spawn a wait_thread, we copy semaphores to an auxiliary struct (v3dv_wait_thread_info) that will be used in wait_thread to get job and semaphores information. When the spawned thread finishes, it releases the related v3dv_wait_thread_info and the semaphores copy as well. Fixes: `d5bd18fb` ("v3dv: store wait semaphores in event_wait_cpu_job_info") Suggested-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Melissa Wen <mwen@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14736>	2022-01-31 23:01:54 +00:00
Thomas H.P. Andersen	430b1157a1	broadcom: drop unused functions Fixes a clang warning about unused static inlined functions Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14790>	2022-01-31 16:10:31 +00:00
Iago Toral Quiroga	5974949c0d	v3dv: expose VK_KHR_depth_stencil_resolve Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14752>	2022-01-28 12:25:43 +00:00
Iago Toral Quiroga	668653f830	v3dv: fallback to blit resolve if render area is not aligned to tile boundaries Just as with all other TLB operations, we can only use the TLB if the render area is aligned to tile boundaries. If it is not, then the operation would overwrite pixels outside the render area, which is not allowed. In this case, we can't even emit a previous TLB load to fix this because the TLB has the multisampled attachment, not the resolve attachment, which is just a destination buffer for the tile store. Because the condition for tile alignment has to be determined for each subpass, we handle this by storing this information in the attachment state of the command buffer with the start of each subpass. We store whether the attachment is to be resolved and whether it can use the TLB (considering tile alignment restrictions). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14752>	2022-01-28 12:25:43 +00:00
Iago Toral Quiroga	7f87a1256e	v3dv: support resolving depth/stencil attachments This implements the bulk of VK_KHR_depth_stencil_resolve Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14752>	2022-01-28 12:25:43 +00:00
Vinson Lee	a97ec3eb13	v3dv: Add missing unlocks on errors. Fix defects reported by Coverity Scan. Missing unlock (LOCK) missing_unlock: Returning without unlocking. Fixes: `a7052dcf2c` ("v3dv: enable multiple semaphores for csd job") Fixes: `ad09e50129` ("v3dv: enable multiple semaphores for tfu job") Fixes: `ff8586c345` ("v3dv: enable multiple semaphores on cl submission") Signed-off-by: Vinson Lee <vlee@freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14663>	2022-01-28 04:15:24 +00:00
Iago Toral Quiroga	764c8867b0	v3dv: document why we don't expose VK_EXT_scalar_block_layout And since this is an optional feature in Vulkan 1.2, fill in the corresponding feature query while we are at it. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14731>	2022-01-27 07:34:19 +00:00
Iago Toral Quiroga	06220a28e7	v3dv: rework Vulkan 1.2 feature queries Fill them into a VkPhysicalDeviceVulkan12Features struct like we do for Vulkan 1.1, and then read them from there. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14731>	2022-01-27 07:34:19 +00:00
Iago Toral Quiroga	692e0dfe27	v3dv: implement VK_KHR_imageless_framebuffer Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14704>	2022-01-27 07:11:20 +00:00
Iago Toral Quiroga	2ee9487ad7	v3dv: drop signature of undefined function This is a left over from when we added multi-version support in the driver, where we turned this helper into a versioned scheme. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14704>	2022-01-27 07:11:20 +00:00
Emma Anholt	d041630a37	ci/llvmpipe,softpipe: Switch piglit testing to piglit-runner. The new runner reduces the runtime by about 1/3 thanks to using rust instead of python, and includes automatic flake handling so you don't just have to skip flaky tests. The wrapper script also includes IRC flake reporting (so one can track and update the flakes list to improve CI reliability), always uploading results to CI for review (so you can diagnose flakes and look at timings), has a prettier regressions report and a helpful timing report, and is the same as what's used by all the HW runners as well. The downside is that by dropping the massive list of skips, you no longer get flagged if Mesa refactors end up accidentally disabling extensions and thus making tests skip. For that, I've started on https://gitlab.freedesktop.org/anholt/deqp-runner/-/merge_requests/33 so that hardware drivers get extension checking coverage too. Thanks to the perf improvement, we get to drop one of the jobs for llvmpipe. xfail lists were mostly sed-jobs from the prior expectations lists. The exceptions to that you'll find in the form of whitespace around the affected test group (usually changes of capitalization or special-characters), or an explanation for the more interesting changes (which thankfully we can now record in the xfails lists!). Reviewed-by: Cristian Ciocaltea <cristian.ciocaltea@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14604>	2022-01-27 04:37:16 +00:00
Iago Toral Quiroga	f666f70935	v3dv: support VK_KHR_8bit_storage Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	5cec893384	broadcom/compiler: update comment on load_uniform fast-path The comment for 16-bit applies to 8-bit uniforms as well. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	296fde31aa	broadcom/compiler: allow vectorization to larger scalar type Allow vectorizing operations from a smaller bit-size into scalar operations of a larger bit-size. This allows us to turn 2x8-bit into an equivalent scalar 16-bit load/store. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	a248ff0b5b	broadcom/compiler: support 8-bit loads via ldunifa This generalizes the support we added for 16-bit to also handle 8-bit loads via ldunifa. The story is the same: we align the address to 32-bit downwards and we skip any bytes that are not of interest. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	4630f5f016	broadcom/compiler: handle to/from 8-bit integer conversions Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	1b530d948d	broadcom/compiler: support 8-bit general store access Just like with 16-bit, this mode only supports scalar access, but we are already lowering all non 32-bit accesses to scalar. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	84adf89d33	v3dv: expose storagePushConstant16 feature from VK_KHR_16bit_storage Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	f7ff462421	broadcom/compiler: support 16-bit uniforms Since ldunif is a 32-bit instruction we need to demote these to UBO loads, like we do for indirect indexing, with the exception of scalar 16-bit uniforms with an offset that is 32-bit aligned. For the exception where we can use ldunif we read a 32-bit slot from memory where the uniform data is in the lower 16-bit and we will read garbage in the upper 16-bit which we won't use anyway. It should be noted that by using ldunif, we are consuming 32-bit from the uniform stream, but this is fine because if there is valid uniform data in the upper 16-bit (i.e. we had an ivec2 uniform aligned to a 32-bit address), since we scalarize 16-bit loads, we would see another load uniform with an unaligned offset for the second component, which we will demote to UBO. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	4f26f50ae4	v3dv: support VK_KHR_16bit_storage Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	49a8fa152c	broadcom/compiler: support f32 to f16 RTZ and RTE rounding modes These are required by VK_KHR_16bit_storage. Our hardware, however, doesn't provide any mechanism to decide on the rounding mode of the conversion and it seems to be using RTE, so we implement RTZ in software. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	1f639d5310	broadcom/compiler: implement 32-bit/16-bit conversion opcodes Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	bdb6201ea1	broadcom/compiler: use ldunifa with unaligned constant offset If we know we have a load with a constant offset, then even if it is not aligned to 32-bit we can still produce an aligned offset and then skip over the bytes we don't need. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	2eb6910d96	broadcom/compiler: support ldunifa with some 16-bit loads Even though ldunifa is strictly 32-bit we may be able to use it to load 16-bit values that sit at 32-bit aligned addresses. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	2a420bdf92	broadcom/compiler: lower packing after vectorization The vectorization pass can inject 32_2x16 (un)packing opcodes upon successful vectorization of 16-bit operations into 32-bit counterparts, so make sure we lower these to something our backend can handle. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	4b24373137	broadcom/compiler: implement TMU general 16-bit load/store This allows us to implement 16-bit access on uniform and storage buffers. Notice that V3D hardware can only do general access on scalar 16-bit elements, which we currently enforce by running a lowering pass during shader compile. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	2443e45e76	broadcom/compiler: better document vectorization implications Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	765d9feb46	broadcom/compiler: add lowering pass to scalarize non 32-bit general load/store V3D hardware doesn't support vector access for general TMU load/store operations like the ones we use for UBO and SSBO, so we need to split these to scalar operations. It should be noted that we also have a vectorization pass (which runs later, during optimization), that may reconstruct some of these into 32-bit operations when possible (i.e. when the resulting operation is 32-bit aligned). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14648>	2022-01-25 09:08:26 +00:00
Iago Toral Quiroga	a6aa35a091	v3dv: implement VK_KHR_driver_properties Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14680>	2022-01-24 13:56:39 +00:00
Iago Toral Quiroga	be11948a95	broadcom/simulator: handle DRM_V3D_PARAM_SUPPORTS_MULTISYNC_EXT Without this the simulator wrapper will abort upon seeing this query, rendering the driver unusable in that context. Also, it seems the simulator environment doesn't quite work with multisync at present, so do not enable it until we figure out what the issue is. Reviewed-by: Melissa Wen <mwen@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14678>	2022-01-24 11:34:00 +00:00
Alejandro Piñeiro	275a18322d	v3dv: check correct format when load/storing on a depth/stencil buffer When we create an image view with D24S8 format we reformat it to RGBA8UI if the aspect selected is just STENCIL. But when configuring the stores we select the aspects based on the attachment format. Quoting from cmd_buffer_render_pass_emit_stores: /* From the Vulkan spec, VkImageSubresourceRange: * * "When an image view of a depth/stencil image is used as a * depth/stencil framebuffer attachment, the aspectMask is ignored * and both depth and stencil image subresources are used." * * So we ignore the aspects from the subresource range of the image * view for the depth/stencil attachment, but we still need to restrict * the to aspects compatible with the render pass and the image. */ const VkImageAspectFlags aspects = vk_format_aspects(ds_attachment->desc.format); So we could end up trying to store on a Z+Stencil buffer, using a RGBA8UI format. So far this only affected some tests when using the simulator (assert). Those were working on the real hw, but probably would fail in other situations, so let's use the original image format in that case. v2 (Iago) * Improve comment grammar * Do the same on load too (not just store) v3 (Iago) * Re-word comments. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14635>	2022-01-21 13:24:18 +00:00
Alejandro Piñeiro	5d04b76c09	v3dv: remove unused v3dv_descriptor_map_get_texture_format Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14635>	2022-01-21 13:24:18 +00:00
Melissa Wen	9319ffb53d	v3dv: signal fence when all submitted jobs complete execution We track last submitted jobs by queue type. After all cmd buffer batches have been submitted, we emit a noop job that waits for all jobs submitted to each GPU queue to complete and signals the fence. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	bce77e758a	v3dv: process signal semaphores in the very last job With multiple semaphores support, we can use a GPU job to handle multiple signal semaphores at the end of a cmd buffer batch. This means the last job in the last cmd buffer will be in charge of signalling semaphores as long as it meets some conditions: 1 - A GPU-job signals semaphores whenever we only have submitted jobs for the same queue (there is no syncobj created for any other type). Otherwise, we emit a noop job that waits on the completion of all jobs submitted and then signals semaphores. 2 - A CPU-job is never in charge of signalling semaphores. We process it first and emit a noop job that depends on all jobs previously submitted to signal semaphores. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	0ab98612ef	v3dv: handle wait semaphores in the first job by queue With multiple semaphore support, we can improve the way we handle wait semaphores considering different job types and cmd buffer batch scenarios, that means: - A GPU job depends on wait semaphores whenever it is the first job submitted to a queue in this command buffer batch (the `first` flag for the job's queue type is set). - For the first CPU job, if there are wait semaphores, we should wait for the CPU and GPU being idle to process the job. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	03a6a82740	v3dv: track submitted jobs by GPU queue type The order in which a GPU job is scheduled is guaranteed within the same queue type (CL, TFU, CSD), but the order of completion of jobs from different queues cannot be guaranteed. Since we have multiple semaphores support now, we can track the completion of the last job submitted to each queue and therefore better determine when the GPU is idle. We do it using an array of syncobj (last_job_syncs) for each GPU queue (CL, TFU, CSD). With this, job serialization also becomes more accurate. We also keep tracking the very last job submitted (last_job_sync became an element of the last_job_syncs array as V3DV_QUEUE_ANY) for the case we don't have multisync support. To help in handling wait semaphores, we set a flag per queue to indicate we are starting a new cmd buffer batch and a job submitted to this queue will be the first. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	fd973218a6	v3dv: enable GPU jobs to signal multiple semaphores In addition to keeping a copy of wait semaphores, extend v3dv_submit_info_semaphores to hold a copy of signal semaphores too. With a copy of wait and signal semaphores, we can enable GPU jobs to handle more than one wait and signal semaphore. For now, we don't change the way we signal semaphores when all jobs complete, i.e., we still use the master thread to signal semaphores. In this context, no GPU job is actually in charge of signalling, but the support for multiple signal semaphores is done here. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	a7052dcf2c	v3dv: enable multiple semaphores for csd job Whenever the v3d kernel-driver supports the multisync extension, use it to allow adding multiple semaphores as csd job dependencies. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	ad09e50129	v3dv: enable multiple semaphores for tfu job Whenever the v3d kernel-driver supports the multisync extension, use it to enable more than one semaphore in a tfu job. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	ff8586c345	v3dv: enable multiple semaphores on cl submission Whenever the v3d kernel-driver supports the multisync extension, use it to enable more than one semaphore in cl submission. In CL, we have two kinds of jobs (bin and render), therefore we also need to determine the stage to sync, that is, where to add job dependencies/wait semaphores. Also, as we currently process all signal semaphores of a cmd buffer batch together in the submit master thread (when the last wait thread completes), there is currently no situation in which GPU jobs need to handle signal VkSemaphores. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	85c49db10d	v3dv: check multiple semaphores capability Check if kernel-driver supports multisync extension Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	d5bd18fbb3	v3dv: store wait semaphores in event_wait_cpu_job_info Instead of a boolean (sem_wait) in v3dv_event_wait_cpu_job_info, that is used to determine wait condition for jobs put in a wait thread, copy the wait semaphores data and store it as struct v3dv_submit_info_semaphores. In the following patches we enable multiple semaphores in GPU jobs, and therefore we need this data to add wait semaphores as job dependencies for pending jobs submitted from a wait thread. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	d148379edf	v3dv: wrap wait semaphores info in v3dv_submit_info_semaphores Instead of passing pSubmit to queue_submit_cmd_buffer, create a struct v3dv_submit_info_semaphores to wrap semaphores data from VkSubmitInfo. In the next commit, this struct will help to handle the wait condition for jobs submitted in a wait event context, since we need to hold this data when handling wait events and pass it to queue_submit_job() called from wait threads. The main goal is to allow multiple wait semaphores in a job submission. Later, this struct will be extended to include a copy of signal semaphores too. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Melissa Wen	09991fc47b	v3dv: drop unused variable on handle_set_event_cpu_job is_wait_thread is passed, but not actually used; and cpu_queue_handle_idle is in charge to handle wait threads spawned before this one. Signed-off-by: Melissa Wen <mwen@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13178>	2022-01-21 10:59:17 +00:00
Charles Giessen	6ea7a61d7a	v3dv: Update LoaderICDInterfaceVersion to v5 With the proper version checking in the common vulkan instance code (commit `88b9b68`) it is now possible to bring the reported interface version up to v5. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14563>	2022-01-20 07:25:07 +00:00
Dave Airlie	ccbf700d6c	nir: remove gl.h include from nir headers. This saves a lot of pointless gl.h includes across the board, it moves the one place that needs GLenum into a separate file only used in those passes that require it. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>	2022-01-19 21:54:58 +00:00
Dave Airlie	1352e0ba0c	mesa/*: add a shader primitive type to get away from GL types. This creates an internal shader_prim enum, I've fixed up most users to use it instead of GL types. don't store the enum in shader_info as it changes size, and confuses other things. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>	2022-01-19 21:54:58 +00:00
Chia-I Wu	37fa59fa6c	anv,lavapipe,v3dv: use wsi_common_get_image Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (anv) Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> (v3dv) Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> (lavapipe) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14544>	2022-01-14 17:41:42 +00:00
Danylo Piliaiev	fa75b2a027	vulkan/wsi: create a common function to compare drm devices Effectively moves most of v3dv_wsi_can_present_on_device to the common code to be used in other drivers. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11091>	2022-01-14 12:19:57 +00:00
Iago Toral Quiroga	b9f9474577	v3dv: implement double-buffer mode Double buffer mode splits the tile buffer size in half so we can start processing the next tile while the current one is being stored to memory. This mode is available only if MSAA is not enabled and can, in theory, improve performance by reducing tile store overhead. However, it comes at the cost of reducing the tile size, which also causes some overhead of its own. Testing shows that this helps some cases (e.g. the Vulkan Quake ports) but hurts others (e.g. Unreal Engine 4), so for the time being we don't enable this by default but we allow enabling it selectively by using V3D_DEBUG. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14551>	2022-01-14 10:57:26 +00:00
Iago Toral Quiroga	fbe4d7ccf4	v3dv: implement VK_EXT_4444_formats Because these formats are introduced through an extension, their enum values are exceedingly large and we cannot use them to index directly into the format table we had for core formats. Instead, we put these in a separate table and we always use the VK_ENUM_OFFSET helper to index into these tables. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14533>	2022-01-14 10:10:10 +00:00
Iago Toral Quiroga	25c46c465d	v3dv: handle formats with reverse flag Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14533>	2022-01-14 10:10:10 +00:00
Iago Toral Quiroga	872f08815b	v3dv: add swizzle helpers to identify formats with R/B swap and reverse flags Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14533>	2022-01-14 10:10:10 +00:00
Charles Giessen	5a37cc1186	v3dv: Update LoaderICDInterfaceVersion to v4 vk_icdNegotiateLoaderICDInterfaceVersion now correctly identifies the driver as supporting v4. Before, the driver did support the functionality but didn't report supporting it through the negotiate function. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14299>	2022-01-14 10:25:50 +01:00
Alejandro Piñeiro	d7cbe17760	v3dv: simplify v3dv_debug_ignored_stype mesa_logd already handles having or not DEBUG defined, and also has a better empty option. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14512>	2022-01-13 13:18:05 +00:00
Christian Gmeiner	41179b665b	broadcom/ci: use .test-manual-mr Allow the jobs to be available for MRs. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14361>	2022-01-12 23:19:22 +00:00
Christian Gmeiner	6e08d8fc3d	ci: Uprev piglit to af1785f31 Brings in these changes: af1785f31 occlusion_query_conform: skip GetQueryCounterBits test if needed dad078717 occlusion_query_conform: convert to pilgit subtests b52c1c761 glsl-1.30: test nested preprocessor concat 6c4da153b texture-storage: Fix subtest result handling of skips. 4343f19db fbo-integer: Remove the invalid DrawPixels test. e3842f2fe arb_dsa: exclude stencil8 textures from test sets. ce8649be7 spec/ext_external_objects: Fix build on Debian systems 4e553838f glsl: add basic tests for desktop GLSL invariant qualifier linking 7e61e5199 Tests for variable in and out of loop scope f855ad1c8 fbo-mrt-alphatest: Only require GLSL 1.20 9be2fe999 glx: add glx-multi-display-single-pbuffer test bfe290725 glx: add glx-swap-pbuffer test efa64335e framework: Fix build on Windows when using waffle Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14468>	2022-01-10 21:52:42 +00:00
Jason Ekstrand	ca1d0333db	v3dv: Use the common QueueSignalReleaseImageANDROID from RADV This is an actual functional change as we now plumb through the sync FD instead of doing a vkQueueSubmit and trusting in implicit sync. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Tested-by: Roman Stratiienko <r.stratiienko@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14372>	2022-01-05 16:36:10 +00:00
Jason Ekstrand	dfb1e1777c	anv,radv,v3dv: Move AcquireImageANDROID to common code All three implementations are identical. Reviewed-by: Eric Engestrom <eric@engestrom.ch> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Tested-by: Tapani Pälli <tapani.palli@intel.com> Tested-by: Roman Stratiienko <r.stratiienko@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14372>	2022-01-05 16:36:10 +00:00
Iago Toral Quiroga	44fa8304d4	v3dv: add a refcount mechanism to BOs Until now we have lived without a refcount mechanism in the driver because in Vulkan the user is responsible for handling the life span of memory allocations for all Vulkan objects, however, imported BOs are tricky because the kernel doesn't refcount so user-space needs to make sure that: 1. When importing a BO into the same device used to create it (self-importing) it does not double free the same BO. 2. Frees imported BOs that were not allocated through the same device. Our initial implementation always freed BOs when requested, so we handled 2) correctly but not 1) and we would double-free self-imported BOs. We tried to fix that in commit `d809d9f3` but that broke 2) and we started to leak BOs for some imports. This fixes the problem for good by adding refcounts to BOs so that self-imported BOs have a refcnt > 1 and are only freed when all references are freed. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5769 Tested-by: Roman Stratiienko <r.stratiienko@gmail.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14392>	2022-01-05 12:22:45 +00:00
Thomas H.P. Andersen	c32c9014f5	broadcom/compiler: fix compile warning -Wabsolute-value fixes a compile warning with clang Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14302>	2022-01-03 20:20:37 +00:00
Roman Stratiienko	2686c5419d	v3dv: add Android support Acknowledgements to android-rpi team and lineage-rpi maintainer (KonstaT) for creating/testing initial vulkan support. Their experience was used as a baseline for this work. Most of the code is a copy of turnip and anv. Improved by cleaning dEQP failures: - Improved gralloc support (use allocation time stride, size, modifier). - Fixed some dEQP crashes due to memory allocation issues. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Signed-off-by: Roman Stratiienko <r.stratiienko@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14016>	2021-12-21 09:24:43 +00:00
Juan A. Suarez Romero	bc11dc7187	broadcom/ci: restructure expected results Sort/rename the files so expected tests are classified by device. No need to split the tests by driver (e.g., V3D vs V3DV). Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13983>	2021-12-17 09:15:34 +00:00
Pierre-Eric Pelloux-Prayer	1cb5c1775b	glx: fix querying GLX_FBCONFIG_ID for Window This commit fixes apps using the following sequence: 1. XCreateWindow(dpy) -> win 2. glXCreateContextAttribsARB(dpy, ...) -> ctx 3. glXMakeCurrent(dpy, win, ctx) 4. glXQueryDrawable(dpy, win, GLX_FBCONFIG_ID, ...) glXQueryDrawable returned 0 (while correctly returning a valid GLX_FBCONFIG_ID for other types of drawables). This commit adds the same dance as driInferDrawableConfig to get the GLX visual from the Window, and then the GLX_FBCONFIG_ID of this visual. This fixes: * piglit: glx-query-drawable --attr=GLX_FBCONFIG_ID --type=WINDOW * Maya which uses the config ID from step 4 as an input to glXChooseFBConfig. Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14174>	2021-12-16 01:21:36 +00:00
Alejandro Piñeiro	1c4f76672d	broadcom/compiler: avoid unneeded sint/unorm clamping when lowering stores They are being used on integer to integer stores. From the Vulkan spec, final paragraph of 16.4.4 "Texel Output Format Conversion": "Each component is converted based on its type and size (as defined in the Format Definition section for each VkFormat). ... Integer outputs are converted such that their value is preserved. The converted value of any integer that cannot be represented in the target format is undefined." I didn't find an equivalent quote for OpenGL as all conversion entries are focused on float to integer, fixed-point to integer, etc, and not on integer to integer. Didn't find any test failure with this change. We didn't get any shader-db stats change with shaderdb (even overriding to OpenGL 4.4 to get more shaders built), so as a reference Vulkan shader-db stats with the pattern dEQP-VK.image..with_format..* total instructions in shared programs: 37534 -> 36522 (-2.70%) instructions in affected programs: 12080 -> 11068 (-8.38%) helped: 241 HURT: 0 Instructions are helped. total uniforms in shared programs: 9100 -> 8550 (-6.04%) uniforms in affected programs: 3004 -> 2454 (-18.31%) helped: 229 HURT: 0 total max-temps in shared programs: 6110 -> 6014 (-1.57%) max-temps in affected programs: 402 -> 306 (-23.88%) helped: 43 HURT: 0 Max-temps are helped. total nops in shared programs: 1523 -> 1526 (0.20%) nops in affected programs: 21 -> 24 (14.29%) helped: 3 HURT: 6 Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14194>	2021-12-15 11:53:20 +00:00
Roman Stratiienko	2cbbfd23ce	v3dv: Hotfix: Rename remaining V3DV_HAS_SURFACE->V3DV_USE_WSI_PLATFORM This was somehow missed by me and during review. Fixes fcfc4ddfccd5: ("v3dv: Fix V3DV_HAS_SURFACE preprocessor condition") Signed-off-by: Roman Stratiienko <roman.o.stratiienko@globallogic.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14190>	2021-12-14 10:33:28 +00:00
Iago Toral Quiroga	2630c8f546	broadcom/compiler: improve thrsw merge Instead of stopping the merge process when we find an instruction with an incompatible signal (such as an small immediate), keep going and see if we can merge the thrsw in a previous instruction that is compatible. total instructions in shared programs: 13409835 -> 13356648 (-0.40%) instructions in affected programs: 3556860 -> 3503673 (-1.50%) helped: 17457 HURT: 18 Instructions are helped. total max-temps in shared programs: 2353971 -> 2352956 (-0.04%) max-temps in affected programs: 13960 -> 12945 (-7.27%) helped: 703 HURT: 0 Max-temps are helped. total spills in shared programs: 12301 -> 12301 (0.00%) total sfu-stalls in shared programs: 32596 -> 32499 (-0.30%) sfu-stalls in affected programs: 225 -> 128 (-43.11%) helped: 79 HURT: 3 Sfu-stalls are helped. total nops in shared programs: 347204 -> 325234 (-6.33%) nops in affected programs: 99834 -> 77864 (-22.01%) helped: 11515 HURT: 158 Nops are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14172>	2021-12-14 09:50:17 +00:00
Roman Stratiienko	fcfc4ddfcc	v3dv: Fix V3DV_HAS_SURFACE preprocessor condition Currently V3DV_HAS_SURFACE is always defined. There is no WSI for Android in mesa3d, therefore WSI related extensions should not be exposed. 1. Define V3DV_HAS_SURFACE only for platforms which have WSI implemented. 2. Rename V3DV_HAS_SURFACE -> V3DV_USE_WSI_PLATFORM to align naming with other platforms. Fixes dEQP-VK.wsi.android.surface#query_protected_capabilities Fixes: `79e4451430` ("v3dv: move extensions table to v3dv_device") Signed-off-by: Roman Stratiienko <roman.o.stratiienko@globallogic.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14144>	2021-12-13 07:11:20 +00:00
Khem Raj	249556dad8	v3dv: account for 64bit time_t on 32bit arches This makes it a bit more portable, especially on 32bit architectures with 64bit time_t defaults. Especially on musl it's a must. Fixes ../mesa-21.3.0/src/broadcom/vulkan/v3dv_bo.c:71:15: error: format specifies type 'long' but the argument has type 'time_t' (aka 'long long') [-Werror,-Wformat] time.tv_sec); ^~~~~~~~~~~ Also reported here [1] [1] https://github.com/agherzan/meta-raspberrypi/issues/969 Signed-off-by: Khem Raj <raj.khem@gmail.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14118>	2021-12-10 22:26:56 +00:00
Roman Stratiienko	72db15913f	v3dv: Fix dEQP-VK.info#instance_extensions test When mesa3d is built without VK_USE_PLATFORM_DISPLAY_KHR definition, dEQP test fails: dEQP : Test case 'dEQP-VK.info.instance_extensions'.. dEQP : Fail (Extension VK_KHR_get_display_properties2 is missing dependency: VK_KHR_display) dEQP : DONE! Enable KHR_get_display_properties2 only if VK_USE_PLATFORM_DISPLAY_KHR is enabled. Fixes: `f884c2e3be` ("v3dv: implement VK_KHR_get_display_properties2") Signed-off-by: Roman Stratiienko <roman.o.stratiienko@globallogic.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14047>	2021-12-08 21:34:47 +00:00
Juan A. Suarez Romero	11287475c8	v3d: enable ARB_texture_view v2 (Iago): - Add comments to failing tests Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>	2021-12-03 15:32:36 +00:00
Alejandro Piñeiro	7f1525f086	v3d: enable ARB_texture_buffer_object and ARB_texture_buffer_range Through their specific PIPE_CAP. v2 (Iago) - Add comment in test failure Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>	2021-12-03 15:32:36 +00:00
Juan A. Suarez Romero	fd47c939f4	st/pbo: add the image format in the download FS In the V3D driver there is a NIR lowering step for `image_store` intrinsic, where the image store format is required for doing the proper lowering. Thus, let's define it for the download FS instead of keeping it as NONE. v2 (Ilia) - Use format only for drivers not supporting format-less writing. v4 (Ilia): - Use PIPE_CAP_IMAGE_STORE_FORMATTED to reduce combinations. v5 (Ilia): - Use indirect array for download FS in drivers without formatless-store support. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13409>	2021-12-03 15:32:36 +00:00
Iago Toral Quiroga	cc7db1fc53	broadcom/compiler: improve documentation for Z writes Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>	2021-12-03 10:39:08 +00:00
Iago Toral Quiroga	d7b79f3531	v3d,v3dv: don't disable EZ for passthrough Z writes The early-Z test uses Z values produced from FEP, so when we write Z from a shader we need to disable EZ. However, there are some instances where we want to write the FEP-Z from the shader, in which case we would not need to disable EZ. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>	2021-12-03 10:39:08 +00:00
Iago Toral Quiroga	a65c605365	broadcom/compiler: track passthrough Z writes In some cases we need to make the shaders write the Z value produced from rasterization (FEP). Track these instances because they are relevant to early EZ setup. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>	2021-12-03 10:39:08 +00:00
Iago Toral Quiroga	6d4a645c90	broadcom/compiler: emit passthrough Z write if shader reads Z Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14037>	2021-12-03 10:39:08 +00:00
Qiang Yu	fcc062235c	ci: remove egl-copy-buffers from fail list egl-copy-buffers test has been fixed for dri3. So remove it from broadcom and freedreno ci fail list to prevent the gitlab ci test fail: spec@egl 1.4@egl-copy-buffers,UnexpectedPass Also remove it from radeonsi ci fail list since I verified on radeonsi. Acked-by: Daniel Stone <daniels@collabora.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Signed-off-by: Qiang Yu <yuq825@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13868>	2021-11-30 01:58:42 +00:00
Ilia Mirkin	e31d08d307	ci: move windowoverlap exclusion to all-skips The test is just plain not built by our containers. Skip it everywhere. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13919>	2021-11-29 18:08:49 -05:00
Iago Toral Quiroga	996f147fef	broadcom/compiler: relax restriction on VPM inst in last thread end slot According to the documentation, only vpmwt is disallowed in the last delay slot of the thread end. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13975>	2021-11-29 14:06:43 +00:00
Iago Toral Quiroga	6923dd687c	broadcom/compiler: allow color TLB writes in last instruction Only Z writes are disallowed. total instructions in shared programs: 11578449 -> 11577369 (<.01%) instructions in affected programs: 38132 -> 37052 (-2.83%) helped: 1080 HURT: 0 Instructions are helped. total max-temps in shared programs: 2334416 -> 2334395 (<.01%) max-temps in affected programs: 218 -> 197 (-9.63%) helped: 21 HURT: 0 Max-temps are helped. total inst-and-stalls in shared programs: 11607890 -> 11606810 (<.01%) inst-and-stalls in affected programs: 38265 -> 37185 (-2.82%) helped: 1080 HURT: 0 Inst-and-stalls are helped. total nops in shared programs: 338316 -> 337236 (-0.32%) nops in affected programs: 2625 -> 1545 (-41.14%) helped: 1080 HURT: 0 Nops are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13964>	2021-11-29 06:44:07 +00:00
Alejandro Piñeiro	a9b4aef0f2	broadcom/compiler: make shaderdb debug output compatible with shaderdb's report tool Even though the option is called shaderdb, it is not really used by shaderdb (for V3D shaderdb uses the debug option "precompile"). And in fact, right now the output format is not compatible with shaderdb. This commit tries to fix that, and as we are here, also try to make the option more useful for the Vulkan case, as that debug option also works with v3dv. We can't really fully imitate shaderdb use with OpenGL (run with a set of glsl shader tests), but we can at least assign a unique name (the pipeline sha1 in text format) so we can compare executions of the same vulkan application. For that remember to disable the on-disk cache. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13938>	2021-11-24 13:02:08 +00:00
Iago Toral Quiroga	79dee14cc2	broadcom/compiler: don't move ldvary earlier if current instruction has ldunif If we did, we would have the instruction coming right after ldvary write to the same implicit destination as ldvary at the same time. We prevent this when merging instructions, but we should make sure we prevent this when we move ldvary around for pipelining too. Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13921>	2021-11-23 10:52:24 +00:00
Iago Toral Quiroga	7fec4f4135	broadcom/compiler: fix scoreboard locking checks According to the spec the hardware locks the scoreboard on the first or last thread switch (selected via shader state) and any TLB accesses executed before this are not synchronized by hardware. This change updates the logic to ensure we respect this requirement and that we don't assume that the lock is acquired automatically on the first TLB access, which is not valid at least since V3D 4.1+. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13910>	2021-11-22 12:53:43 +00:00
Iago Toral Quiroga	bd7584c16b	broadcom/compiler: don't allow RF writes from signals after thrend Writes to physical registers are not allowed after thread end. We were checking this for ALU writes, but we need to check it for signal writes too. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13910>	2021-11-22 12:53:43 +00:00
Juan A. Suarez Romero	457dbb81f5	broadcom/compiler: apply constant folding on early GS lowering This solves a case where a NIR geometry shader was storing the output in a non-constant: vec4 32 ssa_1 = load_const (0xc0800000 /* -4.000000 */, 0xc1100000 /* -9.000000 */, 0x40400000 /* 3.000000 */, 0x40e00000 /* 7.000000 */) vec1 32 ssa_7 = load_const (0x00000000 /* 0.000000 */) vec1 32 ssa_8 = load_const (0x00000001 /* 0.000000 */) vec1 32 ssa_9 = iadd ssa_7, ssa_8 vec1 32 ssa_19 = mov ssa_1.x intrinsic store_output (ssa_19, ssa_9) (1, 1, 0, 160, 288) /* base=1 */ /* wrmask=x */ /* component=0 */ /* src_type=float32 */ /* location=32 slots=2 gs_streams(x=0 y=0 z=0 w=0) */ When lowering the VPM output we check if the destination (ssa_9 in this case) is a constant to add to the VPM offset. We run a constant folding optimization in an earlier VS lowering, and we should do the same for GS. This fixes multiple dEQP-VK.pipeline.interface_matching.* failures. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13884>	2021-11-22 09:32:50 +00:00
Juan A. Suarez Romero	7b21635057	broadcom/compiler: handle array of structs in GS/FS inputs While fragment and geometry shader were handling structs as inputs, they weren't doing it for arrays of structures. This fixes multiple dEQP-VK.pipeline.interface_matching.* failures and assertions. v2: - Fix style (Iago). Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13884>	2021-11-22 09:32:50 +00:00
Alejandro Piñeiro	ff89dc3523	vulkan: move common format helpers to vk_format v3dv, radv, and turnip are using several C&P format helpers (most of them wrappers over util_format_description based helpers). This commit moves the common helpers to the already existing common vk_format.h. For the case of v3dv we were able to remove the vk_format header. For turnip and radv, a local vk_format.h header remains, with methods that are only used for those drivers. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13858>	2021-11-19 12:23:19 +01:00
Iago Toral Quiroga	5e536c97a9	broadcom/compiler: fix early fragment tests setup When early fragment tests are mandated by the shader, we must use the Z value produced by the FEP even if there are elements that would typically require late fragment tests (such as discards, sample to coverage, etc). This change means we also need to be a bit more careful when we promote shaders to use early fragment tests so we don't promote anything with discards for example. Fixes: dEQP-VK.fragment_operations.early_fragment.discard_early_fragment_tests_depth dEQP-VK.fragment_operations.early_fragment.discard_early_fragment_tests_stencil Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13837>	2021-11-18 07:39:32 +00:00
Connor Abbott	508f917d8c	util/dag: Make edge data a uintptr_t Nobody was actually using it as a pointer, and I'm going to introduce a shared function which relies on it not being a pointer so let's fix this once and for all. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>	2021-11-17 13:41:47 +00:00
Alejandro Piñeiro	cbf0d83eac	v3d,v3dv: move TFU register definition to a common header We are using the same definitions for both OpenGL and Vulkan, so let's move it to common. As we are here we are also adding versioning on the TFU register definition. Those are basically register bit places, so really likely to change between versions. Adding 33 as it is the first version they got defined. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13832>	2021-11-17 11:04:31 +01:00
Iago Toral Quiroga	836a4b5836	v3dv: fix internal bpp of D/S formats Depth/stencil formats can, at worst (d32/d24s8), be exactly 32bpp, which is the minimum we can program for the internal format. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13816>	2021-11-17 06:57:48 +00:00
Iago Toral Quiroga	f384c763fc	v3d,v3dv: move tile size calculation to a common helper We had this code replicated in 3 places across both drivers. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13790>	2021-11-15 11:40:39 +00:00
Iago Toral Quiroga	7490bcad37	v3dv: don't use a global constant for default pipeline dynamic state Some of these may change across V3D versions, so it is not practical. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13775>	2021-11-12 11:04:07 +00:00
Iago Toral Quiroga	4b3931ee6c	v3dv: account for multisampling when computing subpass granularity The granularity is defined by the tile size, which is also determined by multisampling. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13775>	2021-11-12 11:04:07 +00:00
Iago Toral Quiroga	0cb58f80d2	v3d: use V3D_MAX_DRAW_BUFFERS instead of hardcoded constant Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13775>	2021-11-12 11:04:07 +00:00
Christian Gmeiner	a0634a3c85	ci/bare-metal: switch to common .baremetal-test-arm64 Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13751>	2021-11-12 08:22:29 +00:00
Christian Gmeiner	8bc284fe5b	ci/bare-metal: armhf: move BM_ROOTFS to generic place Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13751>	2021-11-12 08:22:29 +00:00
Emma Anholt	a68a0c9e1c	mesa/st: Disable NV_copy_depth_to_color on non-doubles-capable HW. The previous doubles check (https://gitlab.freedesktop.org/mesa/mesa/-/issues/3459) checked that you didn't have full doubles emulation turned on, but we also need to check that you have doubles at all (emulated or not) or non-GL4 drivers will fail. Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13743>	2021-11-11 16:38:58 +00:00
Neil Roberts	bdaf185889	v3d: Update prim_counts when prims generated query in flight without TF In order to implement GL_PRIMITIVES_GENERATED, v3d allocates a small resource and adds a command to the job to store the prim counts to it. However it was only doing this when TF was enabled which meant that if the query was used with a geometry shader but no TF then the query would always be zero. This patch makes the driver keep track of how many PRIMITIVES_GENERATED queries are in flight and then enable writing the prim count if it's more than zero. Fix dEQP-GLES31.functional.geometry_shading.query.primitives_generated_* v2: Update CI expectations and references to fixed tests in commit log. v3: - Add comment that GL_PRIMITIVES_GENERATED query is included because OES_geometry_shader, but it is not part of OpenGL ES 3.1. (Iago) - Update Fixes to commit introducing geometry shaders. (Iago) Fixes: `a1b7c084` ("v3d: fix primitive queries for geometry shaders") Signed-off-by: Neil Roberts <nroberts@igalia.com> Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13712>	2021-11-11 08:02:04 +00:00
Iago Toral Quiroga	3a95e25e84	v3dv,v3d: don't store swizzle pointer in shader/pipeline keys We had been storing pointers to a driver owned swizzle table rather than storing the actual swizzle value in various shader and pipeline keys on both GL and Vulkan drivers. This doesn't look very robust, particularly since we also compute sha1 hashes from these values and we may store these hashes to disk (for the disk cache). Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13738>	2021-11-10 11:24:26 +00:00
Iago Toral Quiroga	aa5a0e1dad	broadcom/compiler: copy packing when converting add to mul Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13675>	2021-11-04 13:57:39 +00:00
Iago Toral Quiroga	a794bdf953	broadcom/compiler: check that sig packing is valid when pipelining ldvary Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13641>	2021-11-03 10:49:06 +00:00
Emma Anholt	4e28962800	ci: Uprev VK-GL-CTS to 1.2.7.2, and pull in piglit while I'm here. The VK-GL-CTS fixes some issues for freedreno, and almost all of LVP's xfails. Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13622>	2021-11-02 20:29:31 +00:00
Iago Toral Quiroga	6b9bd3f038	broadcom/compiler: make opt passes set current block Typically, optimization passes go through all the blocks in a shader and make adjustments on the fly, so we always want them to update the current block or the current block pointer will become outdated. Also, we don't need to keep track of the previous current block pointer to restore it, since optimization passes run after we have completed conversion to VIR, and therefore, anything that comes after that should always set the current block before emitting code. Fixes debug assert crashes when running shader-db: vir.c:1888: try_opt_ldunif: Assertion `found || &c->cur_block->instructions == c->cursor.link' failed Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13625>	2021-11-02 11:17:01 +00:00
Ella Stanforth	3c86292321	v3dv: Implement VK_KHR_create_renderpass2 Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13575>	2021-11-02 07:43:31 +00:00
Jason Ekstrand	4108fda426	vulkan: Move all the common object code to runtime/ Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13156>	2021-10-29 23:12:32 +00:00
Jason Ekstrand	5ccba1576d	v3dv: Use vk_instance_get_proc_addr_unchecked for WSI It exists precisely to handle this case without the driver looking up trampolines itself. This is nearly identical to what ANV does. Reviewed-by: Dylan Baker <dylan@pnwbakers.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13156>	2021-10-29 23:12:32 +00:00
Iago Toral Quiroga	b42f4b8809	broadcom/compiler: padding fixes to QPU assembly dumps When there are dst/src modifiers it is pretty common that instructions take too much space and lead to alignment issues that make code a lot harder to read, so align the MUL and SIG columns a bit wider to avoid this: Before: 0x380021828003faa8 fmax rf2, rf42.abs, rf40.abs; nop 0x3800f186c503f0f0 fcmp.pushc -, rf3, rf48; nop 0x380c038b85b83282 fmax rf11, rf10, rf2; mov.ifa rf14, rf46 0x3800219ab503f359 and rf26, rf13, rf25; nop 0x3820f186c503f2f0 fcmp.pushc -, rf11, rf48; nop ; thrsw 0x382c013fb5b8368e and rf63, rf26, rf14; mov.ifa rf4, rf46; thrsw 0x38002185b503ffc4 and rf5, rf63, rf4 ; nop 0x38002186b503f141 and rf6, rf5, rf1 ; nop 0x382031873503f186 vfpack tlb, rf6, rf6; nop ; thrsw 0x380031873503f18f vfpack tlb, rf6, rf15; nop 0x38003186bb03f000 nop ; nop After: 0x380021828003faa8 fmax rf2, rf42.abs, rf40.abs ; nop 0x3800f186c503f0f0 fcmp.pushc -, rf3, rf48 ; nop 0x380c038b85b83282 fmax rf11, rf10, rf2 ; mov.ifa rf14, rf46 0x3800219ab503f359 and rf26, rf13, rf25 ; nop 0x3820f186c503f2f0 fcmp.pushc -, rf11, rf48 ; nop ; thrsw 0x382c013fb5b8368e and rf63, rf26, rf14 ; mov.ifa rf4, rf46 ; thrsw 0x38002185b503ffc4 and rf5, rf63, rf4 ; nop 0x38002186b503f141 and rf6, rf5, rf1 ; nop 0x382031873503f186 vfpack tlb, rf6, rf6 ; nop ; thrsw 0x380031873503f18f vfpack tlb, rf6, rf15 ; nop 0x38003186bb03f000 nop ; nop Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13545>	2021-10-28 08:12:14 +00:00
Emma Anholt	bfbc41a9fa	ci/piglit-runner: Merge piglit-driver-*.txt files into driver-*.txt. The test names are definitely unique (deqp has specific prefixes, piglit uses '@' as a separator instead of '.'), so we can just have a single file regardless of test type. Merges the two groups of xfails together so you can't mix up which file to edit (I certainly have), and so that we don't need to introduce yet another set of files when we add gtest for libva. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13517>	2021-10-27 20:54:11 +00:00
Emma Anholt	38dff02bfb	ci/deqp-runner: Rename the deqp-drivername-*.txt files to drivername-*.txt We have two testsuites with the same format for fails/flakes/skips files, and test names that are definitely unique. I'm about to add a third testsuite (gtest for libva-utils), so let's have just one file each for fails/flakes/skips instead of one per type of testsuite. This starts the move with just the bulk rename of deqp. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Daniel Stone <daniels@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13517>	2021-10-27 20:54:11 +00:00
Iago Toral Quiroga	0a277fabce	broadcom/compiler: fix condition encoding bug When both AC and MC are set, AC is encoded in bits 0..1 not 0..3. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13527>	2021-10-27 06:03:12 +00:00
Iago Toral Quiroga	3fbd6662b7	broadcom/compiler: rework simultaneous peripheral access checks This was not quite correct in that our checks for the allowed cases were not checking that there were no other peripheral access other than the ones allowed. For example, we allowed wrtmuc signal and TMU write other than TMUC, and we also allowed TMU read and VPM read/write. But we cannot allow wrtmuc with TMU write other than TMUC and at the same time a VPM write for example, so we can't just check if we have a combination of allowed peripherals, we still need to check that those are the only ones in use by the combined instructions. Another example is that even if we allow a TMU write (other than TMUC) with a wrtmuc signal, the resulting instruction must still have just one TMU write other than TMUC, but we were allowing the merge if one instruction signaled wrtmuc and the other wrote to tmu other than tmuc without testing if the combined result would have 2 tmu writes. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13527>	2021-10-27 06:03:12 +00:00
Iago Toral Quiroga	ceaf56920c	v3dv: refactor TFU jobs We had an implementation for image copies and another for buffer to image copies. Refactor the code so we have a single implementation of this. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13481>	2021-10-22 11:05:33 +00:00

2106 Commits