Commit Graph

144397 Commits

Author SHA1 Message Date
Samuel Pitoiset 915bc6a179 radv: use RADEON_FLAG_VA_UNCACHED for the trace BO
Figured this while debugging a GPU hang with a simple CTS test. This
is to make sure data written by the CP are coherent on the CPU.

This also explains spurious GPU hang reports generated for Hitman 3
that made no sense without it. Now it's clear that this game hangs
after a DRAW_INDEX_INDIRECT packet.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17183>
2022-06-27 08:36:49 +00:00
Samuel Pitoiset db7890637e radv: disable small primitive culling for user sample locations
The driver can't assume sample positions at (0.5, 0.5) when user
sample locations are used.

This doesn't fix anything in practice because NGGC is only enabled by
default on GFX10.3 and that extension is currently disabled on GFX10+,
but I would like to expose it at some point.

This fixes dEQP-VK.pipeline.*.sample_locations_ext.verify_location.*
(when the extension is enabled locally).

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17228>
2022-06-27 08:10:08 +00:00
Ella Stanforth f392b6c1ad v3dv: Implement VK_KHR_performance_query
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14061>
2022-06-27 07:34:16 +00:00
Qiang Yu 04b15f88e7 radeonsi: replace llvm gs input handle with nir lowering
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
2022-06-27 11:32:50 +08:00
Qiang Yu 36197b8dc0 ac/llvm: get back nir_intrinsic_load_gs_vertex_offset_amd
Will be used by radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
2022-06-27 11:32:46 +08:00
Qiang Yu e9f1f115fa ac/nir: add triangle_strip_adjacency_fix to gs input lower
From radeonsi.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
2022-06-27 11:32:43 +08:00
Qiang Yu f8ddee90ca radeonsi: replace llvm es output with nir lowering
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
2022-06-27 11:32:38 +08:00
Qiang Yu 109eb378e5 ac/nir: change es output lower param to esgs_itemsize
radeonsi may add extra dword to the stride, so let's pass it
directly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
2022-06-27 11:32:34 +08:00
Qiang Yu 8b5e8b2af7 ac/nir: remove unused param num_reserved_es_outputs from gs input lower
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
2022-06-27 11:32:30 +08:00
Qiang Yu c66eba2072 radeonsi: set lds for gs/es to handle nir shared memory load/store
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
2022-06-27 11:32:26 +08:00
Qiang Yu 7ddd15f6c7 ac/nir: skip gl_ViewportIndex and gl_Layer write in ES
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
2022-06-27 11:32:21 +08:00
Qiang Yu 06d493dde2 radeonsi: implement two esgs ring nir intrinsic
nir_intrinsic_load_ring_esgs_amd
nir_intrinsic_load_ring_es2gs_offset_amd

Will be used by esgs lowering.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
2022-06-27 11:32:15 +08:00
Qiang Yu 9fc01f6e79 ac/llvm: fix code format alignment in visit_load_local_invocation_index
Used tab instead of space.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16788>
2022-06-27 11:32:00 +08:00
Qiang Yu 7847114343 radeonsi: replace llvm tes input load with nir lowering
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 6b6aeeecbb radeonsi: set uses_vmem_load_other for nir_intrinsic_load_buffer_amd
Before lower TES load input to load buffer, mark this flag for this
intrinsic, otherwise we get corruption with GFX10 after the lowering.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 2b7e167bbd radeonsi: enable PIPE_CAP_GLSL_TESS_LEVELS_AS_INPUTS
This can remove special handling of tessfactors which also benifit
the nir lower pass which does not handle these as system value.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 7598bfd768 radeonsi: replace llvm tcs output with nir lower pass
Remove the store_tcs_outputs abi, we can use common output abi
to handle the tessfactor pass as vgpr.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu d00845faf4 ac/nir: add no_input_lds_space param to hs output lower
This is used by radeonsi to save some lds space when all LS output
is passed by register.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 8f8d06bd05 ac/llvm: handle write mask for nir_intrinsic_store_buffer_amd
tess lowering may generate buffer store with partial write mask.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu baaeca7d1a radeonsi: implement nir_intrinsic_load_tess_rel_patch_id_amd for both tcs and tes
radv will lower this intrinsic before gets to llvm, so we just need to
implement it in radeonsi.

The tes version will be used in tess lower too.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 3853dfcd44 radeonsi: implement nir_intrinsic_load_ring_tess_offchip(_offset)_amd
Used by tess lower latter.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 4ec864c057 radeonsi: preload tess offchip ring for tcs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Sigend-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu ae9b02b4d0 ac/nir: add wave_size parameter to ac_nir_lower_hs_outputs_to_mem
Used by radeonsi and radv to reflect true wave size used, not minimal size.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 18d51831a8 ac/nir: add pass_tessfactors_by_reg param to hs output lower
radeonsi won't emit tess factor in the lower pass, need to keep
the output for llvm backend to pass it as parameter. This is used
by radeonsi for an optimization to save LDS write.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 6ccb9634de ac/nir: use nir_intrinsic_load_hs_out_patch_data_offset_amd in tess lower
radeonsi load this from SGPR arg, can't use static value because TCS output
and TES input may not match (TCS output is not a key for TES) and
determined in runtime.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu fdf589321c ac/nir: add nir_intrinsic_load_hs_out_patch_data_offset_amd
Also add radv and radeonsi implementation. Will be used in tess lowering.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 2ba6d2b107 ac/nir: remove unused parameter in tes input lower
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 07e025a390 radeonsi: implement nir_intrinsic_load_tcs_num_patches_amd
Used by ac_nir_lower_tess_io_to_mem.c.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu a1763ad4b3 radeonsi: replace llvm based fixed tcs with nir
Create nir passthrough shader with explicit input/output and vertex
output count so that it can be handled by compiler same as user tcs.

The drawback is we create more si_shader_selector with different
input/output and vertex output count which was handled by compiler
backend before.

As fixed function tcs can be handled like user tcs, we don't need
the dedicated fixed_func_tcs_shader state either.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 3ab9c42b43 radeonsi: add si_create_passthrough_tcs
For replacing si_create_fixed_func_tcs.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 74350cf057 radeonsi: support multi stage shader state creation in nir shaderlib
For creating tcs passthrough shader.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu a599576654 radeonsi: use si_shader as parameter in si_get_nir_shader
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 05b829cd0c radeonsi: deserialize nir binary in si_check_blend_dst_sampler_noop
We can do this parse with original nir instead of shader key pass
applied nir in si_get_nir_shader.

This can free si_get_nir_shader to just use si_shader as parameter.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16705>
2022-06-27 02:38:21 +00:00
Qiang Yu 3aa70d92ce radv: no need to do gs_alloc_req for newer chips in ngg vs/tes
Copy from radeonsi.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17130>
2022-06-27 02:12:13 +00:00
Qiang Yu 74e596a5f0 ac/llvm: conditionally check wave id in gs sendmsg
nir lowering already call this with wave id check, no need to
check inside ac_build_sendmsg_gs_alloc_req again.

Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17130>
2022-06-27 02:12:13 +00:00
Bas Nieuwenhuizen e75f11625d radv: Deal with derefs from opaque types in function parameters.
Needs more copy propagation before nir_opt_derefs picks it up.

Note that the full general problem of opaque types stored in
intermediate variables is still open, but that seems like a whole
can of worms, and no sense to have gfxbench stay broken during the
time it takes to solve that.

Cc: mesa-stable
Gitlab: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5945
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17012>
2022-06-27 01:23:43 +00:00
Kenneth Graunke b28efd80eb iris: Update comment about 2GB dynamic state range
We tracked this down with the HW teams back in 2020 and there's now a
documented workaround.  Comments from the HW team say this applies all
the way through XeHP but we're not sure beyond that.

This is a bug that we hit but the Windows drivers didn't because Jason
decided to allocate our memory structures from the top end of the VMA
range explicitly to catch bugs like this, while Windows allocates from
zero and up, so they would need to allocate more than 2GB of dynamic
state before running into it.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4880
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17216>
2022-06-24 23:30:12 +00:00
Ryan Neph 627ba5c91b venus: support VK_KHR_copy_commands2
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17226>
2022-06-24 23:20:05 +00:00
Ryan Neph 8b81098519 venus: enable VK_EXT_image_view_min_lod
Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17227>
2022-06-24 23:09:48 +00:00
Ryan Neph f862cc070f venus: update venus-protocol with VK_EXT_image_view_min_lod
Copy in auto-generated protocol bindings.

Signed-off-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17227>
2022-06-24 23:09:48 +00:00
Jason Ekstrand 21374eb777 vulkan/render_pass: Support VkAttachmentSampleCountInfoAMD
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16953>
2022-06-24 22:37:53 +00:00
Jason Ekstrand 541819b2d6 vulkan/render_pass: Allow for mixed sample counts
RADV supports VK_AMD_mixed_attachment_samples which does exactly what it
sounds like.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16953>
2022-06-24 22:37:53 +00:00
Jason Ekstrand 7e11cdc77a vulkan/render_pass: Pass sample locations to barriers
This is required for depth/stencil images created with
VK_IMAGE_CREATE_SAMPLE_LOCATIONS_COMPATIBLE_DEPTH_BIT_EXT.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16953>
2022-06-24 22:37:53 +00:00
Jason Ekstrand 6216c59dbb vulkan/render_pass: Use a special layout for self-dependencies
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16953>
2022-06-24 22:37:53 +00:00
Jason Ekstrand f10012a2b2 anv: Use CmdBeginRendering for resumes in BeginCommandBuffer when possible
This lets us avoid the code duplication between BeginRendering and
BeginCommandBuffer and also lets us stop crawling core render pass
structs directly and instead focus on dynamic rendering concepts.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16953>
2022-06-24 22:37:53 +00:00
Jason Ekstrand 3a204d5cf3 vulkan/render_pass: Add a better helper for render pass inheritance
Instead of making drivers dive into the render pass and framebuffer
themselves, provide a helper that constructs a VkRenderingInfo for a
render pass resume that they can use instead.  This should reduce code
duplication between driver implementations of BeginRendering and
BeginCommandBuffer.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16953>
2022-06-24 22:37:53 +00:00
Mike Blumenkrantz f904b95ef0 zink: add a turnip driver workaround for EXT_depth_clip_enable
this is broken

ref #6732

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17237>
2022-06-24 19:54:44 +00:00
Mike Blumenkrantz 8f57818ce5 zink: fix-ish depth clipping without VK_EXT_depth_clip_enable
if this extension is unsupported, use the previous behavior and hope for the best

Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17237>
2022-06-24 19:54:44 +00:00
Jason Ekstrand 7c127ca018 nir/opt_memcpy: Add another case for function_temp
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> (1.5 years later)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13166>
2022-06-24 19:21:26 +00:00
Jason Ekstrand dc85065944 nir: Add an options parameter to deref_instr_has_complex_use
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> (1.5 years later)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13166>
2022-06-24 19:21:26 +00:00
Jason Ekstrand d6123460fd nir/opt_memcpy: lower copies to/from tightly packed types
v2: Add comment by Jason (Lionel)

Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> (1.5 years later)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13166>
2022-06-24 19:21:26 +00:00
Mike Blumenkrantz 82127961d2 zink: remove another zink/tu fail
fixed in f1c1b9687e

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17241>
2022-06-24 19:03:46 +00:00
Boris Brezillon e9c37e5ba8 microsoft/compiler: Fix emit_ubo_var()
get_dword_size() is misleading, its name implies it's returning
a size in dwords, but it's actually returning a size in bytes.
This led to a wrong size passed to emit_cbv(). Instead of fixing
get_dword_size(), let's inline the code in emit_ubo_var().

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17230>
2022-06-24 17:56:56 +00:00
Boris Brezillon 8e710f2cf3 dzn: Transition resource to RENDER_TARGET/DEPTH_WRITE before clears
When clear_attachment() is called, we must ensure the resource is
in the DEPTH_WRITE or RENDER_TARGET state.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17229>
2022-06-24 17:42:11 +00:00
Boris Brezillon 02002c8f12 dzn: Clamp depthBiasConstantFactor when doing the float -> int conversion
If we don't do that, we might end up with an integer overflow/underflow.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17236>
2022-06-24 16:45:12 +00:00
Boris Brezillon 9527fbe596 dzn: Fix CmdPushConstants()
The original offset value is overwritten in our first for(i: num_states)
iteration, messing up the compute push constant update if stageFlags
applies to both compute and graphics.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17233>
2022-06-24 16:13:39 +00:00
Danylo Piliaiev 5aeefe8d75 tu: Don't count 3d blits in QUERY_TYPE_PRIMITIVES_GENERATED
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17164>
2022-06-24 14:10:56 +00:00
Danylo Piliaiev 97ef19e6ce tu: Use hw binning or sysmem with QUERY_TYPE_PRIMITIVES_GENERATED
Without hw binning in gmem primitives generated query result could be
multiplied by tile count, which is not expected by OpenGL users for
GL_PRIMITIVES_GENERATED.

See https://gitlab.khronos.org/vulkan/vulkan/-/issues/3131

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17164>
2022-06-24 14:10:56 +00:00
Rhys Perry 33641b2a26 aco: cleanup force-waitcnt output
If we don't reset ctx.vm_cnt/gpr_map/etc, this will spam a lot of
s_waitcnt instructions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17207>
2022-06-24 12:44:55 +00:00
Sviatoslav Peleshko 318473eaf1 intel/blorp/gen6: Set BLEND_STATEChange only if emitting the blend state
This change is pretty straightforward: if set this field when we don't emit
the blend state, then the garbage at offset=0 will be set as a blend state,
and this will cause artifacts until the proper blend state will be given.

Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6544
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6232

Signed-off-by: Sviatoslav Peleshko <sviatoslav.peleshko@globallogic.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17132>
2022-06-24 10:06:34 +00:00
Karmjit Mahil bb93ecacd7 pvr: Rename loop iterator variable.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Rajnesh Kanwal <rajnesh.kanwal@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17206>
2022-06-24 09:15:53 +00:00
Karmjit Mahil 6e6e1e8406 pvr: Fix off by 1 error in buffer_id for ubo pds program.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Rajnesh Kanwal <rajnesh.kanwal@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17206>
2022-06-24 09:15:53 +00:00
Karmjit Mahil 4240c83960 pvr: Handle vdm degen_cull_enable.
Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Rajnesh Kanwal <rajnesh.kanwal@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17206>
2022-06-24 09:15:53 +00:00
Karmjit Mahil 7858c32550 pvr: Fix physical device limits.
This commit changes to the physical device limits which were
missed during the 1.17 transition.

Signed-off-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Reviewed-by: Rajnesh Kanwal <rajnesh.kanwal@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17206>
2022-06-24 09:15:53 +00:00
Lionel Landwerlin eac5a2fdfa anv: make apply_pipeline_layout/compute_push_layout visible to NIR debug
Useful for debug.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17209>
2022-06-24 07:12:18 +00:00
Georg Lehmann ed429af586 radv: Don't check if we need to copy immutable samplers for non push templates.
This should allow the compiler to optimize this out because it knows that
cmd_buffer is NULL.

Signed-off-by: Georg Lehmann <dadschoorse@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17166>
2022-06-24 06:32:43 +00:00
Andres Gomez 01a1af1819 radv/ci: update vkd3d-proton results for AMD's Kabini
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Roukala <martin.roukala@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17184>
2022-06-24 05:32:53 +00:00
Mike Blumenkrantz 3f86344bd6 zink: use tracked barrier info for generated barriers
this should be simpler to read through and maintain while providing
the same results as well as some possible perf and compile time improvements

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17192>
2022-06-24 01:56:28 +00:00
Mike Blumenkrantz 50e764fa50 zink: track gfx/compute descriptor barrier info
update_barriers has steadily grown more and more complex when the original
idea was for it to be a small function to handle deferred jit barriers and
simplify sync in patterns like bind_ubo -> draw -> buffer_subdata -> draw

instead, track the pending barrier info at bind time so that the stages and
access are already updated by the time draw/compute are reached

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17192>
2022-06-24 01:56:28 +00:00
Mike Blumenkrantz 74dd6e69b4 zink: fix image bind counting
these must only be incremented if the image descriptor has changed

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17192>
2022-06-24 01:56:28 +00:00
Mike Blumenkrantz 7d56912208 zink: track overall samplerview bind counts
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17192>
2022-06-24 01:56:27 +00:00
Mike Blumenkrantz 49cc3696bd zink: track ssbo bind counts
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17192>
2022-06-24 01:56:27 +00:00
Mike Blumenkrantz e38b2adb88 zink: use the bigger of the variable type and interface type for bo sizing
this avoids the scenario where the full bo size isn't accounted for because
no variable for the block has been created

cc: mesa-stable

affects:
KHR-GL33.shaders.uniform_block.random.all_per_block_buffers.3

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17217>
2022-06-24 01:21:45 +00:00
Timothy Arceri e060d98aac util: use force_gl_map_buffer_synchronized workaround with RAGE
CC: 22.1 22.0 <mesa-stable>

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/1326
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17199>
2022-06-24 00:29:24 +00:00
Timothy Arceri 5f686bfc85 util: add dri config option to disable GL_MAP_UNSYNCHRONIZED_BIT
GL_MAP_UNSYNCHRONIZED_BIT depends on the app having its threading
handled correctly. This allows us to force disable the bit when
they get it wrong.

CC: 22.1 22.0 <mesa-stable>

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17199>
2022-06-24 00:29:24 +00:00
Mike Blumenkrantz b74d3e71be lavapipe: skip post-copy pNext checking during pipeline creation for composites
these values should have all been set during pipeline compositing above,
so reapplying the values is at best, redundant, and at worst, broken

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17219>
2022-06-24 00:19:03 +00:00
Mike Blumenkrantz f8a92d28ad lavapipe: add a pipeline library assert
ensure that pipeline libraries are created with the library bit

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17219>
2022-06-24 00:19:03 +00:00
Mike Blumenkrantz 2a69aeb9c1 lavapipe: fix renderpass info handling during pipeline creation
only the viewMask parameter of VkPipelineRenderingCreateInfoKHR can
be accessed in the fragment stage, so for pipeline libraries it should
be assumed that zs attachments exist for the purpose of copying dynamic
state values, and then these dynamic states will naturally be pruned
during final pipeline construction if the attachments turn out to not
be present

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17219>
2022-06-24 00:19:03 +00:00
Mike Blumenkrantz 7018b630ed lavapipe: copy more pNexts for pipeline creation
also add some unreachable() handlers for unknown types

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17219>
2022-06-24 00:19:03 +00:00
Mike Blumenkrantz 3aaab4a232 lavapipe: zero out blend info if blend isn't enabled
this makes reading traces easier

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17219>
2022-06-24 00:19:03 +00:00
Jason Ekstrand 9be88a8464 panfrost: Use u_default_clear_buffer
[Alyssa: This is required for OpenCL.]

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16044>
2022-06-23 23:18:06 +00:00
Jason Ekstrand f32ac20862 iris: Use u_default_clear_buffer
iris uses u_default_buffer_subdata for buffer uploads via a CPU map so
clearing shouldn't be substantially worse.  We can do it with BLORP in
the future if we decide it's useful.

[Alyssa: A BLORP implementation is available at
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15931 however nobody
has taken to reviewing that solution.]

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16044>
2022-06-23 23:18:06 +00:00
Jason Ekstrand cd21d32fe4 gallium: Add a u_default_clear_buffer helper
[Alyssa: Add a default CPU implementation of pipe->clear_buffer(). This hook is
mandatory for OpenCL support. Even though this implementation isn't optimal by
any means, having a conformant default available in core will lower the barrier
of entry to OpenCL support.]

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16044>
2022-06-23 23:18:06 +00:00
Lionel Landwerlin 9b11618dfa anv: disable perf queries on non RCS engines
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17015>
2022-06-23 22:47:37 +00:00
Alyssa Rosenzweig f00ebb913a u_blitter: Remove util_blitter_copy_buffer
It is now unused. We cannot yet remove the streamout functionality in u_blitter
as r600g still uses it for clear_buffer on GPUs older than Evergreen.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17142>
2022-06-23 22:23:31 +00:00
Alyssa Rosenzweig 21f5c6ea87 r600g: Remove streamout-based buffer copy path
r600g is the only user of util_blitter_copy_buffer in tree, which implements
buffer copies with streamout. This path for r600g was added in 8ac9801669
("r600g: accelerate buffer copying"), a commit from 2012. At that point there
was no DMA path for buffer copies. Since then, a DMA path has been added,
conditional only on the kernel version -- not the hardware. It appears the
required kernel support has been mainline for at least 4 years now. Mesa 22.2
doesn't need to provide optimal performance on an old kernel -- for performance,
a DMA-capable kernel should be used, and for compatability, the CPU fallback
(used for unaligned buffers as it is) is still available. Remove the streamout
path "in the middle" that appears ~unused today.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17142>
2022-06-23 22:23:31 +00:00
Enrico Galli f367c55573 microsoft/spirv_to_dxil: Fix discard semantics
Unlike in nir, discard is not a super return in DXIL. Therefore, we
will lower discard and terminate to demote + return.

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17179>
2022-06-23 22:04:32 +00:00
Ian Romanick 6689fa2ab4 nir/range_analysis: Teach range analysis about fdot opcodes
This really, really helps on platforms where fabs() isn't free.  A great
many shaders use a * frsq(fabs(fdot(a, a))) to normalize a vector.
Since the result of the fdot must be non-negative, the fabs can be
eliminated by an existing algebraic rule.

shader-db results:

r300 (run on R420 - X800XL)
total instructions in shared programs: 1369807 -> 1368550 (-0.09%)
instructions in affected programs: 59986 -> 58729 (-2.10%)
helped: 609
HURT: 0

total vinst in shared programs: 512899 -> 512861 (<.01%)
vinst in affected programs: 1522 -> 1484 (-2.50%)
helped: 36
HURT: 0

total sinst in shared programs: 260690 -> 260570 (-0.05%)
sinst in affected programs: 1419 -> 1299 (-8.46%)
helped: 120
HURT: 0

total consts in shared programs: 957295 -> 957230 (<.01%)
consts in affected programs: 849 -> 784 (-7.66%)
helped: 65
HURT: 0

LOST:   0
GAINED: 3

The 3 gained shaders are all vertex shaders from XCom: Enemy Unknown.
I'm guessing that game is never going to run on my X800XL. :)

i915
total instructions in shared programs: 791121 -> 780843 (-1.30%)
instructions in affected programs: 220170 -> 209892 (-4.67%)
helped: 2085
HURT: 0

total temps in shared programs: 47765 -> 47766 (<.01%)
temps in affected programs: 9 -> 10 (11.11%)
helped: 0
HURT: 1

total const in shared programs: 93048 -> 92983 (-0.07%)
const in affected programs: 784 -> 719 (-8.29%)
helped: 65
HURT: 0

LOST:   0
GAINED: 36

Haswell, Ivy Bridge, and Sandy Bridge had similar results. (Haswell shown)
total instructions in shared programs: 16702250 -> 16697908 (-0.03%)
instructions in affected programs: 119277 -> 114935 (-3.64%)
helped: 1065
HURT: 0
helped stats (abs) min: 1 max: 20 x̄: 4.08 x̃: 4
helped stats (rel) min: 0.48% max: 10.17% x̄: 3.66% x̃: 3.94%
95% mean confidence interval for instructions value: -4.26 -3.89
95% mean confidence interval for instructions %-change: -3.76% -3.56%
Instructions are helped.

total cycles in shared programs: 880772068 -> 880734134 (<.01%)
cycles in affected programs: 2134456 -> 2096522 (-1.78%)
helped: 941
HURT: 324
helped stats (abs) min: 2 max: 2180 x̄: 123.06 x̃: 44
helped stats (rel) min: 0.04% max: 49.96% x̄: 7.08% x̃: 3.81%
HURT stats (abs)   min: 2 max: 2098 x̄: 240.33 x̃: 35
HURT stats (rel)   min: 0.04% max: 77.07% x̄: 12.34% x̃: 3.00%
95% mean confidence interval for cycles value: -47.93 -12.04
95% mean confidence interval for cycles %-change: -2.87% -1.34%
Cycles are helped.

No shader-db changes on any other Intel platform.

Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17181>
2022-06-23 18:46:27 +00:00
Sebastian Keller f50fe9b0b6 egl/wayland: Don't try to access modifiers u_vector as dynarray
The modifiers are u_vectors, but the code was trying to access them
as dynarrays. This resulted in a wrong number of modifiers, which then
later on would also lead to invalid reads used as modifiers.

In the case of the iris driver, a wrongly read number of modifiers > 0
would also trigger an error message.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6643
Fixes: b5848b2dac ("egl/wayland: use surface dma-buf feedback to allocate surface buffers")
Reviewed-by: Leandro Ribeiro <leandro.ribeiro@collabora.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17180>
2022-06-23 16:12:15 +00:00
Jason Ekstrand 8ce7faab47 vulkan: Add a vk_pipeline_shader_stage_to_nir helper
This is similar to vk_shader_module_to_nir only it takes a
VkPipelineShaderStageCreateInfo and handles
VK_KHR_graphics_pipeline_library semantics for when a
VkShaderModuleCreateInfo is provided instead of an actual module.

Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17196>
2022-06-23 15:41:00 +00:00
Jason Ekstrand 288c1c29fb vulkan/nir: Make spirv_data const in vk_spirv_to_nir
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17196>
2022-06-23 15:41:00 +00:00
Matt Coster 29a7d924c7 pvr: csbgen: Make all generated enums unambiguous
This change involves two enums:
 * rogue_texstate.xml: All COMPRESSED_* members of FORMAT are moved
   to FORMAT_COMPRESSED (without the prefix). A second field is added
   to IMAGE_WORD0 (texformat_compressed) which overlaps with the
   original (texformat), and
 * rogue_pbestate.xml: REG_WORD0_LINESTRIDE was not a real enum; it's
   removed entirely. It only has value when feature
   pbe_stride_align_1pixel is present, so a FIXME comment was added to
   this effect.

Signed-off-by: Matt Coster <matt.coster@imgtec.com>
Reviewed-by: Karmjit Mahil <Karmjit.Mahil@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17204>
2022-06-23 15:21:17 +00:00
Pavel Ondračka 6cc0a3ed44 r300: only run merge_movs pass on R500
This pass currently generates some swizzles that the R300 and R400
hardware can't handle, make it R500 for now.

Fixes: 6c2959c0

Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17150>
2022-06-23 12:59:11 +00:00
Vinson Lee b1df00cb79 tu: Check dereferenced value of rop_reads_dst.
Fix defect reported by Coverity Scan.

Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking rop_reads_dst suggests that it may be
null, but it has already been dereferenced on all paths leading to the
check.

Fixes: 94be0dd0b8 ("tu: Implement extendedDynamicState2LogicOp")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17099>
2022-06-23 11:11:56 +00:00
Connor Abbott 7d706af76b ir3: Fix vectorizer condition for SSBOs
SSBO access works very differently from UBO access. Straddling
loads/stores isn't an issue, loads/stores instead must be aligned to the
element size and can have up to 4 components.

We support 16-bit access with SSBOs on a650+, and sometimes the
vectorizer tries to create a misaligned 32-bit access when combining
32-bit and 16-bit accesses. The UBO-focused logic didn't reject this,
which is now fixed. This fixes a number of VK-CTS regressions on a650+.

Fixes: bf49d4a084 ("freedreno/ir3: Enable load/store vectorization for SSBO access, too.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17040>
2022-06-23 10:46:31 +00:00
Rhys Perry 6fc2622abd aco: don't skip VS->TCS barrier if TCS output vertices doesn't match input
TCS invocations correspond to output patch vertices, not input. If they
differ, TCS invocations can be in a different subgroup than VS invocations
of the input patch.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6564
Fixes: 152092b8ea ("aco: skip s_barrier if TCS patches are within subgroup")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17174>
2022-06-23 10:08:02 +00:00
Yonggang Luo d4ce845a8d meson: Enable wgl tests on mingw
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
2022-06-23 09:27:06 +00:00
Yonggang Luo cb53094ac1 d3d12: Turn d3d12_format.h to include d3d12_common.h
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
2022-06-23 09:27:06 +00:00
Yonggang Luo d52f280bd7 dzn: Fixes incompatible pointer type error
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
2022-06-23 09:27:06 +00:00
Yonggang Luo a387c9284a microsoft/clc: Disable clc_compiler_test on non-windows platform
The test can compile, but can not pass, so compile it but not running it

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
2022-06-23 09:27:06 +00:00
Yonggang Luo e1e94f8c81 microsoft/clc: Fixes narrowing error in clc_compiler_test.cpp with mingw/gcc
errors:
../../src/microsoft/clc/clc_compiler_test.cpp:563:19: error: narrowing conversion of '268435457' from 'int' to 'uint16_t' {aka 'short unsigned int'} [-Wnarrowing]
  563 |       0x00000000, 0x10000001, 0x20000020, 0x30000300,
      |                   ^~~~~~~~~~
../../src/microsoft/clc/clc_compiler_test.cpp:563:31: error: narrowing conversion of '536870944' from 'int' to 'uint16_t' {aka 'short unsigned int'} [-Wnarrowing]
  563 |       0x00000000, 0x10000001, 0x20000020, 0x30000300,
      |                               ^~~~~~~~~~
../../src/microsoft/clc/clc_compiler_test.cpp:563:43: error: narrowing conversion of '805307136' from 'int' to 'uint16_t' {aka 'short unsigned int'} [-Wnarrowing]
  563 |       0x00000000, 0x10000001, 0x20000020, 0x30000300,

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
2022-06-23 09:27:06 +00:00
Yonggang Luo 7cb78a27d8 d3d12: Fixes compiling error in d3d12/wgl/d3d12_wgl_framebuffer.cpp with gcc
error message:
```
../../src/gallium/winsys/d3d12/wgl/d3d12_wgl_framebuffer.cpp:231:42: error: no matching function for call to 'operator new(sizetype, d3d12_wgl_framebuffer*&)'
  231 |    new (fb) struct d3d12_wgl_framebuffer();
      |                                          ^
<built-in>: note: candidate: 'void* operator new(long long unsigned int)'

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
2022-06-23 09:27:06 +00:00
Yonggang Luo 05097d1f6c d3d12: Convert #include <Windows.h> to #include <windows.h> for mingw on linux
Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
2022-06-23 09:27:06 +00:00
Yonggang Luo 6b181fe1b2 d3d12: Use static_cast instead of dynamic_cast in d3d12_video_enc_h264.cpp
Because we may compile mesa with both rtti=enabled and rtti=disabled because of LLVM
Fixes errors:
../../src/gallium/drivers/d3d12/d3d12_video_enc_h264.cpp:777:7: error: 'dynamic_cast' not permitted with '-fno-rtti'
  777 |       dynamic_cast<d3d12_video_bitstream_builder_h264 *>(pD3D12Enc->m_upBitstreamBuilder.get());
      |       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Signed-off-by: Yonggang Luo <luoyonggang@gmail.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16084>
2022-06-23 09:27:06 +00:00
Jason Ekstrand deb36dc6c2 turnip: Use the new border color helpers
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15359>
2022-06-23 00:01:41 +00:00
Jason Ekstrand 498a8e77dd lavapipe: Use the new border color helper
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15359>
2022-06-23 00:01:41 +00:00
Jason Ekstrand b8882718b7 panvk: Use the new border color helpers
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15359>
2022-06-23 00:01:41 +00:00
Jason Ekstrand 981cf8a41d vulkan: Add some border color helpers
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15359>
2022-06-23 00:01:41 +00:00
Zhang, Jianxun bc42bbff4c iris: Wa_14016820455 for GFX_VERx10 == 12.5
Reprogram SF CLIP viewport pointer by not skipping its
dirty flag bit.

Many thanks to Lin, Shuicheng <shuicheng.lin@intel.com>,
Jerez Plata, Francisco <francisco.jerez.plata@intel.com>,
Graunke, Kenneth W <kenneth.w.graunke@intel.com>,
and others for their great help.

Signed-off-by: Zhang, Jianxun <jianxun.zhang@intel.com>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17171>
2022-06-22 22:22:50 +00:00
Lionel Landwerlin 5d05ffa465 anv: limit RT writes to number of color outputs
Not doing so crates skews occlusion queries. Fixes Zink's piglit
occlusion_query tests.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: a4f502de32 ("anv: fix VK_DYNAMIC_STATE_COLOR_WRITE_ENABLE_EXT state")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6205
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15740>
2022-06-22 21:45:52 +00:00
Alyssa Rosenzweig 76981e5615 agx: Handle loop { if { loop { .. } } }
We need to push loop nesting to handle this correctly -- at the end of
the innermost loop, the correct nesting is 1 (from the if), not 0.

Fixes assertion failure in

  dEQP-GLES2.functional.shaders.struct.local.dynamic_loop_nested_struct_array_fragment,UnexpectedPass
  dEQP-GLES2.functional.shaders.struct.local.dynamic_loop_nested_struct_array_vertex,UnexpectedPass
  dEQP-GLES2.functional.shaders.struct.uniform.dynamic_loop_nested_struct_array_fragment,UnexpectedPass
  dEQP-GLES2.functional.shaders.struct.uniform.dynamic_loop_nested_struct_array_vertex,UnexpectedPass

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17128>
2022-06-22 21:23:50 +00:00
Emma Anholt 13bf36588d ci/bare-metal: Consolidate needs declarations in .baremetal-test-*.
We had it set up for arm64 asan already, do it for everyone else too.  In
cleaning up the duplication, this fixes a pasteo in rpi3 which had the
"artifacts: false" on the wrong job, causing it to do a slow download of
the mesa build from gitlab.

Doing this required also moving the ".use-debian/arm_test" in as well, so
that its "needs:" didn't overwrite ours if it appeared after us in the
consumer's "extends:"

Should save about 20 seconds on rpi3 jobs.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17146>
2022-06-22 20:59:54 +00:00
Mike Blumenkrantz e13f04fcf0 zink: flag dmabufs for foreign queue transition on flush_resource call
this is needed by ext_external_objects

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15810>
2022-06-22 20:42:02 +00:00
Mike Blumenkrantz 32c34e93aa zink: add flag to indicate if a resource is a dmabuf
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15810>
2022-06-22 20:42:02 +00:00
Emma Anholt 69cad6dcb7 ci/freedreno: Turn a530 back on by default and update expectations.
I think it should be fixed since I redid how we manage serial around
restarts.  Haven't seen a fail in the manual runs I've done.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17004>
2022-06-22 20:07:36 +00:00
Emma Anholt 4e3c51cbd8 freedreno/a5xx: Set the buffer bit appropriately in XS_CTRL_REG0.
This seems to be how the bit gets used, from grepping my blob traces.
Hopefully this helps stabilize some stuff.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17004>
2022-06-22 20:07:36 +00:00
Emma Anholt 6cf2b24eaf freedreno/ir3: Disable image/ssbo 16-bit conversion folding pre-a6xx.
I don't see it in blob dumps, and the reordered args tripped up validation.

Fixes: 49dc60efa1 ("freedreno/ir3: Fold 16-bit conversions into image load/store src/dsts.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17004>
2022-06-22 20:07:36 +00:00
Ian Romanick fd1f2d3b5a nir: Add and use algebraic property "is selection"
There are several places that should have supported the various sized
versions of bcsel and the various nir_op_[fi]csel_* opcodes.  Rather
than enumerate the whole list, add a property.

v2: Make the comment for NIR_OP_IS_SELECTION more descriptive.
Suggested by Jason.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>
2022-06-22 19:26:59 +00:00
Ian Romanick a2a2fbc510 nir/algebraic: Fix NaN-unsafe fcsel patterns
For example, the proof for this pattern

    (('bcsel', ('flt', 'a@32', 0), 'b@32', 'c@32'), ('fcsel_ge', a, c, b)),

would be

    bcsel(a < 0, b, c)
    bcsel(!(a < 0), c, b)
    bcsel(a >= 0, c, b)
    fcsel_ge(a, c, b)

However, !(a < 0) => (a >= 0) is well known to produce different
results if `a` is NaN.

Instead of that replacement, use this replacement:

    bcsel(a < 0, b, c)
    bcsel(-0 < -a, b, c)
    bcsel(0 < -a, b, c)
    fcsel_gt(-a, b, c)

This is NaN-safe and exact.

Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Fixes: 0f5b3c37c5 ("nir: Add opcodes for fused comp + csel and optimizations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>
2022-06-22 19:26:59 +00:00
Ian Romanick ccd18ec4f3 nir: i32csel opcodes should compare with integer zero
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Noticed-by: Georg Lehmann <dadschoorse@gmail.com>
Fixes: 0f5b3c37c5 ("nir: Add opcodes for fused comp + csel and optimizations")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17048>
2022-06-22 19:26:59 +00:00
Connor Abbott d455838081 tu: Fix linemode for tessellation with isolines
Fixes: 542211676c ("turnip: enable VK_EXT_line_rasterization")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17190>
2022-06-22 17:57:53 +00:00
Alyssa Rosenzweig e812e8892a v3d: Drop workaround for u_blitter bug
This doesn't seem to be necessary.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17113>
2022-06-22 17:07:10 +00:00
Danylo Piliaiev f1c1b9687e tu: Do not expose storage image/buffer features for PACK16 formats
We don't support storing into them.

Fixes GL CTS tests running through ZINK:
 KHR-GL46.packed_pixels.pbo_rectangle.*

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17168>
2022-06-22 14:39:47 +00:00
Emma Anholt 4309e09d6f vc4: Propagate txf_ms's dest_type to the lowered txf.
This was missing, and the added validation caught it.

Fixes: 708c47e663 ("nir: Validate nir_tex_instr::dest_type bitsize")
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17172>
2022-06-22 07:10:18 -07:00
Emma Anholt 1de87497ba ci/vc4: Turn on deqp-egl testing by default.
Now that we have one less job, let's flip this on.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17172>
2022-06-22 07:10:14 -07:00
Emma Anholt e9fad0b9aa ci/vc4: Merge quick_shader in with deqp-gles
All 4 jobs had a total of about 26 minutes of runner time, so squish them
onto 3 runners and use gbm for the .shader_tests to avoid X overhead and
hopefully succeed with full concurrency.

Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17172>
2022-06-22 07:09:53 -07:00
Mike Blumenkrantz 872a1ae69e zink: ci updates
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
2022-06-22 13:27:29 +00:00
Mike Blumenkrantz 90586f812c mesa: explicitly disallow multiple pointsize exports from generating
for the fixedfunc vertex case this is important since the fixedfunc shader
may have already added an (attenuated) pointsize

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
2022-06-22 13:27:29 +00:00
Mike Blumenkrantz 096c5aa34a mesa: enforce pointsize exports if pointsize is being clamped
min/max pointsize clamping affects the value that must be used,
meaning that it may not be 1.0

in the case where clamping changes the value from 1.0, ensure the shader
export path is used if attenuation isn't enabled

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
2022-06-22 13:27:29 +00:00
Mike Blumenkrantz 3e2c132eb8 mesa: skip pointsize exports if pointsize attenuation is enabled
attenuation has its own method of exporting pointsize in fixedfunc shaders,
so ensure the attenuated size isn't overwritten

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
2022-06-22 13:27:29 +00:00
Mike Blumenkrantz de732cf61b mesa: rename PointSizeIsOne -> PointSizeIsSet
this will better convey the meaning of the value

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
2022-06-22 13:27:29 +00:00
Mike Blumenkrantz b2155a044d mesa: break out PointSizeIsOne setting to util function
Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
2022-06-22 13:27:29 +00:00
Mike Blumenkrantz 4830cc77cb nir/lower_point_size: apply point size clamping
point size min/max values are provided through the state vars, so ensure
these are always applied in order to respect ARB_point_parameters

cc: mesa-stable

Acked-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17145>
2022-06-22 13:27:29 +00:00
Italo Nicola 42a1264951 virgl: overpropagate precise flags
As it turns out, MOVs weren't the only instructions that blocked precise
flags propagation in the transition to nir-to-tgsi.
This commit fixes some rendering regressions caused by a4a34cd3.

Fixes: a4a34cd3

Signed-off-by: Italo Nicola <italonicola@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collanora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17144>
2022-06-22 12:58:58 +00:00
Jason Volk e1488d9374 radeon: Support shared memory user pointers.
The RADEON_GEM_USERPTR_ANONONLY flag is hardcoded here which excludes
shared memory pages. DRM is actually capable of supporting shared file-
backed memory, but only if it's read-only. This mutability intent has to
be conveyed through the stack, so a flags argument is added to the winsys
regime to pass RADEON_FLAG_READ_ONLY.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16115>
2022-06-22 12:23:02 +00:00
Marcin Ślusarz f871aa10a1 intel/compiler: assert that base is 0 for [load|store]_shared intrins
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17143>
2022-06-22 10:32:13 +00:00
Timur Kristóf e5970fe22a nir/lower_task_shader: don't use base index for shared memory intrinsics
Intel backend doesn't handle them very well.

Fixes: 8aff8d3dd4 ("nir: Add common task shader lowering to make the backend's job easier.")
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17143>
2022-06-22 10:32:13 +00:00
Marcin Ślusarz 49b8fffeed nir/lower_task_shader: insert barrier before/after shared memory read/write
Fixes: 8aff8d3dd4 ("nir: Add common task shader lowering to make the backend's job easier.")
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17143>
2022-06-22 10:32:13 +00:00
Connor Abbott c601ba332b ir3/sched: Fix could_sched() determination
This needs to be accurate so that when we split and then schedule a new
a0.x/a1.x/p0.x write we will eventually make progress. It wasn't taking
the kill_path into account which could create an infinite loop as we
keep scheduling writes whose uses are blocked because they are memory
instructions not on the kill_path.

Closes: #6413
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16635>
2022-06-22 10:09:13 +00:00
Danylo Piliaiev a8671b2182 meson/tu: Don't compile libdrm paths if KGSL is selected
Even if there is libdrm we shouldn't use it if KGSL is selected.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17173>
2022-06-22 11:52:36 +03:00
Danylo Piliaiev 6ad7be1b36 meson/pps: Check if libdrm exists to compile pps
For Turnip with KGSL we may have perffeto enabled but we don't
have libdrm.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17173>
2022-06-22 11:52:36 +03:00
Samuel Pitoiset ad3d6d9c6e radv/llvm: always emit a null export even if the FS doesn't discard
Even with a noop FS, the color blend state can still be non-zero, and
then SPI color related registers won't be 0 and this would hang.

Fixes: bdf3797aeb ("ac,radeonsi: don't export null from PS if it has no effect on gfx10+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17169>
2022-06-22 08:31:30 +02:00
Pavel Asyutchenko 17645cb29c llvmpipe: enable PIPE_CAP_FBFETCH_ZS
Support for it was added in previous commits.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko ccaa7920ef llvmpipe: implement FB fetch for depth/stencil
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko 0ba3e797ee llvmpipe: simplify early/late zs tests selection
This does not change selection logic.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko 443ef18f0c llvmpipe: enable per-sample shading when FB fetch is used
This matches specifications of both color and ZS fetch extensions.

Cc: mesa-stable
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko 8788b17596 nir_to_tgsi: Don't count ZS fbfetch vars as outputs
Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko 959b748038 glsl: add language support for GL_ARM_shader_framebuffer_fetch_depth_stencil
This extension adds built-in variables gl_LastFragDepthARM and gl_LastFragStencilARM
which can be implemented almost the same as gl_LastFragData from color fetch extension.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Pavel Asyutchenko 41f22a1823 gallium: add PIPE_CAP_FBFETCH_ZS and expose extension
st/mesa will expose GL_ARM_shader_framebuffer_fetch_depth_stencil
if this new capability is supported by the driver.

Signed-off-by: Pavel Asyutchenko <sventeam@yandex.ru>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13979>
2022-06-22 04:32:44 +00:00
Dave Airlie 68e8940114 glx/drisw: use xcb instead of X to query connection
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
2022-06-22 03:28:21 +00:00
Dave Airlie d3e723fb77 wsi/x11: add xcb_put_image support for larger transfers.
This was noticed as a problem in the EGL code, just fixup wsi.

Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
2022-06-22 03:28:21 +00:00
Dave Airlie c5dbb1139c egl/x11: add missing put_image cookie cleanups
These might not be required but be consistent with the wsi code.

Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
2022-06-22 03:28:21 +00:00
Dave Airlie e6082ac62e egl/x11: split large put image requests to avoid server destroy
wezterm in fullscreen 4k was exceeding the xcb max request size
on the put image with llvmpipe. This fixes it to send sub-images,
the Xlib put image used in glx does this internally, but not
the xcb one, so just do it in sections here.

Cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17155>
2022-06-22 03:28:21 +00:00
Mike Blumenkrantz e8fc5cca90 zink: fix dual_src_blend driconf workaround
not sure when this broke but it broke

cc: mesa-stable

Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17156>
2022-06-22 03:14:18 +00:00
Mike Blumenkrantz ea005c9e04 glx/drisw: invalidate drawables upon binding context if flush extension exists
this forces surface resize as expected

cc: mesa-stable

fixes #6706

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17147>
2022-06-22 02:18:37 +00:00
Mike Blumenkrantz 23b63e536e glx/drisw: store the flush extension to the screen
cc: mesa-stable

Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17147>
2022-06-22 02:18:37 +00:00
Jason Ekstrand 64d074879b vulkan/wsi: Use HAVE_LIBDRM to detect DRM instead of !_WIN32
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17170>
2022-06-22 01:15:20 +00:00
Jordan Justen a7127fbc4c intel/tools: Print memory info in intel_dev_info
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Jordan Justen eaf2a35a76 iris/bufmgr: Use memory info from devinfo
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Jordan Justen 1505f94397 anv: Use memory info from devinfo
Rework:
 * Jordan: Drop regions.valid (Lionel implemented a fallback)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Lionel Landwerlin 4289c9ec13 intel/dev: add a fallback when memory regions are not available
We have this in Anv and it could be reused in Iris for integrated
memory system.

Rework:
 * Jordan: Drop regions.valid (Lionel implemented a fallback)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Lionel Landwerlin 4e727297e8 intel/dev: add a helper to update memory info
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Jordan Justen 4aecfbf0f4 intel/dev: Add devinfo::mem to store i915 regions information
Reworks:
 * Lionel: Change check on memory region valid to vram size
 * Jordan: Drop regions.valid (Lionel implemented a fallback)
 * Jordan: Rename devinfo::regions to devinfo::mem.
 * Jordan: Add devinfo::mem::use_class_instance
 * Add mesa_logw for lmem requiring regions. (s-b Lionel)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17075>
2022-06-22 00:30:49 +00:00
Alyssa Rosenzweig 1222c86e34 panfrost: Bump ESSL_FEATURE_LEVEL on Valhall
This advertises ARB_gpu_shader5 on Valhall, which should be working now. On the
GLES3.1 side, this notably adds support for sample variables and dynamic offsets
for texture gathers, both of which should now be working.

No shader-db changes.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 74460a5d75 panfrost: Enable CAP_INDIRECT_TEMP_ADDR on Valhall
For parity with Bifrost. Apparently this pattern is sufficiently obscure that
the shader-db results on Mali-G57 are mostly noise.

total instructions in shared programs: 2675116 -> 2674820 (-0.01%)
instructions in affected programs: 4336 -> 4040 (-6.83%)
helped: 8
HURT: 1
helped stats (abs) min: 1.0 max: 52.0 x̄: 37.88 x̃: 49
helped stats (rel) min: 0.46% max: 8.20% x̄: 5.97% x̃: 7.56%
HURT stats (abs)   min: 7.0 max: 7.0 x̄: 7.00 x̃: 7
HURT stats (rel)   min: 5.98% max: 5.98% x̄: 5.98% x̃: 5.98%
95% mean confidence interval for instructions value: -52.90 -12.88
95% mean confidence interval for instructions %-change: -8.48% -0.81%
Instructions are helped.

total cvt in shared programs: 14127.08 -> 14126.53 (<.01%)
cvt in affected programs: 33.84 -> 33.30 (-1.62%)
helped: 10
HURT: 1
helped stats (abs) min: 0.015625 max: 0.125 x̄: 0.06 x̃: 0
helped stats (rel) min: 0.71% max: 2.93% x̄: 1.76% x̃: 1.78%
HURT stats (abs)   min: 0.09375 max: 0.09375 x̄: 0.09 x̃: 0
HURT stats (rel)   min: 7.89% max: 7.89% x̄: 7.89% x̃: 7.89%
95% mean confidence interval for cvt value: -0.09 -0.01
95% mean confidence interval for cvt %-change: -2.89% 1.13%
Inconclusive result (%-change mean confidence interval includes 0).

total sfu in shared programs: 7572 -> 7555.69 (-0.22%)
sfu in affected programs: 37.19 -> 20.88 (-43.87%)
helped: 6
HURT: 3
helped stats (abs) min: 2.75 max: 2.75 x̄: 2.75 x̃: 2
helped stats (rel) min: 47.31% max: 48.89% x̄: 48.63% x̃: 48.89%
HURT stats (abs)   min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0
HURT stats (rel)   min: 5.56% max: 6.25% x̄: 5.79% x̃: 5.56%
95% mean confidence interval for sfu value: -2.89 -0.73
95% mean confidence interval for sfu %-change: -51.41% -9.57%
Sfu are helped.

total quadwords in shared programs: 1450040 -> 1449896 (<.01%)
quadwords in affected programs: 1992 -> 1848 (-7.23%)
helped: 6
HURT: 0
helped stats (abs) min: 24.0 max: 24.0 x̄: 24.00 x̃: 24
helped stats (rel) min: 6.82% max: 7.50% x̄: 7.24% x̃: 7.32%
95% mean confidence interval for quadwords value: -24.00 -24.00
95% mean confidence interval for quadwords %-change: -7.48% -6.99%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 7d84bb00dc panfrost: Enable more FP16 caps on Valhall
This brings the FP16 capabilities of Valhall to parity with Bifrost.
Supporting FP16 constant buffers in particular reduces ALU in a ton of GLES
shaders, so that's a nice win. FP16 derivatives get vectorized which is a big
win where that applies, but they are considerably less common.

The lost shaders are from enabling PIPE_SHADER_CAP_FP16_CONST_BUFFERS (these
shaders compile on Midgard but not on Bifrost). The shaders in question declare
the same uniform in linked vertex and fragment shaders with different
precisions. This is contrary to the GLSL ES specification, which states
precisions must match for default uniforms of linked shaders. All the lost
shaders are in 8 Ball Pool and Hill Climb Racing. As those are proprietary
games, if that becomes a problem in the future, drirc is the solution.

total instructions in shared programs: 2697897 -> 2674595 (-0.86%)
instructions in affected programs: 1019922 -> 996620 (-2.28%)
helped: 4838
HURT: 2599
helped stats (abs) min: 1.0 max: 52.0 x̄: 7.13 x̃: 5
helped stats (rel) min: 0.16% max: 46.51% x̄: 8.04% x̃: 5.33%
HURT stats (abs)   min: 1.0 max: 36.0 x̄: 4.30 x̃: 3
HURT stats (rel)   min: 0.17% max: 133.33% x̄: 10.53% x̃: 3.85%
95% mean confidence interval for instructions value: -3.32 -2.95
95% mean confidence interval for instructions %-change: -1.89% -1.22%
Instructions are helped.

total cycles in shared programs: 141764.61 -> 140602.88 (-0.82%)
cycles in affected programs: 5728.22 -> 4566.48 (-20.28%)
helped: 665
HURT: 89
helped stats (abs) min: 0.015625 max: 15.0 x̄: 1.75 x̃: 0
helped stats (rel) min: 0.30% max: 61.54% x̄: 11.17% x̃: 4.62%
HURT stats (abs)   min: 0.015625 max: 0.265625 x̄: 0.04 x̃: 0
HURT stats (rel)   min: 0.30% max: 66.67% x̄: 6.77% x̃: 1.94%
95% mean confidence interval for cycles value: -1.77 -1.31
95% mean confidence interval for cycles %-change: -10.11% -7.99%
Cycles are helped.

total fma in shared programs: 22577.56 -> 22575.91 (<.01%)
fma in affected programs: 2422.78 -> 2421.12 (-0.07%)
helped: 533
HURT: 653
helped stats (abs) min: 0.015625 max: 0.0625 x̄: 0.03 x̃: 0
helped stats (rel) min: 0.30% max: 50.00% x̄: 8.25% x̃: 1.35%
HURT stats (abs)   min: 0.015625 max: 0.125 x̄: 0.03 x̃: 0
HURT stats (rel)   min: 0.19% max: 100.00% x̄: 4.53% x̃: 2.08%
95% mean confidence interval for fma value: -0.00 0.00
95% mean confidence interval for fma %-change: -1.98% -0.44%
Inconclusive result (value mean confidence interval includes 0).

total cvt in shared programs: 14460.95 -> 14122.50 (-2.34%)
cvt in affected programs: 6159.02 -> 5820.56 (-5.50%)
helped: 4827
HURT: 2577
helped stats (abs) min: 0.015625 max: 0.796875 x̄: 0.11 x̃: 0
helped stats (rel) min: 0.20% max: 81.82% x̄: 17.78% x̃: 12.90%
HURT stats (abs)   min: 0.015625 max: 0.546875 x̄: 0.07 x̃: 0
HURT stats (rel)   min: 0.00% max: 600.00% x̄: 43.66% x̃: 13.04%
95% mean confidence interval for cvt value: -0.05 -0.04
95% mean confidence interval for cvt %-change: 2.28% 4.93%
Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree).

total sfu in shared programs: 7593.56 -> 7571.06 (-0.30%)
sfu in affected programs: 357.19 -> 334.69 (-6.30%)
helped: 149
HURT: 1
helped stats (abs) min: 0.0625 max: 0.25 x̄: 0.15 x̃: 0
helped stats (rel) min: 5.26% max: 36.36% x̄: 6.79% x̃: 5.56%
HURT stats (abs)   min: 0.0625 max: 0.0625 x̄: 0.06 x̃: 0
HURT stats (rel)   min: 3.57% max: 3.57% x̄: 3.57% x̃: 3.57%
95% mean confidence interval for sfu value: -0.16 -0.14
95% mean confidence interval for sfu %-change: -7.51% -5.93%
Sfu are helped.

total v in shared programs: 8722.62 -> 8722.31 (<.01%)
v in affected programs: 1.62 -> 1.31 (-19.23%)
helped: 2
HURT: 0

total ls in shared programs: 129666 -> 128494 (-0.90%)
ls in affected programs: 4163 -> 2991 (-28.15%)
helped: 192
HURT: 0
helped stats (abs) min: 1.0 max: 15.0 x̄: 6.10 x̃: 5
helped stats (rel) min: 4.35% max: 75.00% x̄: 30.23% x̃: 26.32%
95% mean confidence interval for ls value: -6.67 -5.54
95% mean confidence interval for ls %-change: -32.67% -27.79%
Ls are helped.

total quadwords in shared programs: 1461496 -> 1449768 (-0.80%)
quadwords in affected programs: 273592 -> 261864 (-4.29%)
helped: 1992
HURT: 687
helped stats (abs) min: 8.0 max: 24.0 x̄: 8.76 x̃: 8
helped stats (rel) min: 1.43% max: 50.00% x̄: 16.30% x̃: 11.11%
HURT stats (abs)   min: 8.0 max: 16.0 x̄: 8.31 x̃: 8
HURT stats (rel)   min: 1.92% max: 100.00% x̄: 36.39% x̃: 25.00%
95% mean confidence interval for quadwords value: -4.67 -4.08
95% mean confidence interval for quadwords %-change: -3.95% -1.62%
Quadwords are helped.

total threads in shared programs: 53496 -> 53551 (0.10%)
threads in affected programs: 112 -> 167 (49.11%)
helped: 74
HURT: 19
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00%
95% mean confidence interval for threads value: 0.42 0.76
95% mean confidence interval for threads %-change: 56.83% 81.88%
Threads are helped.

total loops in shared programs: 128 -> 127 (-0.78%)
loops in affected programs: 1 -> 0
helped: 1
HURT: 0

total fills in shared programs: 684 -> 672 (-1.75%)
fills in affected programs: 160 -> 148 (-7.50%)
helped: 2
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 3fedf22b60 pan/bi: Tune lower_vars_to_scratch
Increase the threshold to lower indirect indexing of arrays to scratch memory
all the way up to 256 bytes, which was the lowest power-of-two threshold for
which enabling the pass on Mali-G57 was a win in shaderdb.

It's difficult to tell what threshold is optimal here. The shader-db stats are
based on a rough cycle model that assumes a 16:1 ratio between CVT and
load/store on Valhall, and a 24:1 ratio between arithmetic and load/store on
Bifrost. Those ratios are at most rules of thumb, as the number of cycles
required by a load/store instruction will vary tremendously based on caching and
the memory controller. However, they may well be lower bounds (if those are the
upper bounds on instruction issuing in the Mali shader cores). As such, a large
threshold seems well motivated.

shader-db results on Mali-G52 follow, results on Mali-G57 were similar. Note the
shader that's hurt for spills/fills is *helped* for load/store overall.

cycles helped: 129 -> 98 (-24.03%) (spills: 17 -> 20 (17.65%); fills: 34 -> 40 (17.65%))
ldst helped: 129 -> 98 (-24.03%) (spills: 17 -> 20 (17.65%); fills: 34 -> 40 (17.65%))

total instructions in shared programs: 2415410 -> 2415372 (<.01%)
instructions in affected programs: 1041 -> 1003 (-3.65%)
helped: 3
HURT: 0
helped stats (abs) min: 2.0 max: 31.0 x̄: 12.67 x̃: 5
helped stats (rel) min: 2.08% max: 6.02% x̄: 3.90% x̃: 3.60%

total tuples in shared programs: 1928558 -> 1928527 (<.01%)
tuples in affected programs: 826 -> 795 (-3.75%)
helped: 2
HURT: 1
helped stats (abs) min: 6.0 max: 26.0 x̄: 16.00 x̃: 16
helped stats (rel) min: 3.72% max: 9.68% x̄: 6.70% x̃: 6.70%
HURT stats (abs)   min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.54% max: 1.54% x̄: 1.54% x̃: 1.54%

total clauses in shared programs: 355013 -> 354981 (<.01%)
clauses in affected programs: 220 -> 188 (-14.55%)
helped: 3
HURT: 0
helped stats (abs) min: 2.0 max: 27.0 x̄: 10.67 x̃: 3
helped stats (rel) min: 13.99% max: 21.43% x̄: 16.93% x̃: 15.38%

total cycles in shared programs: 166610.27 -> 166574.90 (-0.02%)
cycles in affected programs: 138 -> 102.62 (-25.63%)
helped: 3
HURT: 0
helped stats (abs) min: 0.4583330000000001 max: 31.0 x̄: 11.79 x̃: 3
helped stats (rel) min: 15.28% max: 65.28% x̄: 34.86% x̃: 24.03%

total arith in shared programs: 73690.13 -> 73690.58 (<.01%)
arith in affected programs: 29.71 -> 30.17 (1.54%)
helped: 1
HURT: 2
helped stats (abs) min: 0.0833339999999998 max: 0.0833339999999998 x̄: 0.08 x̃: 0
helped stats (rel) min: 3.85% max: 3.85% x̄: 3.85% x̃: 3.85%
HURT stats (abs)   min: 0.125 max: 0.4166659999999993 x̄: 0.27 x̃: 0
HURT stats (rel)   min: 1.66% max: 5.17% x̄: 3.42% x̃: 3.42%

total ldst in shared programs: 135611 -> 135571 (-0.03%)
ldst in affected programs: 138 -> 98 (-28.99%)
helped: 3
HURT: 0
helped stats (abs) min: 3.0 max: 31.0 x̄: 13.33 x̃: 6
helped stats (rel) min: 24.03% max: 100.00% x̄: 74.68% x̃: 100.00%

total quadwords in shared programs: 1674599 -> 1674523 (<.01%)
quadwords in affected programs: 838 -> 762 (-9.07%)
helped: 3
HURT: 0
helped stats (abs) min: 2.0 max: 65.0 x̄: 25.33 x̃: 9
helped stats (rel) min: 3.39% max: 15.00% x̄: 9.14% x̃: 9.04%

total spills in shared programs: 37 -> 40 (8.11%)
spills in affected programs: 17 -> 20 (17.65%)
helped: 0
HURT: 1

total fills in shared programs: 190 -> 196 (3.16%)
fills in affected programs: 34 -> 40 (17.65%)
helped: 0
HURT: 1

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig fd021a618f pan/va: Replace MKVEC.v4i8 with MKVEC.v2i8
This is the instruction that the hardware actually supports. Do the rename, use
the more specific accurate model in the IR, and rework the Valhall texturing
code to emit MKVEC.v2i8 instead of MKVEC.v4i8.

Will fix:

   dEQP-GLES31.functional.texture.gather.offset_dynamic.implementation_offset.*

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig c570693c19 pan/va: Pack MKVEC.v2i8 byte lanes
They are in a different place, but the encoding is otherwise as usual. This will
be required for texture gathers with dynamic offsets.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 10301885ab pan/bi: Constant fold MKVEC.v2i8
Constant MKVEC.v2i8 will be generated during texturing on Valhall, just like
constant MKVEC.v4i8 is currently generated.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 2833d0472a pan/bi: Model MKVEC.v2i8
Valhall does not have Bifrost's 4-source MKVEC.v4i8. Instead, it has a (somewhat
limtied) 3-source MKVEC.v2i8. The full MKVEC.v4i8 may be lowered to a pair of
MKVEC.v2i8 instructions.

For good code quality on both Bifrost and Valhall, we need to model both
instructions in their full generality.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 6792b15971 pan/bi: Remove FRSCALE from IR
It's just LDEXP in different clothing.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 21bedd2c97 pan/va: Rename RSCALE to LDEXP
This avoids needless variation from Bifrost. While at it, fix the opcode
definition: there are no abs/neg/swizzle modifiers on the signed integer source,
and there's no clamp. However, there are round and infinity modes, like on
Bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 0da28ee2c7 pan/va: Implement sample positions FAU packing
This will fix:

dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.at_sample_position.default_framebuffer

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 9dd0bc92b5 pan/va: Lower FADD_RSCALE.f32 to FMA_RSCALE.f32
We generate FADD_RSCALE.f32 in our sample variables implementations. Valhall
doesn't have a dedicated FADD_RSCALE.f32 implementation, it should be aliased to
FMA_RSCALE.f32. Handle that alias in isel lowering. This will fix:

   dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_offset.*

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 1a882ecdab pan/bi: Align accesses with packed TLS
When lowering vars to scratch, we need to be careful with alignment on Valhall,
where packed TLS access must not straddle a 16-byte boundary. Fixes regressions
when enabling indirect access to temps on Valhall.

Fixes: 6761dbf891 ("panfrost: Use packed TLS on Valhall")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 5ee1179c94 pan/bi: Fix LD_BUFFER.i16 definition
This was missing the message, breaking UBO-to-push and who-knows-what-else, when
enabling fp16 const buffers.

Fixes: 3dc2095b07 ("pan/bi: Model LD_BUFFER instructions")
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17101>
2022-06-21 22:42:34 +00:00
Alyssa Rosenzweig 40accfd3b7 pan/va: Unit test va_mark_last
This pass is super easy to unit test, so we have no excuse not to test
thoroughly. va_mark_last only inserts annotations in a shader without any
annotations, so our test cases are simply annotated shaders. The CASE macro just
has to compare the case against the case with the annotations stripped and added
back with va_mark_last.

In retrospect, I should have used that technique for the flow control insertion
tests too.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig 4b7e337b45 pan/va: Mark last register reads
On Valhall, register reads may be marked as "last" [1]. Setting the last flag
promises the hardware that the value of the register is no longer required. This
may enable hardware optimizations. In particular, it may permit the hardware to
avoid register file writes if a write to the marked register is still in the
forwarding buffer. This may improve power efficiency.

In principle, this is trivial: run liveness analysis and mark killed sources,
like we would in an SSA-based register allocator. In practice, there are a few
wrinkles to avoid hazards around staging registers and 64-bit register pairs,
requiring some additional data flow analysis and fix ups. However, nothing here
is particularly "hard", and all the ideas are already in use for the Bifrost
scheduler and the Bifrost/Valhall scoreboard analyses.

[1] In Mesa's compiler, this is called discard for historical reasons.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig d4377e1255 pan/va: Use validate_register_pair for BLEND pack
Instead of open-coding. Noticed by inspection.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig b48933d641 pan/va: Include BLEND for va_swap_12
This helps "contain the crazy" and avoids special casing BLEND in compiler
passes. The Valhall instruction is roughly the same as its Bifrost counterpart,
as long as we fix up the source order (as we already do for bitwise operations)
everything works out.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig 738a1572d2 pan/va: Move va_flow_is_wait_or_none to common
We want to use this helper in the "mark last" pass too.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig 1b29a99b7b pan/va: Add header guards to valhall_enums.h
Otherwise we can't #include in multiple places.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig c5a8736552 pan/bi: Constify bi_is_staging_src argument
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig 2075bff4e8 pan/bi: Mark bi_postra_liveness_ins as MUST_CHECK
Post-RA liveness relies on the caller updating the live variable with the
results of bi_postra_liveness_ins. It is not automatic, as with regular
liveness. This means ignoring the result of bi_postra_liveness_ins is surely an
error. Mark it as MUST_CHECK to catch that error at compile time.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig 43d00c2971 pan/va: Unit test barrier handling
Add a unit test for the quirk discovered in the previos commit, because this
will cause flakes (instead of fails) if we get it wrong. Better have a
deterministic fail mode.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig 8c6b9b9c92 pan/va: Workaround quirk of barrier handling
For some unknown reason, waiting for general slots (at least for memory stores)
doesn't work properly on a BARRIER instruction. We need to wait for all general
slots right before issuing the BARRIER in addition to the general wait on the
BARRIER itself. I don't know if this is a hardware bug or some hideous
gate-saving quirk, but I observe the Mali-G78 DDK using the same workaround,
which implies this really is necessary.

Fixes rare flakes in:

   dEQP-GLES31.functional.compute.shared_var.work_group_size.float_128_1_1

Note that the flakes from that test are extremely timing dependent. Without this
change, that test is racy but we almost always win the race. Reproducing the
issue reliably requires high system load (e.g. running the CTS in the
background) and simultaneously running that test a large number of times.

Minimal shader-db impact. In particular, no cycle count regressions.

total instructions in shared programs: 2699419 -> 2699458 (<.01%)
instructions in affected programs: 22014 -> 22053 (0.18%)
helped: 2
HURT: 25
helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.64 x̃: 1
HURT stats (rel)   min: 0.07% max: 2.82% x̄: 0.69% x̃: 0.49%
95% mean confidence interval for instructions value: 1.01 1.87
95% mean confidence interval for instructions %-change: 0.38% 0.88%
Instructions are HURT.

total cvt in shared programs: 14468.81 -> 14469.42 (<.01%)
cvt in affected programs: 221.33 -> 221.94 (0.28%)
helped: 2
HURT: 25
helped stats (abs) min: 0.015625 max: 0.015625 x̄: 0.02 x̃: 0
helped stats (rel) min: 0.18% max: 0.18% x̄: 0.18% x̃: 0.18%
HURT stats (abs)   min: 0.015625 max: 0.046875 x̄: 0.03 x̃: 0
HURT stats (rel)   min: 0.10% max: 4.44% x̄: 1.06% x̃: 0.79%
95% mean confidence interval for cvt value: 0.02 0.03
95% mean confidence interval for cvt %-change: 0.57% 1.36%
Cvt are HURT.

total quadwords in shared programs: 1462496 -> 1462528 (<.01%)
quadwords in affected programs: 4632 -> 4664 (0.69%)
helped: 0
HURT: 4
HURT stats (abs)   min: 8.0 max: 8.0 x̄: 8.00 x̃: 8
HURT stats (rel)   min: 0.35% max: 7.69% x̄: 4.03% x̃: 4.03%
95% mean confidence interval for quadwords value: 8.00 8.00
95% mean confidence interval for quadwords %-change: -2.71% 10.76%
Inconclusive result (%-change mean confidence interval includes 0).

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig 7fa545528d pan/va: Simplify insert flow tests
Test cases for insert flow are necessarily the reference test cases with the
NOPs stripped out. That means we don't need to duplicate the test bodies.
Deduplicate.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Alyssa Rosenzweig 35fcf8d3d7 pan/va: Move VA_NUM_GENERAL_SLOTS to common
This definition is a hardware property. It's not specific to the flow control
insertion pass, so move it to common code where other passes can use it.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17091>
2022-06-21 22:19:59 +00:00
Emma Anholt 5f09b1ebe9 ci/bare-metal: Add test phase timeouts to all boards.
This should help with "marge got stuck for an hour and all I got was this
failed job with no results/" when a system intermittently wedges.

This replaces the BM_POE_TIMEOUT ("did we get something on serial in the
last 3 minutes?") that rpi had, in favor of checking that the whole test
job gets through in 20 minutes.

Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
2022-06-21 21:38:25 +00:00
Danylo Piliaiev 909e7aaf57 tu: Reset xfb_used at the end of a renderpass
Otherwise xfb_used could be true until the end of command buffer,
which is not what we intended it to be.

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17162>
2022-06-21 21:15:10 +00:00
Emma Anholt 086faecbba turnip: Document some fields about resolves.
I noticed the unk12 pattern, and cwabbott and danylo had figured out some
more details.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17126>
2022-06-21 19:40:58 +00:00
Lionel Landwerlin efc398c722 vulkan/wsi: fix crash with debug names on swapchain
If you set a name of on a swapchain object, because the base object
struct has not been initialized with a VkDevice,
vk_object_base_finish() will segfault when trying to free the object
name.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: cb1e0db23e ("vulkan/wsi: Make wsi_swapchain inherit from vk_object_base")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17165>
2022-06-21 17:05:10 +00:00
Lionel Landwerlin 4f10eddf77 anv: fix index buffer emission
In the following case :

  vkCmdBindPipeline(compute_pipeline);
  vkCmdDispatch(...);
  vkCmdBindPipeline(graphics_pipeline);
  vkCmdBindIndexBuffer(buffer)
  vkCmdDraw(...);

We're emitting the 3DSTATE_INDEX_BUFFER instruction while the HW is
still in GPGPU mode, because we're dealing the pipeline selection to
vkCmdDraw().

Found while debugging Age Of Empire 4, HW is hung on
3DSTATE_INDEX_BUFFER instruction.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17153>
2022-06-21 16:24:10 +00:00
Timur Kristóf 21ea19d504 zink: Always enable depth clamping, make depth clipping independent.
Enabling depth clamping ensures that the Vulkan driver respects
the depth range that zink sets on viewport objects in zink_draw.

When depth clipping is required, use VK_EXT_depth_clip_enable to
enable that independently of depth clamping.

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16929>
2022-06-21 15:44:54 +00:00
Timur Kristóf 82e08f6b1e zink: Enable the VK_EXT_depth_clip_enable extension.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16929>
2022-06-21 15:44:54 +00:00
Timur Kristóf 810135fb42 gallium/u_blitter: Fix depth.
Fix the transform to make sure it doesn't disturb the depth range
of the blitted image. Set the Z coordinates of the vertices
by hand instead of relying on the transform to do it.

This is a pre-requisite to Zink always enabling depth clamping.

Fixes: 26c6640835
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16929>
2022-06-21 15:44:54 +00:00
Sarah Walker ee491967c3 pvr: Update for firmware 1.17@6256262
Signed-off-by: Sarah Walker <sarah.walker@imgtec.com>
Reviewed-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17139>
2022-06-21 15:09:10 +00:00
Danylo Piliaiev 48540caec9 tu: Disable sample counting for 3d blits during occlusion query
Per Vulkan spec only "Draw" commands should be counted towards
occlusion query.

Apparently RB_SAMPLE_COUNT_CONTROL::UNK0 bool controls whether
sample counting is enabled, so we could use it to disable
sample counting for 3d blits which are sometimes used for
clear/copy/blit/gmem-store/resolve operations.

Fixes GL CTS tests running through Zink:
 dEQP-GLES3.functional.occlusion_query.depth_clear
 dEQP-GLES3.functional.occlusion_query.depth_clear_stencil_clear
 dEQP-GLES3.functional.occlusion_query.scissor_depth_clear_stencil_clear
 dEQP-GLES3.functional.occlusion_query.scissor_stencil_clear
 dEQP-GLES3.functional.occlusion_query.stencil_clear

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6559

Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17138>
2022-06-21 13:13:36 +00:00
Gert Wollny 0c3fae4e6e virgl: Don't let ntt optimize the register allocation
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15995>
2022-06-21 11:24:09 +00:00