This also needs another lines fix, but at least align the code
with tri and points
Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
Fixes one case in
dEQP-VK.rasterization.primitives_multisample_4_bit.no_stipple.points
Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
You have to count the stats pre-culling here.
Just like dc261cdd42 did for lines.
VK-GL-CTS dEQP-VK.query_pool.statistics_query.clipping_primitives*point_list
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
Fixes:
dEQP-VK.wsi.xcb.swapchain.acquire.too_many
Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
I'd only half ported private memory support, finish the job.
Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
Until recently, we ended up using u_blitter here, because
info->render_condition_enable was always true here. But when we recently
fixed that overly broad check, this broke.
So let's fix layered-resolves, by actually checking if the resource has
layers respect them in that case, similar to what we do in blit_native.
Fixes: 19906022e2 ("zink: more accurately track supported blits")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3843
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7737>
This optimizes v_add(c, v_lshlrev(a, b)) to v_mad_u32_u24(b, 1<<a, c)
if 'a' is a constant (less than or equal to 6 to avoid creating
literals) and 'b' known to be a 16-bit or a 24-bit value.
On GFX9+, this is already optimized to v_lshl_add_u32.
No fossils-db changes.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
It will allow to combine v_add+s_lshl or v_add+v_lshlrev to
v_mad_u32_u24 on GFX6-8 if operands are known to be 16-bit or 24-bit.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
To indicate that the upper 8-bits are always 0 to optimize more MADs.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
It's less code and makes the configuration easier to fine tune.
Fixes: ff05da7f8d
("microsoft: Add CLC frontend and kernel/compute support to DXIL converter")
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7699>
The warning is a bit misleading about where it shows up.. it complains
about the shader key, due to shader key being calculated from (among
other things) stream_output state that had some uninitialized garbage
in the padding.
==84572== Uninitialised byte(s) found during client check request
==84572== at 0x60548E8: blob_write_bytes (blob.c:163)
==84572== by 0x6534EF7: compute_variant_key (ir3_disk_cache.c:111)
==84572== by 0x6535143: ir3_disk_cache_retrieve (ir3_disk_cache.c:171)
==84572== by 0x654D82F: create_variant (ir3_shader.c:251)
==84572== by 0x654DA2B: ir3_shader_get_variant (ir3_shader.c:301)
==84572== by 0x645B2CB: ir3_shader_variant (ir3_gallium.c:113)
==84572== by 0x645B7EB: ir3_shader_create (ir3_gallium.c:219)
==84572== by 0x645BAA7: ir3_shader_state_create (ir3_gallium.c:285)
==84572== by 0x6506003: fd6_shader_state_create (fd6_program.c:1136)
==84572== by 0x64676C7: assemble_tgsi (freedreno_program.c:105)
==84572== by 0x64679DF: fd_prog_init (freedreno_program.c:188)
==84572== by 0x6506157: fd6_prog_init (fd6_program.c:1172)
==84572== Address 0xeff1588 is 424 bytes inside a block of size 480 alloc'd
==84572== at 0x4866FA4: malloc (vg_replace_malloc.c:307)
==84572== by 0x605D46F: ralloc_size (ralloc.c:133)
==84572== by 0x605D52F: rzalloc_size (ralloc.c:166)
==84572== by 0x654DFF7: ir3_shader_from_nir (ir3_shader.c:473)
==84572== by 0x645B6C7: ir3_shader_create (ir3_gallium.c:182)
==84572== by 0x645BAA7: ir3_shader_state_create (ir3_gallium.c:285)
==84572== by 0x6506003: fd6_shader_state_create (fd6_program.c:1136)
==84572== by 0x64676C7: assemble_tgsi (freedreno_program.c:105)
==84572== by 0x64679DF: fd_prog_init (freedreno_program.c:188)
==84572== by 0x6506157: fd6_prog_init (fd6_program.c:1172)
==84572== by 0x64CB36F: fd6_context_create (fd6_context.c:154)
==84572== by 0x59D93BB: st_api_create_context (st_manager.c:917)
Somehow this was showing up with dEQP-GLES31.info.vendor but not other
things.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>
1) Propagate the change to only emit markers in debug builds (and add
the WFI that ensures they are synchronized with GPU. We could
consider dropping them entirely, since the GPU devcoredump support
in newer kernels is more useful. But it is still an occasionally
useful fallback.
2) Use p_atomic_inc_return() to placate helgrind
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>
v2: Verify that PIPE_CAP_FRONTEND_NOOP is available before calling vfunc (Icecream95)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7640>
Assigning an array reg removes IR3_REG_ARRAY, which means that
definitions and uses can't be tracked back to the array register's name
and liveness for the components of the array aren't correctly
calculated. To fix this we delay assigning array registers until the
scalar pass.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7711>
Caused to return early wrongly on CmdPushConstants with some tests
using several calls to that method. As we are here we are also
replacing the (void *) casting at the memcpy below.
Fixes: e1c8041cde ("v3dv: try harder to skip emission of redundant state")
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7718>
Sometimes UMR logs can't be dumped and you would get permission
denied, even if the UMR binary has the setuid bit enabled.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>
Like the name, version, as well as the engine and the API version.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>
If for some reasons the driver can't generate the hang report properly.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>
It can only be done if a TCS input is accessed without indirect indexing and
with gl_InvocationID as the vertex index, and the number of VS and TCS threads
is the same.
This eliminates LDS stores and loads for VS->TCS IO, reducing shader lifetime
and LDS traffic.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
Instead of:
if (VS) {
VS;
}
if (TCS) {
TCS;
}
Do this if the number of threads is the same in VS and TCS:
exec = enabled_threads;
VS;
TCS;
Skipping declare_vb_descriptor_input_sgprs is needed to match the VS return
values.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
Instead of:
if (TCS) {
TCS;
}
if (TCS && epilog) {
epilog;
}
Do:
if (TCS) {
TCS;
if (epilog) {
epilog;
}
Only monolithic shaders can do it.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
This increases occupancy when the LDS size is e.g. 20K for 3 waves.
If we limit the size to 16K, we can fit 2 workgroups with 2 waves each,
so 4 waves in total.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
Previously it was 16 and bigger patches would always trim the patch count
needlessly.
There are 2 variables to consider:
- lane occupancy
- LDS usage (limiting wave occupancy)
If LDS size is 32 KB (max limit per CU) for 3 waves and we can't maximize
occupancy, it's better to leave some lanes unoccupied because using 2
waves would decrease the LDS size to 21 KB, which is not enough to fit
another workgroup on the CU.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
With triangles and 3 HS waves, 3 lanes were unoccupied. Adjust the SGPR
encoding to allow 1 more triangle to fit there.
Some of the fields are not large enough, but they weren't large enough
before either.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
I think it only made the pass return false if there was a barrier
Fixes: 2832bc972b - ac/nir_to_llvm: add ac_are_tessfactors_def_in_all_invocs()
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>