Replaces some FILE_ADDRESS with LAST_REGISTER_FILE and makes RA not choke
on instructions using TS values.
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11061>
Zink uses this, as it doesn't need to differentiate all the entrypoints.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11045>
Set st_rps_bits in hevc message buffer and set corresponding flag to indicate
that st_rps_bits will be used for parsing the short_term_ref_pic_set structure.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10889>
vdpau interface doesn't provide st_rps_bits, it uses NumDeltaPocsOfRefRpsIdx
instead. So disabling the flag to indicate st_rps_bits will not be used.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10889>
Get st_rps_bits from VAPictureParameterBufferHEVC, and set the flag that
indicates st_rps_bits will be used for parsing the short_term_ref_pic_set
structure
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10889>
st_rps_bits is used for accelorater to skip parsing the short_term_ref_pic_set
structure, which is needed for HEVC decode.
Signed-off-by: Boyuan Zhang <boyuan.zhang@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10889>
When drawing using util_translate_prim_restart_ib, zink explicitly
ignores pipe_draw_start_count_bias::start, because
util_translate_prim_restart_ib used to create a new index-buffer without
padding at the start.
This makes a lot of sense, because creating a padded index buffer is
just wasteful.
So let's walk back on the choice of starting to pad the output buffer.
Fixes: 1272c2e052 ("util/prim_restart: fix util_translate_prim_restart_ib")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4851
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11059>
OpenGL allows reinterpreting 2D array textures as cubemaps,
so we need to always set the cube-compatible flag for them.
This fixes a piglit test on Lavapipe.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10913>
We need to check if the number of samples are valid for the image
first if we've going to set the cube-compatible bit.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10913>
Now that we handle the cubemap-array views properly, we no longer need
to do this. It's allowed in Gallium to create cubemap views of 2d
arrays, so this should work fine.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10913>
Spec requires apps to use the size returned from
vkGetAndroidHardwareBufferPropertiesANDROID for AHB import, which
includes dedicated image/buffer import and non-dedicated buffer import.
Spec requires venus to use the size from image and buffer memory
requirement for dma_buf fd import if it's dedicated. If not dedicated,
the actual payload size should be used.
For AHB export allocation of VkImage, the app provided size will be 0,
and internally we must use the size from image memory requirement.
For AHB export allocation of VkBuffer, the app provided size comes from
buffer memory requirements and it can be smaller than the actual AHB
size due to page alignment. Internally that's the size we must use.
For AHB import, app must use the size from AHB prop query. Internally,
we have to override that with the size from memory requirement if the
import operation is dedicated to a VkImage or a VkBuffer. If not
dedicated, the actual payload size should be used.
The not working scenario is:
1. App creates an AHB backed VkBuffer, and the exported AHB size is
larger than the buffer memory requirement (very common).
2. App imports the AHB without a dedicated VkBuffer. Then the entire
AHB payload will be imported and the mmap_size might increase.
Test: dEQP-VK.api.external.memory.android_hardware_buffer.*
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11055>
For importing an dma_buf fd, export info is not required. Leaving the
bo_flags missing VIRTGPU_BLOB_FLAG_USE_SHAREABLE bit as well as the
VIRTGPU_BLOB_FLAG_USE_CROSS_DEVICE bit. Upon querying fd properties,
DMA_BUF handle type will be specified which generates a new bo_flags
not matching the prior one.
This patch aligns the above 2 bits for the dma_buf import path.
Test: vkGetAndroidHardwareBufferPropertiesANDROID should succeed with
a valid AHB with AHARDWAREBUFFER_FORMAT_BLOB format.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11055>
It is valid to not have a sampler view declaration for the corresponding
sampler in a TGSI shader, and hence we should not rely on the sampler view
declaration to determine if we need to adjust the unnormalized coordinates
for texture rectangle sampling.
This patch is to prep for tgsi shaders that are translated from nir which
in many cases do not issue sampler view declarations.
Fixes: 584b107037 ("st/mesa: Drop the TGSI paths for drawpixels and use nir-to-tgsi")
Reviewed-by: Neha Bhende <bhenden@vmware.com
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11011>
Sometimes, TGSI shader doesn't have SVIEW declaration if it is not
utilize in shader. In such cases, declare those resources with the
help of information stored in shader key.
Fixes: 584b107037 ("st/mesa: Drop the TGSI paths for drawpixels and use nir-to-tgsi")
Tested with piglit, gleretrace
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11011>
NumUpdates is used to indicate whether a VAO is static or dynamic, but if
we restored an older value, it could incorrectly indicate that it's not
dynamic.
This fixes a small performance issue with torcs.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10994>
This improves performance in torcs by 6%.
The idea is to skip saving and restoring vertex attribs and bindings that
have never been changed.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10994>
this hook usually requires mapping a resource, which will trigger a
trace dump internally, which requires locking the output mutex, so
finish the trace dump before calling it
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11052>
A simple post-RA optimization which takes advantage of the
s_cbranch_vccz and s_cbranch_vccnz instructions.
It works on the following pattern:
vcc = v_cmp ...
scc = s_and vcc, exec
p_cbranch scc
The result looks like this:
vcc = v_cmp ...
p_cbranch vcc
Fossil DB results on Sienna Cichlid:
Totals from 4814 (3.21% of 149839) affected shaders:
CodeSize: 15371176 -> 15345964 (-0.16%)
Instrs: 3028557 -> 3022254 (-0.21%)
Latency: 21872753 -> 21823476 (-0.23%); split: -0.23%, +0.00%
InvThroughput: 4470282 -> 4468691 (-0.04%); split: -0.04%, +0.00%
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
This commit adds the skeleton of a new ACO post-RA optimizer,
which is intended to be a simple pass called after RA, and
is meant to do code changes which can only be done
after RA.
It is currently empty, the actual optimizations will be added
in their own commits. It only has a DCE pass, which deletes
some dead code generated by the spiller.
Fossil DB results on Sienna Cichlid:
Totals from 375 (0.25% of 149839) affected shaders:
CodeSize: 2933056 -> 2907192 (-0.88%)
Instrs: 534154 -> 530706 (-0.65%)
Latency: 12088064 -> 12084907 (-0.03%); split: -0.03%, +0.00%
InvThroughput: 4433454 -> 4432421 (-0.02%); split: -0.02%, +0.00%
Copies: 81649 -> 78203 (-4.22%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7779>
This reverts commit eef5409df4.
I suspect this has caused a lot of the CI instability today -- some flakes
were already added, but the a630-traces job is still flaking. Revert
until a fix makes it stable.
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11024>