This is to fix serious performance drop of texture_upload/
texture_resue relative items in chromeos glbench test.
Staging texture is not efficient for CPU uploading.
Signed-off-by: Jasber Chen <yipeng.chen@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13306>
It eliminates "False Sharing" for atomic operations. (see wikipedia)
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11618>
DCN only supports an extent < 4K on !64B && 128B.
Signed-off-by: Joshua Ashton <joshua@froggi.es>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13056>
tc already calculates all the rebinding that needs to be done on a given
context, so (some of) this info can be passed on to drivers to enable
optimizations
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11245>
Uncached access can be slow if the box is not aligned nicely.
Also, caching in L2 might enable bigger PCIe bursts.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
It's only used by buffers and only zink uses it privately for textures too.
This is part of removing u_resource_vtbl.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10659>
This is the initial step towards removing u_resource_vtbl.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10659>
Instead of having a special case for YUV formats in
si_query_dmabuf_modifiers, let ac_get_supported_modifiers handle
them. Keep setting external_only = 1 for YUV formats, since we
can only sample from such formats (we can't use them as render
targets).
This shouldn't change si_query_dmabuf_modifiers's behavior, because
for YUV formats ac_get_supported_modifiers will return a single
LINEAR modifier.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Leo Liu <leo.liu@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10134>
The retile map is removed and replaced by direct DCC address computations
in the retile shader using the new function ac_nir_dcc_addr_from_coord.
The RADV code is disabled.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
Set the final value in si_texture_create_object, so that other places
don't have to derive it redundantly.
The only thing to remember is that HTILE stencil can be enabled when
stencil is not present, and it can be disabled when stencil is present
due to various workarounds.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
Only one bit is set in alignments, so store the bit offset (log2) and
change the type from uint32_t to uint8_t.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
Images are always aligned to 256B (enforced by register and descriptor
fields) and limited to 40-bit addresses. This saves some space.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
Alignments are always 2^n, so store n = log2(alignment). The next commit
will take advantage of the saved space.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9809>
This will allow removing the winsys pointer from buffers.
The amdgpu winsys adds dummy_ws to get radeon_winsys because there can be
no radeon_winsys around (e.g. while amdgpu_winsys is being destroyed), but
we still need some way to call buffer functions.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9809>
This will allow removing the winsys pointer from buffers.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9809>
- no 3D and cube textures
- no mipmapping
- no border color
- image_sample is the only supported opcode with a sampler (behaves like _lz)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>
This was implemented in 1d3bffaf9c,
but missing the WRITE_COMPRESS_ENABLE bit, then disabled by
4dc6ed2a59040f04648eadbffeb1522587d00f3.
This commits reimplements it to:
- avoid disabling dcc when uploading FP16 textures
(see si_use_compute_copy_for_float_formats)
- being able to use compute to upload textures in more cases, rather
than using the blit path
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8958>
Decreasing the time spent in radeon_cs_memory_below_limit is the motivation.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8794>