This flag is used by the nv50, r600, and svga backends for instruction
exactness. It was easier to plumb it in as an override in tgsi_ureg than
to make all of ALU instruction emit do it.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14201>
Cheatsheet:
_mesa_has_ARB_texture_cube_map() becomes (true &&
ctx->Extensions.Version >=
_mesa_extension_table[...].version[ctx->API]). The last value is 0 when
ctx->API is API_OPENGL_COMPAT and ~0 otherwise. The whole function
effectively becomes (ctx->API == API_OPENGL_COMPAT).
_mesa_has_OES_texture_cube_map() becomes (true &&
ctx->Extensions.Version >=
_mesa_extension_table[...].version[ctx->API]). The last value is 0 when
ctx->API is API_OPENGLES and ~0 otherwise. The whole function
effectively becomes (ctx->API == API_OPENGLES).
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14203>
Having an integer destination type instead of a float destination type
confuses the SWSB code. This causes problems on some Intel GPUs. Fix
this by using the correct type in the destination of the F32TOF16
opcode.
Gfx7 doesn't have the HF type, so continue to emit W on that platform.
The assertions in brw_F32TO16 (brw_eu_emit.c) are updated to reflect
this. In scalar mode, UD is never emitted as a destination type for
this opcode, so remove it from the allowed types in the assertion.
I also condidered doing something like de55fd358f ("intel/fs/xehp:
Teach SWSB pass about the exec pipeline of
FS_OPCODE_PACK_HALF_2x16_SPLIT."), but Curro recommended that just using
the correct types is a better fix. I agree.
v2: Add missing changes to fs_generator::generate_pack_half_2x16_split.
I'm not sure how I (and the Intel CI) missed that the first time. :(
v3: Fix copy-and-paste issue in the v2 fix. Noticed by Tapani.
Reviewed-by: Francisco Jerez <currojerez@riseup.net> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14181>
Shmems are allocated internally and are only for CPU access. They can
be easily cached.
Venus have 4 sources of shmem allocations
- the ring buffer
- the reply stream
- the indirection submission upload cs
- one cs for each vn_command_buffer
The first one is allocated only once. The other three reallocate
occasionally. The frequencies depend on the workloads.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14179>
It suballocates from a shmem pool owned by vn_instance. The goals are
to speed up shmem allocations for VkCommandBuffer and to reduce the
number of BOs. Both are crucial when shmems are HOST3D BOs, because
they require roundtrips to the renderer to allocate and they take up KVM
memslots.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14179>
It provides shmem suballocations. It is designed to be used with
short-lived shmems. A long-lived shmem can hold on to some large
allocation while only using a likely small region of the large
allocation.
v2: cleanups suggested by Yiwei
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com> (v1)
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14179>
Move helpers built on top of vn_renderer.h to the new files.
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Reviewed-by: Yiwei Zhang <zzyiwei@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14179>
They are being used on integer to integer stores. From Vulkan sec,
final paragraph of 16.4.4 "Texel Output Format Conversion":
"Each component is converted based on its type and size (as
defined in the Format Definition section for each
VkFormat). ... Integer outputs are converted such that their value
is preserved. The converted value of any integer that cannot be
represented in the target format is undefined."
I didn't find a equivalent quote for OpenGL as all conversion entries
are forcused on float to integer, fixed-point to integer, etc, and not
on integer to integer. Didn't find any test failure with this change.
We didn't get any shader-db stats change with shaderdb (even
overriding to OpenGL 4.4 to get more shaders built), so as a reference
Vulkan shader-db stats with the pattern
dEQP-VK.image.*.with_format.*.*
total instructions in shared programs: 37534 -> 36522 (-2.70%)
instructions in affected programs: 12080 -> 11068 (-8.38%)
helped: 241
HURT: 0
Instructions are helped.
total uniforms in shared programs: 9100 -> 8550 (-6.04%)
uniforms in affected programs: 3004 -> 2454 (-18.31%)
helped: 229
HURT: 0
total max-temps in shared programs: 6110 -> 6014 (-1.57%)
max-temps in affected programs: 402 -> 306 (-23.88%)
helped: 43
HURT: 0
Max-temps are helped.
total nops in shared programs: 1523 -> 1526 (0.20%)
nops in affected programs: 21 -> 24 (14.29%)
helped: 3
HURT: 6
Inconclusive result (value mean confidence interval includes 0).
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14194>
When the BO is NULL, AMDGPU will reset the PTE VA range to the initial
state. Otherwise, it will first unmap all existing VA that overlap the
requested range and then map. This seems better than using MAP/UNMAP.
This reduces stuttering in Forza Horizon 5.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14116>
This shouldn't be necessary because applications have to manage
resources and memory themselves.
This also prevented memory to be freed if an application doesn't unbind
a sparse memory object and free it, which is legal as long as the
resource isn't used afterwards.
This was introduced to unmap the sparse mappings when destroying
a virtual BO, but now that the driver uses OP_CLEAR it's no longer
needed.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14116>
When we get into create_beginend_table, ctx->Exec only contains nops
set by _mesa_alloc_dispatch_table. This function calls
_mesa_alloc_dispatch_table too, so table and ctx->Exec are identical,
and then it copies identical entries from one table to the other.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14000>
Function pointers were first set in GLvertexformat, and then
GLvertexformat was copied to the dispatch.
This just sets the function pointers in the dispatch directly,
skipping the intermediate GLvertexformat structure.
The code with SET_* calls is autogenerated by api_vtxfmt_init_h.py.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14000>