Undefine __STDC_VERSION__ for C files to avoid mismatch with C11/C17.
Define __STDC_VERSION__ for C++ files to use <inttypes.h> path.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10196>
Otherwise, handle_loop_phis() might pass it to handle_live_in() and then
we could have two phis for this variable.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Fixes: 7c64623e94 ("aco/ra: refactor SSA repairing during register allocation")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10236>
Prior to supporting VK_EXT_descriptor_indexing all of our descriptor
limits where below 64k which fitted a uint16_t. Now all of those can
go up to 2^20 entries so we need 32bits indexes to keep track of them.
This change leaves the dynamic indexes at 16bits. We could arguably
bump them too, up to the reviewer's taste.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 6e230d7607 ("anv: Implement VK_EXT_descriptor_indexing")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4636
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10228>
Consider a simple loop that does a series of texture instructions and
then reduces the results:
vec4 sum = vec4(0);
for (int i = 0; i < N; i++) {
sum += texture(...);
}
Assume that the loop is unrolled and we schedule the resulting basic
block. Right now, after we schedule the first texture instruction, the
only instructions available to schedule that don't incur a sync are the
instructions to setup the second texture instruction. So we keep picking
the texture instructions, no matter how large N is, resulting in a
pathological schedule for register pressure when N is very large:
sum1 = texture(...);
sum2 = texture(...);
sum3 = texture(...);
...
sum = sum1 + sum2 + sum3 + ...;
In particular this happens with some CTS tests for VK_EXT_robustness2,
where a loop like that with many iterations is marked as [[unroll]],
forcing NIR to unroll it.
This solution is a balance between the current approach and always
scheduling for register pressure (and ignoring sync's). We only allow a
certain number of texture fetches to be in flight before considering
textures to "sync", even though they don't really, both because they
likely *will* sync in reality (overflowing the internal queue of waiting
texture instructions) and because at some point we need the normal
algorithm to kick in and start lowering register pressure.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>
Once we insert a use of a given tex or SFU instruction, then we must
wait for that tex/SFU instruction (as well as all earlier ones) to
complete, so we shouldn't penalize further uses, even if a subsequent
tex/SFU instruction gets scheduled after the first use. This especially
matters after the next commit when we start forcibly breaking up long
sequences of texture instructions, since if we schedule a group of 8
texture instructions then we want to schedule the uses of those
instructions in parallel with the next 8 texture instructions to reduce
register pressure.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>
Similar to the previous commit, we should also verify that the
source-format support linear-filter if we try to blit with it.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10234>
Some Vulkan-drivers don't support blitting between all formats and
layouts. So let's verify this while blitting, and fall back to the
normal rendering code-path instead.
This fixes a crash on start-up in OpenArena on V3DV.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10234>
In most games I tested we use 32 MiB of cmdbuffers+cmd upload buffers
at most. Especially since we have mutable descriptors it seems
somewhat unlikely anything else will eat it up so be a bit more
aggressive allocating them in VRAM.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10198>
If first_frame_done isn't set, but fence is NULL, we end up dereferncing
that NULL-pointer.
This can happen in the case where the first submitted batch has no work,
and pfence was passed as a NULL-pointer.
While we're at it, simplify the check with the surrounding code, which
also checks for a NULL-pointer here.
Fixes: e93ca92d4a ("zink: force explicit fence only on first frame flush")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10235>
Late export is theoretically better if used with LATE_ALLOC,
but in practice, the early export has an advantage of
lower register usage, therefore more concurrent waves.
The idea of this commit is that "small" shaders benefit from early
primitive export more, due to being able to launch much more waves.
Let's consider a NIR shader "small" when it has only 1 block.
This yields both better performance, and better stats, than always
using late export.
Fossil DB on Sienna:
Totals from 12807 (8.76% of 146265) affected shaders:
VGPRs: 609128 -> 620216 (+1.82%); split: -0.01%, +1.83%
SpillSGPRs: 1458 -> 1538 (+5.49%)
CodeSize: 37028204 -> 37019320 (-0.02%); split: -0.17%, +0.14%
MaxWaves: 282902 -> 278516 (-1.55%)
Instrs: 7163142 -> 7162925 (-0.00%); split: -0.18%, +0.18%
VClause: 169285 -> 169547 (+0.15%); split: -1.15%, +1.30%
SClause: 267373 -> 267151 (-0.08%); split: -0.24%, +0.16%
Copies: 446442 -> 444567 (-0.42%); split: -2.68%, +2.26%
Branches: 156245 -> 156195 (-0.03%); split: -0.30%, +0.26%
PreSGPRs: 434701 -> 447396 (+2.92%)
PreVGPRs: 527783 -> 540527 (+2.41%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>
The user-set priority of shaders matters very little, but we hope
this might still help speed up VS input loads especially.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>
We learned that the gs_alloc_req is not actually when the export
space allocation happens. So it makes no sense to prioritize it.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10106>
This fixes basic rendering on top of V3DV, which doesn't seem to expose
the cached memory we expect and love.
Fixes: 598dc3dca4 ("zink: use cached memory for all resources when possible")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10230>
Previously, every wave had multiple active lanes read the LDS, and
the data was processed by VALU DPP instructions.
Now, only the first lane reads the LDS in order to avoid bank
conflicts, and the results are processed by SALU.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10155>
The key passed to _mesa_hash_table_search() is wrong, fix it.
Fixes: 8ba2f9f698 ("panfrost: Create a blitter library to replace the existing preload helpers")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10232>
This seems to simply be a mixup of what utility function to use.
util_clear_render_target clears on the CPU, whereas
util_blitter_clear_render_target clears on the GPU. Because we do the
zink_blit_begin dance, it seems reasonable to assume the latter was
intended.
Fixes: 622f8f6ed5 ("zink: add a pipe_context::clear_texture hook")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10211>
This is possible again thanks to
https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9955 , and
this MR requires rebuilding all templates based docker images anyway,
so we can pull in the latest templates for free.
We need to exclude /dev/* when unpacking rootfs tarballs for the
arm_test image, since x86 container build jobs do not allow mknod
anymore with current templates. The baremetal test jobs have another
filesystem mounted on /dev anyway.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9833>
Among other things, this gets us GCC 10 (was 6).
Requires some changes to third party components we use:
* Install apitrace (& waffle) from Debian; was hitting issues with the
local build, and it's the same version 9.0 anyway.
* Update Fossilize to a newer commit which builds with GCC 10.
* apt.llvm.org repositories are no longer needed.
* Use an SPIRV-LLVM-Translator commit which builds with LLVM 11.0.1.
* Install XCB packages from Debian, 1.13 fails to build with Python 3.9.
* Install wayland-protocols from Debian, 1.12 is too old for
libgtk-3-dev in bullseye.
LLVM 7/8 packages are no longer available.
Also adapt expected test results to Xvfb now exposing multi-samle
GLXFBConfigs.
v2:
* Install clang instead of clang-11.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3124
Reviewed-by: Eric Anholt <eric@anholt.net> # v1
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9833>
Avoids warning with GCC 10:
../src/intel/blorp/blorp_blit.c: In function 'blorp_nir_combine_samples':
../src/intel/blorp/blorp_blit.c:702:25: error: 'texture_data[0]' may be used uninitialized in this function [-Werror=maybe-uninitialized]
702 | texture_data[0] = nir_fmul(b, texture_data[0],
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
703 | nir_imm_float(b, 1.0 / tex_samples));
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9833>
Avoids warning with newer GCC:
../src/gallium/drivers/r600/sb/sb_sched.cpp: In member function 'void r600_sb::literal_tracker::reset()':
../src/gallium/drivers/r600/sb/sb_sched.cpp:1953:26: error: 'void* memset(void*, int, size_t)' clearing an object of non-trivial type 'struct r600_sb::literal'; use assignment or value-initialization instead [-Werror=class-memaccess]
1953 | memset(lt, 0, sizeof(lt));
| ^
In file included from ../src/gallium/drivers/r600/sb/sb_sched.cpp:35:
../src/gallium/drivers/r600/sb/sb_bc.h:409:8: note: 'struct r600_sb::literal' declared here
409 | struct literal {
| ^~~~~~~
[ Michel Dänzer:
* Expanded commit log
v2:
* Clear all 4 members of lt[4] (Eric Anholt)
]
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9833>
No fossil-db changes, probably because all fp16 shaders have at least one
16-bit mov or vec2 somehwere.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10227>
VK_KHR_spirv_1_4 is trivial because vtn already supports all the added
SPIR-V features that aren't gated behind Vulkan extensions. I've
observed some robustness2 CTS tests requiring this. However there are
a few tests currently failing due to lacking spilling.
VK_EXT_scalar_block_layout should also be trivial, since support for
"straddling" UBO loads was added recently for other reasons. This is
used by every robustness2 CTS test.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8695>
64 was a temporary and conservative "big enough" value, but we can do
better.
Note that as mentioned on the FIXME, we could be even more detailed,
adding a descriptor map allocate method based on the descriptor
type. That would mean more individual allocations, and slightly more
complexity.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10207>
We had some cases were we have defined a value on v3dv_limits but
using other when setting it at GetPhysicalDeviceProperties (like
dynamic storage buffers).
Also we do a cleanup. So far we were adding on v3dv_limits only the
limits that were used on more that one place. But then we had the
definition of several limits on different places. It is clearer to
have a common place for those, even if it is used on just one place.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10207>
There were two problems here:
* We were multiplying by 6, when for graphics pipelines, we only
support 2.
* Right now we are tracking descriptors through the descriptor
maps, and we have one per pipeline. So in practice there is no
difference between per-stage and per-pipeline limits. So far this
was not a problem, we could revisit in the future.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10207>
Layered VRS attachments is for later.
The CTS failures are similar to the existing ones, I will investigate
soon.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10187>