Use the stack. (note: we already do for drm_msm_gem_submit_cmd array, and
using calloc() for heap allocations in a VK driver is wrong
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6687>
Make it less error-prone and more consistent with other helpers.
Pass the masks as a single argument rather than two.
In wave64 mode, split the argument into low and high halves in
emit_mbcnt rather than where it is called.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6699>
GLSL Desktop spec 1.30.x:
"New built-ins: trunc(), round(), roundEven(), isnan(), isinf(), modf()"
For ES, 3.00.x is the first ES spec that mentions the builtins.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6455>
It seems that the mali400 pp is unable to load vec3 unaligned varyings.
This can happen in the current state with mesa if a varying float is put
into the first component of a vec4 and a vec3 is packed right after it.
This would be fine as by default nir would create a vec4 load followed
by a mov with swizzle to realign the components into a vec3.
In lima_nir_split_load_input, this becomes a separate vec3 load
expecting the unaligned load.
Since this can't happen, skip the load input splitting for this special
case.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6507>
Avoid having to deal with BO tracking. However, the kernel still requires a
bo list, so keep a global one which can be re-used for every submit.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6686>
This has been shown to help performance on TGL and DG1. This could be
applied to gen9+, but we still need to show if it helps with those
platforms.
Rework:
* Make change in src/intel/vulkan/genX_cmd_buffer.c too. (Ken)
* Keep mask as 3 for gen < 12
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6684>
Anv doesn't do multi-layer fast-clear tracking, but TGL may add
fast-clears to multiple layers. Disable CCS_E for image arrays on TGL+
until anv gets more clear color tracking abilities.
With this change, anv+TGL now passes:
* dEQP-VK.multiview.readback_implicit_clear.15_15_15_15
* dEQP-VK.multiview.readback_implicit_clear.8_1_1_8
* dEQP-VK.multiview.readback_implicit_clear.1_2_4_8_16_32
* dEQP-VK.multiview.renderpass2.readback_implicit_clear.15_15_15_15
* dEQP-VK.multiview.renderpass2.readback_implicit_clear.8_1_1_8
* dEQP-VK.multiview.renderpass2.readback_implicit_clear.1_2_4_8_16_32
v2. Mention HSD 14010672564. (Sagar)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6528>
syncobj wait takes int64_t timeout and won't clamp it
in kernel code, so we have to pass in INT64_MAX instead
of OS_TIMEOUT_INFINITE which is UINT64_MAX. Otherwise
syncobj wait with OS_TIMEOUT_INFINITE case just return
fail.
Fixes: c638301b42 "radeonsi: fix syncobj wait timeout"
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6676>
Pre-V3D 4.3 hardware has a quirk where it expects XY coordinates in
.8 fixed-point format, but then it will internally round it to .6 fixed-point,
introducing a double rounding. The double rounding can cause very slight
differences in triangle raterization coverage that can actually be noticed by
some CTS tests.
The correct fix for this as recommended by Broadcom is to convert to
.8 fixed-point with ffloor().
Fixes:
dEQP-VK.renderpass.suballocation.subpass_dependencies.late_fragment_tests.*
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6677>
This should be helpful if someone chooses to implement cache support on
windows. Also providing this greater level of abstraction makes it easier
to implement alterative cache layouts in future.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6197>
This pulls out the cache item writing code from cache_put() into
a new helper. In this patch we also move various functions called
by this code into the new disk_cache_os.c file.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6197>
This will make windows support easier to add in future. To avoid code
churn this temporarily duplicates the mkdir_if_needed() function, we
will delete the duplicate in a following patch.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6197>
Fix defect reported by Coverity Scan.
Missing break in switch (MISSING_BREAK)
unterminated_case: The case for value
nir_intrinsic_bindless_image_samples is not terminated by a 'break'
statement.
Suggested-by: Karol Herbst <kherbst@redhat.com>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6645>
v2: move *2*mp patterns to the end of late_optimizations
v3: remove ftrunc from the optimizations to fix:
dEQP-GLES3.functional.shaders.builtin_functions.common.modf.vec2_lowp_vertex
Reviewed-by: Rob Clark <robdclark@chromium.org> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6283>
If IO is lowered, NIR doesn't have to contain any IO variables
(and in fact radeonsi removes them and other drivers should too).
This makes the pass work without variables.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6621>
This is a slight generalization of the existing SGI and MESA swap
control extensions, and a prerequisite for GLX_EXT_swap_control_tear.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6671>
It wouldn't work in any case, as the internal API only stores a signed
int, and GLX_EXT_swap_control_tear will overload the meaning of negative
values so we should avoid ambiguity.
If your application needs a swap interval in excess of ~414.25 days, I'm
very sorry.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6671>
I find the __glX naming convention distasteful for a bunch of reasons,
not least that I expect "vi -t glXBindTexImageEXT" to take me someplace
useful. The functions we're referencing here should not be exported from
libGL (hence the static / _X_HIDDEN) but that's no reason not to name
them correctly.
This does have one possible, very minor, correct functional change,
glXGetMscRateOML now returns Bool (unsigned int) instead of GLboolean
(unsigned char); if your psABI really only writes to a single byte of
the return register when the return type is char-like, then we probably
would not have returned false when we meant to. At least for amd64 this
does not seem to be an issue; the old code wrote 0 to %eax, the new code
does a zero-extended load from %al to %eax (since the internal function
still returns GLboolean).
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6671>
They are using a in source tree version of the needed libdrm
functionality or are shipping all needed headers in the source tree.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6672>
This exploits a HW optimization for when only the size of a draw state is
changed, to make things simpler and more optimal (assuming a well behaved
user which doesn't unecessarily call CmdBindVertexBuffers many times)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6665>
Coverity complains about possible out-of-bounds write & read, because
it thinks that "loc + i" can be bigger than sizes of the 2 used arrays.
It's not obvious from the code it cannot happen, so add asserts here.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667>
Found by Coverity, as "argument cannot be negative", referring to
fread's 2nd argument.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6667>