avoids errors seen when building on OpenBSD/amd64
../src/amd/compiler/aco_instruction_selection.cpp:1677:62: error: ambiguous conversion for functional-style cast from 'unsigned long' to 'aco::Operand'
bld.vop3(aco_opcode::v_mul_f64, Definition(dst), Operand(0x3FF0000000000000lu), tmp);
^~~~~~~~~~~~~~~~~~~~~~~~~~~
glibc uses unsigned long for uint64_t on LP64 archs and unsigned long long for
uint64_t on ILP32 archs. On OpenBSD unsigned long long is used for uint64_t
on all archs.
The Operand constructors take uint8_t, uint16_t, uint32_t and uint64_t, so an
unsigned long literal has no exact match where uint64_t is unsigned long long.
Use UINT64_C so that the lu or llu suffix is chosen as needed.
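A minimal sketch of why UINT64_C resolves the ambiguity. The Operand struct
below is a hypothetical stand-in with the same four constructor widths, not the
real aco::Operand, and bits_of_one_as_double is an illustrative name.

```cpp
#include <cstdint>

// Hypothetical stand-in for aco::Operand: only the overload set matters here.
struct Operand {
    int bits;  // width of the constructor that overload resolution selected
    explicit Operand(uint8_t)  : bits(8)  {}
    explicit Operand(uint16_t) : bits(16) {}
    explicit Operand(uint32_t) : bits(32) {}
    explicit Operand(uint64_t) : bits(64) {}
};

// UINT64_C appends the suffix that matches this platform's 64-bit type, so the
// argument is an exact match for the uint64_t constructor. A hard-coded "lu"
// literal is only an exact match where uint64_t is unsigned long; elsewhere all
// four constructors need an integral conversion and the call is ambiguous.
inline int bits_of_one_as_double() {
    return Operand(UINT64_C(0x3FF0000000000000)).bits;
}
```

With the macro, the same source line should select the 64-bit constructor
whether uint64_t is unsigned long or unsigned long long.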
Fixes: df645fa369 ("aco: implement VK_KHR_shader_float_controls")
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7944>
The fp64 emulation is not prepared for vectorized 64-bit code, so
if the driver doesn't ask for lowering to scalar by itself, do it before
lowering to soft-fp64, and run a vectorization pass afterwards.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7961>
The feature was exposed but completely ignored by the driver. Other
AMD drivers don't expose it either, probably because it's complicated
to implement alpha-to-coverage properly. Let's disable it.
Cc: mesa-stable.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7966>
Merging primitives generates incorrect gl_PrimitiveID[In] values.
So make merged primitive construction non-destructive and fall back
to drawing with the original primitives if a program reads gl_PrimitiveID.
This commit adds _mesa_update_primitive_id_is_unused modeled after
_mesa_update_allow_draw_out_of_order to update ctx->_PrimitiveIDIsUnused
each time shaders are updated.
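A rough sketch of the fallback described above, assuming both primitive lists
are kept side by side. Prim, DisplayListNode and prims_to_draw are hypothetical
names for illustration, not the actual Mesa structures.

```cpp
#include <vector>

// Hypothetical primitive record: draw mode plus a range in the index buffer.
struct Prim {
    unsigned mode;
    unsigned start;
    unsigned count;
};

// Non-destructive storage: merging builds a second list instead of
// overwriting the primitives as they were originally recorded.
struct DisplayListNode {
    std::vector<Prim> original_prims;
    std::vector<Prim> merged_prims;
};

// Merged primitives are only safe when no bound program reads the primitive
// ID; the flag plays the role of ctx->_PrimitiveIDIsUnused.
inline const std::vector<Prim>& prims_to_draw(const DisplayListNode& node,
                                              bool primitive_id_is_unused) {
    return primitive_id_is_unused ? node.merged_prims : node.original_prims;
}
```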
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7078>
For (Multi)DrawArrays and (Multi)DrawElements commands, the storage size
needed is known early, so we can make sure that the prim_store/vertex_store
will be big enough to store the whole command.
This reduces the number of draw calls in snx03 tests. For instance in test10:
      | Num draw calls |    GPU-load     |
------|----------------|-----------------|
      | Before | After | Before  | After |
------|--------|-------|---------|-------|
test10|  35k   |  8k   |   58%   |  80%  |
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7078>
- xxhash is faster than sha1.
- remove superfluous calls to strlen
Using SPECviewperf13 snx-03 first subtest and "perf -e cycles -g", perf report says:
 Before  | After  | Function
---------|--------|--------------------------------
 47.39%  | 47.36% | _mesa_CallList
  5.00%  |  3.03% | _mesa_program_resource_location
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7078>
Merge consecutive primitives using the same mode while constructing the index buffer.
This improves the performance of the SPECviewperf13 snx-03 test a lot (3x to 10x)
by reducing the number of draw calls per frame.
Here are some numbers for 4 of the tests:
      | Num draw calls |    GPU-load     |
------|----------------|-----------------|
      | Before | After | Before  | After |
------|--------|-------|---------|-------|
test1 |  390k  |  16k  |   68%   |  90%  |
test2 |  370k  |  16k  |   40%   |  90%  |
test3 |  1.2M  |  35k  |   38%   |  78%  |
test10|  3.5M  |  35k  |   36%   |  58%  |
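The merging idea can be sketched roughly as follows. This is a simplified
illustration, not the Mesa code: Mode, Prim and IndexedDraws are hypothetical
names, and the sketch only merges list-type modes, which can be concatenated
without changing what is drawn.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical subset of draw modes.
enum Mode : unsigned { POINTS, LINES, TRIANGLES, LINE_STRIP };

struct Prim {
    Mode mode;
    uint32_t start;  // first index in the shared index buffer
    uint32_t count;  // number of indices
};

// List-type modes draw independent elements, so two adjacent ranges can be
// drawn as one without primitive restart.
inline bool is_list_mode(Mode m) {
    return m == POINTS || m == LINES || m == TRIANGLES;
}

struct IndexedDraws {
    std::vector<uint32_t> indices;
    std::vector<Prim> prims;

    // Append one primitive's indices. Because the index buffer is built
    // contiguously, the previous primitive always ends where this one starts,
    // so a matching mode lets us extend it instead of adding a draw.
    void append(Mode mode, const std::vector<uint32_t>& src) {
        const uint32_t start = static_cast<uint32_t>(indices.size());
        const uint32_t count = static_cast<uint32_t>(src.size());
        indices.insert(indices.end(), src.begin(), src.end());
        if (!prims.empty() && prims.back().mode == mode && is_list_mode(mode)) {
            prims.back().count += count;
            return;
        }
        prims.push_back({mode, start, count});
    }
};
```

Each entry left in prims corresponds to one draw call, so fewer entries means
fewer draws per frame.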
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7078>
Fewer primitive modes allow for better primitive merging.
Lines are always used (instead of dynamically picking between lines and
line strips, for instance) because:
- they don't need primitive restarts to be merged
- they perform better (at least on radeonsi) - SPECviewperf13 snx subtests
with lines (like 4 or 10) are 1.5x-2x faster.
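As an illustration of the conversion, a line strip with N vertices can be
rewritten as N-1 independent line segments. The helper name
line_strip_to_lines is hypothetical and this is only a sketch of the idea.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Expand a line strip's indices into independent line segments: the strip
// v0 v1 v2 becomes the pairs (v0,v1) (v1,v2).
inline std::vector<uint32_t> line_strip_to_lines(const std::vector<uint32_t>& strip) {
    std::vector<uint32_t> lines;
    if (strip.size() < 2)
        return lines;  // fewer than two vertices draw nothing
    lines.reserve(2 * (strip.size() - 1));
    for (std::size_t i = 0; i + 1 < strip.size(); ++i) {
        lines.push_back(strip[i]);
        lines.push_back(strip[i + 1]);
    }
    return lines;
}
```

The output uses more indices than the strip, but consecutive results all share
the same list-type mode and can be concatenated without primitive restart.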
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7078>
external/mesa3d/src/mesa/math/m_matrix.c:1403:13: error: address of array 'mat->inv' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion]
if (mat->inv && (mat->flags & MAT_DIRTY_INVERSE)) {
~~~~~^~~ ~~
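The warning fires because an embedded array member always has a non-null
address, so testing it is redundant. A minimal sketch of the condition without
that test, assuming inv is now an array inside the struct; Matrix,
needs_inverse_update and the flag value are placeholders, not the Mesa
definitions.

```cpp
#include <cstdint>

// Placeholder flag value for illustration only.
constexpr uint32_t MAT_DIRTY_INVERSE = 0x400;

// Hypothetical matrix with the inverse stored inline rather than malloc'ed.
struct Matrix {
    float inv[16];   // embedded array: its address can never be null
    uint32_t flags;
};

// Only the dirty flag carries information; a check on mat->inv would always
// be true and is what -Wpointer-bool-conversion complains about.
inline bool needs_inverse_update(const Matrix* mat) {
    return (mat->flags & MAT_DIRTY_INVERSE) != 0;
}
```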
Fixes: 3175b63a0d ("mesa: don't allocate matrices with malloc")
Reviewed-by: Rob Clark <robclark@freedesktop.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7861>
PIPE_MAX_CONSTANT_BUFFERS is 32, but many Vulkan implementations
report a maxPerStageDescriptorUniformBuffers that exceeds it, for example:
radv 8388606
anv 64
nvidia 1048580 (RTX 2000 and up)
Together with the current zink logic, the returned value then
exceeds the maximum allowed value for the cap.
This causes cso_destroy_context to pass big values back to zink
(via zink_set_constant_buffer), resulting in access beyond end of
allocated buffer for all UBOs.
Cap the cap to PIPE_MAX_CONSTANT_BUFFERS (32), not INT_MAX.
Add an assert to verify future drivers.
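A small sketch of the clamp and the accompanying assert. The function names
max_const_buffers_cap and checked_cap are hypothetical; only the limit of 32
and the example device values come from the text above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

constexpr uint32_t PIPE_MAX_CONSTANT_BUFFERS = 32;

// Clamp the Vulkan per-stage uniform buffer limit to what gallium can track.
inline int max_const_buffers_cap(uint32_t max_per_stage_uniform_buffers) {
    return static_cast<int>(
        std::min(max_per_stage_uniform_buffers, PIPE_MAX_CONSTANT_BUFFERS));
}

// Consumer-side check: a driver reporting more than the limit is a bug.
inline int checked_cap(int reported) {
    assert(reported <= static_cast<int>(PIPE_MAX_CONSTANT_BUFFERS));
    return reported;
}
```

With this clamp, the radv, anv and nvidia values listed above would all be
reported as 32, while a device limit below 32 passes through unchanged.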
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Fixes: daaf5f1d18 ("gallium: Fix leak of currently bound UBOs at CSO context destruction.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7976>
This looks like a typo. Packed vulkan formats should always map to the
inverse order of the corresponding gallium notation. Besides, it makes
no sense that unsigned and signed formats have different ordering.
Fixes: cdfb1d925f ("zink: add last few format maps for ARB_vertex_type_2_10_10_10_rev")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7964>
Fixes the VAAPI postproc issue mentioned in this comment
(https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6736#note_626808)
without changing the height of the underlying resource when doing the
blit.
This commit removes the 0.5 pixel center offset from the compute blit - VAAPI postproc is the only function that uses this compute blit.
Fixes: 49465babdb ("frontends/va/postproc: Use the actual image height when blitting")
Signed-off-by: Thong Thai <thong.thai@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7970>
In particular, if we have an index or bindless handle we were passing
the original handle which, technically, is uniform within the context of
the if. However, we can save the back-end compiler some effort if we
pass it the result of the read_first_invocation().
(Rebased by Kenneth Graunke and Rhys Perry.)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7592>
In theory, I don't think this is a functional change. We should
generate the same code before and after.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7592>
There's no good reason why drivers that don't grok geometry,
tessellation or compute shaders need to deal with them.
This fixes a crash on a lot of Piglit tests for Zink.
Fixes: daaf5f1d18 ("gallium: Fix leak of currently bound UBOs at CSO context destruction.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7971>
These failures are really weird but MSAA2x and MSAA4x work fine.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7850>
Completes freedreno gen rules migration to python3 as per meson.build
With this change all freedreno gen rules use $(MESA_PYTHON3)
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7942>
Fixes the following build error:
FAILED: ninja: 'external/mesa/src/gallium/drivers/freedreno/freedreno_log.c',
needed by 'out/target/product/x86_64/obj_x86/STATIC_LIBRARIES/libmesa_pipe_freedreno_intermediates/freedreno_log.o',
missing and no known rule to make it
Fixes: 03e7c93b82 ("freedreno: Remove fd_log()")
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7942>
Changelog:
- add freedreno_tracepoints.{c,h} gen rules for Android using $(MESA_PYTHON3)
- update Makefile.sources with the required generated sources
Fixes the following build errors:
external/mesa/src/gallium/drivers/freedreno/freedreno_gmem.c:35:10:
fatal error: 'u_tracepoints.h' file not found
^~~~~~~~~~~~~~~~~
1 error generated.
FAILED: out/target/product/x86_64/obj/SHARED_LIBRARIES/gallium_dri_intermediates/LINKED/gallium_dri.so
...
ld.lld: error: undefined symbol: __trace_end_clear_restore
>>> referenced by freedreno_tracepoints.h:38 (out/target/product/x86_64/obj/STATIC_LIBRARIES/libmesa_pipe_freedreno_intermediates/freedreno_tracepoints.h:38)
...
ld.lld: error: undefined symbol: __trace_start_vsc_overflow_test
>>> referenced by freedreno_tracepoints.h:272 (out/target/product/x86_64/obj/STATIC_LIBRARIES/libmesa_pipe_freedreno_intermediates/freedreno_tracepoints.h:272)
ld.lld: error: too many errors emitted, stopping now
Fixes: a02dcb970f ("freedreno: Add GPU tracepoints")
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7942>
Android rules to build u_trace sources and u_tracepoints generated sources
Changelog:
- add util/u_tracepoints.{c,h} gen rules for Android using $(MESA_PYTHON3)
- update Makefile.sources with the required sources and generated sources
Fixes: 3471af9c6c ("gallium/aux: Add GPU tracepoint mechanism")
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7942>
This is in preparation for additional generated sources rules for Android
which will require ad hoc rules, so it is necessary to replace old ones
NOTE: pre-existing gen rules based on the $(transform-generated-source) macro
are obsolete, and their use of the '%' pattern rule is incompatible with ad hoc
python commands for different targets
Changelog:
- remove util/u_format_srgb.c target
- replace obsolete indices/{u_indices,unfilled}_gen.c 'common' gen rules
with 'per target' gen rules using $(MESA_PYTHON3) as per meson gen rules
Fixes: 3471af9c6c ("gallium/aux: Add GPU tracepoint mechanism")
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7942>
During NIR linking, constant varyings might be moved to the next
stage and the sample qualifier removed.
shader_info::uses_sample_shading remembers if the sample qualifier
was used before optimizations.
No fossils-db changes on Sienna Cichlid.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7892>
Lavapipe exposes support for the logicOp feature, but doesn't actually
respect the state. This is easy to fix, so let's plumb it through.
This fixes spec@!opengl 1.0@gl-1.0-logicop when running with Zink on
Lavapipe.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7932>
MoltenVK does not export the vkGetPhysical*2() functions, even in Vulkan 1.2.154.0 where the instance version moves from 1.0 to 1.1.
If the extension is present and in use, the KHR versions of the functions can be used instead.
According to the spec, the vkGetPhysicalDevice*2() functions should be available from Vk 1.1 loaders and devices, which implies MoltenVK might be misbehaving.
This change allows the extension to be used, if present, before the Vk 1.1 version check.
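The ordering of the two checks can be sketched as below, without depending on
the Vulkan headers. Props2Entry, make_version and pick_props2_entry are
hypothetical names; make_version mirrors the bit layout of VK_MAKE_VERSION, and
the extension is presumably VK_KHR_get_physical_device_properties2.

```cpp
#include <cstdint>

// Which flavour of the vkGetPhysicalDevice*2 entry points to load.
enum class Props2Entry { None, KHR, Core };

// Same packing as VK_MAKE_VERSION(major, minor, patch).
constexpr uint32_t make_version(uint32_t major, uint32_t minor, uint32_t patch) {
    return (major << 22) | (minor << 12) | patch;
}

// Prefer the KHR entry points whenever the extension is available, and only
// then fall back to the core 1.1 names. This order avoids relying on core
// symbols that a loader may fail to export despite reporting version 1.1.
inline Props2Entry pick_props2_entry(bool has_khr_extension,
                                     uint32_t instance_version) {
    if (has_khr_extension)
        return Props2Entry::KHR;
    if (instance_version >= make_version(1, 1, 0))
        return Props2Entry::Core;
    return Props2Entry::None;
}
```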
Fixes: 752f6d80 ("zink: setup version dependent VkPhysicalDeviceVulkan*Features and VkPhysicalDeviceVulkan*Properties.")
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7960>
If a subpass uses multiview but the fragment shader doesn't load it
we still have to export it.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7815>
In theory, GFX10.3 is not considered to be a conformant Vulkan
implementation because we didn't submit a conformance submission
package.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7913>