LLVM prints an error if xnack is unsupported and it uses a global stream
object that is not thread-safe. Since Mesa uses multiple threads to compile
shaders, there is a small chance that it will crash.
Just don't set any xnack options to use LLVM defaults.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4439
Cc: 20.3 21.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9795>
This should be equivalent without needed to force enable FMASK for
some specific internal pipelines.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9940>
This is no longer needed since FMASK is also compressed for
transfer dst operations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9940>
The COMPRESSION bit is FMASK and this is much faster! Should
speedup transfer dst operations with MSAA images considerably.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9990>
Apparently I inverted the sense of this flag back when we didn't have
piglit testing. Fixes terrible rendering in minetest, HL2, CS:Source, and
CS.
Fixes: 0369dd9077 ("freedreno/a6xx: Add ARB_depth_clamp and separate clamp support.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>
This copies code from the xml2rst to escape rST strings. Hopefully this
will be more robust than what we've done so far.
I really wish docutils would have utils for this directly, seems kinda
essential.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9917>
This should make it harder to accidentally introduce formatting issues
into the docs.
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9914>
We can build the needed structs during list compilation and then
use DrawGallium (if one draw or a single primitive mode is used) or DrawGalliumComplex
otherwise.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9533>
This fixes the ds clears path to clear only depth or stencil
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Fixes: b38879f8c5 ("vallium: initial import of the vulkan frontend")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9971>
After the previous change, PASS 1 can be trivially pulled out of the
loop.
With PASS 1 removed, the loop can be unrolled, and a lot of code can be
deleted (from the unrolls). This saves a couple lines of code, and it
makes the function a little easier to follow.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9867>
Things that are not dynamically indexed must be added last. This is
necessary so that values that are both statically indexed (or used
directly) and dynamically indexed will only be added once. With the
above change, if the constant 47 is used as a literal in an instruction
and in an array that is dynamically indexed, it will be added to
`Parameters` twice. On (really old) GPUs that store constants and other
parameters in the same storage, this can cause some valid programs to
exceed the storage limits. I don't know about R300 or NV30, but R200
was limited to something like 256 vec4s. This applies to constants,
state parameters, and local parameters (the assembly shader version of
uniforms).
The problem this causes here is that the final parameter layout created
in `_mesa_layout_parameters` may have more parameters than the input
layout. The fundamental assumption of that routine (and documented as
an assumption of `copy_indirect_accessed_array`) is that the input size
and the output size will be the same.
The affected shader had something like below. This is a common pattern
for ARB assembly shaders generated by NVIDIA's cgc compiler. As far as
I can tell, the majory of applications that use ARB assembly shaders
either use cgc or use some sort of DX9 crosscompiler... that generates
similar patterns.
PARAM c[141] = { program.local[0..133],
{ 255, 0.1, 3, 1 },
{ 0.5, 2, 0.15915491, 0.25 },
{ 0, 0.5, 1, -1 },
{ 24.980801, -24.980801, -60.145809, 60.145809 },
{ 85.453789, -85.453789, -64.939346, 64.939346 },
{ 19.73921, -19.73921, -9, 0.75 },
{ -999999 } };
The shader contains instructions like
MUL R0.x, R0, c[135].y;
and
DP4 R2.z, c[A0.x + 6], R1;
Starting with b9bff76b63, the constants at the end of `c` would get
added to `Parameters` twice. The first time they are added due to
instructions that directly access the array (e.g., the `c[135].y`
above). The second time is because they are part of an array that is
dynamically indexed. As a result, the final layout of Parameters
(calculated by `_mesa_layout_parameters`) is 7 elements larger than the
input layout.
Since bcc61a01d4 fixed the allocation size of `ParameterValues`,
`copy_indirect_accessed_array` will now write past the end of the array.
The eventually results in a crash in `free`. Thankfully Valgrind was
able to help find the real source of the problem.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes: b9bff76b63 ("mesa: put constants before state vars for ARB programs")
Closes: #4505
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9867>
Ran into this while trying to rework fbconfig setup, due to a bug I
ended up trying to allocate a PIPE_FORMAT_NONE framebuffer, which failed
like you'd hope, but which we weren't converting into an error in
st_api_make_current. Instead we'd treat it like binding no drawable to
the context, which is really not what was asked for, so let's go ahead
and make this an error.
Reviewed-by: Eric Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9956>
Not sure what this was supposed to do, but whatever it did, it doesn't.
Reviewed-by: Eric Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9956>
A650 can use the same SSBO descriptor for both 32-bit and 16-bit access,
which makes it easy to enable this extension.
Passes tests that run under:
dEQP-VK.spirv_assembly.instruction.*.16bit_storage.*
Rebased and modified commit from Jonathan Marek.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
When float16 is enabled this will allow to pass a number of
float16 tests.
When A6XX_SP_FLOAT_CNTL_F16_NO_INF is set - all operations which
generate +-infinity generate +-MAX_HALF_FLOAT.
Fixes some tests from:
dEQP-VK.spirv_assembly.instruction.*.float16.*
dEQP-VK.spirv_assembly.instruction.*.float_controls.fp16.*
E.g.:
dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.sinh_vert
dEQP-VK.spirv_assembly.instruction.compute.float16.arithmetic_4.length
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log_denorm_flush_to_zero_nostorage
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log2_denorm_flush_to_zero_nostorage
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.inv_sqrt_denorm_flush_to_zero_nostorage
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
NIR has shifts defined as:
opcode("*shr", 0, tuint, [0, 0], [tuint, tuint32], False, ...
However, in ir3 we have to ensure that both operators of shift
instruction have the same bitness.
Let's hope that in future the additional COV for constants would
be optimized away.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
This matches the blob and doesn't require actually implementing controls
since the supported modes are just what the HW does.
Passes tests under:
dEQP-VK.spirv_assembly.*.float_controls.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
cat1 instructions round to zero by default.
When fp16 is enabled this will fix:
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage_frag
dEQP-VK.spirv_assembly.instruction.graphics.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage_vert
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.rounding_rte_conv_from_fp32_nostorage
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
Broken out of VK_GOOGLE_display_timing patch
Cc: stable
Co-author: Jakob Bornecrantz <jakob@collabora.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Keith Packard <keithp@keithp.com>
Signed-off-by: Jakob Bornecrantz <jakob@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9939>
Per OpenGL 4.6 spec:
"If no xfb_stride qualifier is specified for a
binding point, the stride is derived by identifying the variable associated with the
binding point having the largest offset, and then adding the offset and the size of
the variable, in basic machine units. If any variable associated with the binding
point contains double-precision floating-point components, the derived stride is
aligned to the next multiple of eight basic machine units. If a binding point has no
xfb_stride qualifier and no associated output variables, its stride is zero."
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
1) Per GL_ARB_enhanced_layouts if explicit location is set for varying,
each struct member, array element and matrix row will take
separate location. With GL_ARB_gpu_shader_fp64/GL_ARB_gpu_shader_int64
they may take two locations.
Examples:
| layout(location=0) dvec3[2] a; | layout(location=4) vec2[4] b; |
| | |
| 32b 32b 32b 32b | 32b 32b 32b 32b |
| 0 X X Y Y | 4 X Y 0 0 |
| 1 Z Z 0 0 | 5 X Y 0 0 |
| 2 X X Y Y | 6 X Y 0 0 |
| 3 Z Z 0 0 | 7 X Y 0 0 |
Previously it wasn't taken into account.
2) Captured double-precision variables should be aligned to
8 bytes per GL_ARB_gpu_shader_fp64:
"If any variable captured in transform feedback has double-precision
components, the practical requirements for defined behavior are:
...
(c) each double-precision variable captured must be aligned to a
multiple of eight bytes relative to the beginning of a vertex."
v2: fix `output_size` calculations
( Andrii Simiklit <andrii.simiklit@globallogic.com> )
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1667
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
When packing varyings when there is only 32bit of space
left in slot 64bit type is attempted to be divided between
current and next slot. However there is neither code for
splitting the 64bit type nor for assembling it back.
Instead we add 32bit padding.
The above happens only in structs because their
members aren't sorted by packing order.
Example of the issue:
struct S {
vec3 a;
double d;
};
out flat S s;
Before, the packing went as:
slot 32b 32b 32b 32b
0 a.x a.y a.z d
1 d 0 0 0
After:
slot 32b 32b 32b 32b
0 a.x a.y a.z 0
1 d d 0 0
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Danylo Piliaiev <danylo.piliaiev@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2333>
The merged image contains kernels & rootfs for both arm64 & armhf
baremetal test jobs, and is smaller than either arm{64,hf}_test image
before.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9955>
Doing so in an x86 container via qemu was slow, and started failing
recently after updating to a newer qemu version.
This also results in smaller arm*_test* docker images, since we need to
install fewer Debian packages in them.
As a bonus, this turns some piglit tests from fail to pass (Or maybe
they'll turn out to be flakes? They've passed at least 3 times in a
row).
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9955>
We use the CI-built kernel+rootfs these days. I haven't bumped image tags
because the files are definitely unused, and I'm rebuilding it all in the
next commit.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9955>
If the window is destroyed on a thread that has a currently-bound
context, use that context for destroying the framebuffer. This
ensures that the winsys can wait for in-flight work before
destroying any resources.
If the window did have a context bound beforehand but it was unbound,
we should've already done a glFinish. If the window is destroyed from
an unrelated thread... well, we're screwed, but that's the best we can do.
Reviewed-By: Bill Kristiansen <billkris@microsoft.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9959>