DCC/CMASK/HTILE clears will not set this. We could do a better job
at not setting this in other cases too
Image copies also don't set this.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9795>
Only prefetches set it. Unsynchronized clears and copies shouldn't set it
because syncing later wouldn't wait for the writes.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9795>
LLVM prints an error if xnack is unsupported and it uses a global stream
object that is not thread-safe. Since Mesa uses multiple threads to compile
shaders, there is a small chance that it will crash.
Just don't set any xnack options to use LLVM defaults.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4439
Cc: 20.3 21.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9795>
This should be equivalent without needed to force enable FMASK for
some specific internal pipelines.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9940>
This is no longer needed since FMASK is also compressed for
transfer dst operations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9940>
The COMPRESSION bit is FMASK and this is much faster! Should
speedup transfer dst operations with MSAA images considerably.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9990>
Apparently I inverted the sense of this flag back when we didn't have
piglit testing. Fixes terrible rendering in minetest, HL2, CS:Source, and
CS.
Fixes: 0369dd9077 ("freedreno/a6xx: Add ARB_depth_clamp and separate clamp support.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9957>
This copies code from the xml2rst to escape rST strings. Hopefully this
will be more robust than what we've done so far.
I really wish docutils would have utils for this directly, seems kinda
essential.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9917>
This should make it harder to accidentally introduce formatting issues
into the docs.
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9914>
We can build the needed structs during list compilation and then
use DrawGallium (if one draw or a single primitive mode is used) or DrawGalliumComplex
otherwise.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9533>
This fixes the ds clears path to clear only depth or stencil
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Fixes: b38879f8c5 ("vallium: initial import of the vulkan frontend")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9971>
After the previous change, PASS 1 can be trivially pulled out of the
loop.
With PASS 1 removed, the loop can be unrolled, and a lot of code can be
deleted (from the unrolls). This saves a couple lines of code, and it
makes the function a little easier to follow.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9867>
Things that are not dynamically indexed must be added last. This is
necessary so that values that are both statically indexed (or used
directly) and dynamically indexed will only be added once. With the
above change, if the constant 47 is used as a literal in an instruction
and in an array that is dynamically indexed, it will be added to
`Parameters` twice. On (really old) GPUs that store constants and other
parameters in the same storage, this can cause some valid programs to
exceed the storage limits. I don't know about R300 or NV30, but R200
was limited to something like 256 vec4s. This applies to constants,
state parameters, and local parameters (the assembly shader version of
uniforms).
The problem this causes here is that the final parameter layout created
in `_mesa_layout_parameters` may have more parameters than the input
layout. The fundamental assumption of that routine (and documented as
an assumption of `copy_indirect_accessed_array`) is that the input size
and the output size will be the same.
The affected shader had something like below. This is a common pattern
for ARB assembly shaders generated by NVIDIA's cgc compiler. As far as
I can tell, the majory of applications that use ARB assembly shaders
either use cgc or use some sort of DX9 crosscompiler... that generates
similar patterns.
PARAM c[141] = { program.local[0..133],
{ 255, 0.1, 3, 1 },
{ 0.5, 2, 0.15915491, 0.25 },
{ 0, 0.5, 1, -1 },
{ 24.980801, -24.980801, -60.145809, 60.145809 },
{ 85.453789, -85.453789, -64.939346, 64.939346 },
{ 19.73921, -19.73921, -9, 0.75 },
{ -999999 } };
The shader contains instructions like
MUL R0.x, R0, c[135].y;
and
DP4 R2.z, c[A0.x + 6], R1;
Starting with b9bff76b63, the constants at the end of `c` would get
added to `Parameters` twice. The first time they are added due to
instructions that directly access the array (e.g., the `c[135].y`
above). The second time is because they are part of an array that is
dynamically indexed. As a result, the final layout of Parameters
(calculated by `_mesa_layout_parameters`) is 7 elements larger than the
input layout.
Since bcc61a01d4 fixed the allocation size of `ParameterValues`,
`copy_indirect_accessed_array` will now write past the end of the array.
The eventually results in a crash in `free`. Thankfully Valgrind was
able to help find the real source of the problem.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Fixes: b9bff76b63 ("mesa: put constants before state vars for ARB programs")
Closes: #4505
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9867>
Ran into this while trying to rework fbconfig setup, due to a bug I
ended up trying to allocate a PIPE_FORMAT_NONE framebuffer, which failed
like you'd hope, but which we weren't converting into an error in
st_api_make_current. Instead we'd treat it like binding no drawable to
the context, which is really not what was asked for, so let's go ahead
and make this an error.
Reviewed-by: Eric Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9956>
Not sure what this was supposed to do, but whatever it did, it doesn't.
Reviewed-by: Eric Faye-Lund <kusmabite@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9956>
A650 can use the same SSBO descriptor for both 32-bit and 16-bit access,
which makes it easy to enable this extension.
Passes tests that run under:
dEQP-VK.spirv_assembly.instruction.*.16bit_storage.*
Rebased and modified commit from Jonathan Marek.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
When float16 is enabled this will allow to pass a number of
float16 tests.
When A6XX_SP_FLOAT_CNTL_F16_NO_INF is set - all operations which
generate +-infinity generate +-MAX_HALF_FLOAT.
Fixes some tests from:
dEQP-VK.spirv_assembly.instruction.*.float16.*
dEQP-VK.spirv_assembly.instruction.*.float_controls.fp16.*
E.g.:
dEQP-VK.spirv_assembly.instruction.graphics.float16.arithmetic_1.sinh_vert
dEQP-VK.spirv_assembly.instruction.compute.float16.arithmetic_4.length
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log_denorm_flush_to_zero_nostorage
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.log2_denorm_flush_to_zero_nostorage
dEQP-VK.spirv_assembly.instruction.compute.float_controls.fp16.input_args.inv_sqrt_denorm_flush_to_zero_nostorage
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
NIR has shifts defined as:
opcode("*shr", 0, tuint, [0, 0], [tuint, tuint32], False, ...
However, in ir3 we have to ensure that both operators of shift
instruction have the same bitness.
Let's hope that in future the additional COV for constants would
be optimized away.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
This matches the blob and doesn't require actually implementing controls
since the supported modes are just what the HW does.
Passes tests under:
dEQP-VK.spirv_assembly.*.float_controls.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>