If the pipeline struct isn't found, the state might still be dynamic.
Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10193>
When formatting the error here, we're currently casting an
ast_type_qualifier as a string.
But we don't need to use a string here at all, because we know from
context exactly what qualifier we're talking about, because the
if-statements explicitly check for the uniform-qualifier.
So let's just hard-code the format-string to reference the right
qualifier instead of the string-shenanigans. The latter cannot do the
right thing.
Fixes: 2d03f48a65 ("glsl: Add parsing for GLSL uniform blocks.")
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9911>
This arugment is not present int the format-string, so we shouldn't pass
it to _mesa_glsl_error either.
Noticed by Coverity.
Fixes: 02dc74fbd7 ("glsl: parse invocations layout qualifier for ARB_gpu_shader5")
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9911>
Let's just say that this passes dEQP.
If you think the _mesa_float_to_half conversions are costly, you can
enable FP16 uniforms only if the CPU supports F16C, which is fast.
Drivers will control whether this is used, not common code.
ARM will need something that is equivalent to F16C.
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9050>
It is useful for the precisions of varyings to match across shader
stages at link-time to enable precision lowering optimizations, which
would otherwise require costly draw-time fixups.
The goal is to enable `producer->precision == consumer->precision` to be
an invariant drivers may rely on for linked shaders.
v2: keep transform feedback outputs at mediump - mareko
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> (v1)
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9050>
Added:
* a pass that renumbers bases of IO intrinsics
* a pass that converts mediump IO to 16 bits, optionally using the new
packed varying slots
* a pass that sets (forces) mediump in IO intrinsics (for testing)
* a pass that remaps VARYING_SLOT_VAR[0..15]_16BIT to VARYING_SLOT_VAR[0..31]
(if some shader stages don't want packed varyings)
* a pass that folds type conversions around texture opcodes into those
opcodes (e.g. tex(f2f32(coord), ..) is changed into tex accepting f16)
* a pass that changes (legalizes) sampler src and dst types based on specified
hw constraints (e.g. derivatives must be the same type as coordinates)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9050>
This allows mediump inputs and outputs to be trivially lowered into packed
16-bit varyings where 1 slot is occupied by 2 16-bit vec4s, without any
packing instructions in NIR and without any conflicts with 32-bit varyings.
The only thing that is changed is IO semantics in intrinsics to get packed
16-bit varyings.
This simplifies supporting 16-bit types for drivers that have 32-bit slots
everywhere except the fragment shader where they can do 16-bit interpolation
on either the low or high half of each slot.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9050>
This breaks CLOn12's handling of CL CTS test_basic vector_creation for char3 (at least).
Removing this cast causes us to try to load from a deref with no alignment info.
Fixes: 99bb2a4d ("nir/opt_deref: Don't remove casts with alignment information")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10165>
The current handling for SPIR-V memory semantics is very specific to
the wording in the SPIR-V spec, which breaks its handling of OpenCL
(compared to what we had working downstream before merging upstream).
Update/relax the logic here to support CL's barrier(CLK_GLOBAL_MEM_FENCE);
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10165>
MSAA 4x and 8x should only clear the first 2 samples because other samples
are uncompressed. The compute shader only clears that subset of DCC.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
The retile map is removed and replaced by direct DCC address computations
in the retile shader using the new function ac_nir_dcc_addr_from_coord.
The RADV code is disabled.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
The test takes over 2 minutes on a 12C/24T CPU with OpenMP.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
This fixes an addrlib failure on gfx9.
Fixes: b43f40166c "ac/surface: select best swizzle mode for 3D sampler performance"
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
This adds a clear_buffer compute shader that does read-modify-write to
update a subset of bits in HTILE.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
Set the final value in si_texture_create_object, so that other places
don't have to derive it redundantly.
The only thing to remember is that HTILE stencil can be enabled when
stencil is not present, and it can be disabled when stencil is present
due to various workarounds.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
Fast clears are only used for level 0. This enables clearing level 0
of CMASK and DCC on gfx10+ when there are multiple mipmap levels.
vi_dcc_clear_level can also clear any level now.
Mipmapped array textures are still cleared slowly.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
FMASK is usually pretty large. It's better to leave the cache to shaders.
FMASK stores are still cached, but they can be evicted sooner, which is
the same as other color stores. Only DCC, HTILE, and CMASK are cached.
I haven't benchmarked this, but it seems like the right thing to do.
This only affected APUs.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
With a resolution of 1600x1200, I measured FPS increases in:
* glxgears 18.04% +/- 0.65% (n=691)
* Nexuiz 3.58% +/- 0.09% (n=553)
compared to the master branch at commit
3f614c6f7c.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9230>