Cached shaders require relinking, so hardcoding the pointer
can't work. This switches out the printf code to use new
proper API.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
When using cached shaders we have to relink the shader with
external symbols when it's loaded. However the way gallivm does
function calls now hardcodes the function pointer into the shader.
LLVM had a mechanism for doing this properly using global mappings,
this switches the coroutine alloc/free code to use a global mapping.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
When we use an object cache for the MCJIT we can have identical
cache entries from the same shader variant in different shaders,
but the JIT objcache uses the function name to relink things,
so it has to be consistent. Just drop the variants from the
function names.
Note the modules still have the variant info.
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5049>
Need to list arm_test-base here as well, or jobs using this
template may spuriously run if the arm_test-base job fails or
is cancelled.
Suggested-by: Michel Dänzer <mdaenzer@redhat.com>
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5405>
This reduces overhead when resizing windows or when allocating
similar image sizes over and over again.
v2: optimize the memory footprint of the cache
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
The retile map is not used in this case, and the retile map computation
takes 39% of CPU time when resizing a window.
This brings it down to 23%.
The dcc_retile_use_uint16 setting has to be derived from DCC sizes.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>
16-bit abs/neg creates v_xor_b32/v_and_b32 with v2b definitions. These
instructions never do partial writes without SDWA.
No shader-db changes.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5040>
This uses the new nir_intrinsic_store_combined_output_pan intrinsic,
which can write depth, stencil and color in a single instruction. If
there are no color writes, the "depth RT" is written to.
Fixes the dEQP GLES3 depth write tests, as well as the piglit tests
fragdepth_gles2, glsl-1.10-fragdepth and when modified to not rely
on depth/stencil reload, glsl-fs-shader-stencil-export.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
We schedule depth writeout to smul and stencil to vlut, so scheduling
to smul has to be disabled in these cases.
When only writing stencil, scheduling to smul is still disabled to
prevent stencil writeout from being scheduled there.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
Depth and stencil writes are combined with color writes, so we need
this intrinsic which has sources for color, RT, depth and stencil.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
By setting the swizzle for the fragment color, and setting qmask to ~0
for branches, the special case for writeout branches can be removed
from mir_bytemask_of_read_components_index.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
We need to be able to do color writeout at the same time as depth
writeout. The old code can't do that, so needs to be removed.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
It is broken for when there are also color writes, and will be
replaced with a new lowering which takes that into account.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
There will need to be sources for depth and stencil writeout, so
something has to be moved to the dest of the writeout branch.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5065>
This is needed otherwise it can cause bad rendering of UYVY files.
The align(..., 256 / surf->bpe) constraint comes from addrlib.
Fixes: 69aadc4933 ("radeonsi: fix surf_pitch for subsampled surface")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5314>
If gsprim_lds_size is larger than target_lds_size then gfx10_ngg_calculate_subgroup_info
will fail.
This commit adds a logic to try the multi-cycling in this case because it's
using less memory.
This fix glsl-1.50-gs-max-output when using NGG.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5401>
In case shader contains two equal macro defines, first one with trailing spaces
and the second one without.
`#define A 1 `
`#define A 1`
The parser crashes
Fixes: 0346ad3774 ("glsl: ignore trailing whitespace when define redefined")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5312>
In d11e4738a8, we started using a start_offset to allow us to
allocate pools where the base address isn't at the start of the pool.
This is useful for binding table pools which want to be relative to
surface state base address (more or less), among other things. However,
we had a bug where, if you have a negative offset, everything returned
to the pool would end up being returned to the "back" of the pool. This
isn't what we want for binding tables in the softpin world. This was
causing us to never actually re-use any binding table blocks. How this
passed CTS, I have no idea.
Closes: #3100
Fixes: d11e4738a8 "anv/allocator: Add a start_offset to anv_state_pool"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5395>
Even through the resouce requested has a BIND_SCANOUT or related tag,
this does not mean that we have a render-only driver.
This can trivially happen as one requests such resource from GBM, while
using the panfrost fd (and hence panfrost_dri.so)
Forward port of !3000
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Robert Foss <robert.foss@collabora.com>
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Closes: #2664
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5410>
I think this case was just missing.
This fixes a bunch of 16-bit storage related CTS failures like
dEQP-VK.ssbo.phys.layout.single_basic_type.std430.u16vec4.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>
Use v_bfe to implement small bitsize conversions because the
compiler probably optimizes this better.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5226>