This reduces the size of fill operations needed to clear CMASK
for layered color textures.
GFX9 unsupported for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This reduces the size of fill operations needed to clear FMASK
for layered color textures.
GFX9 unsupported for now.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This fixes a rendering issue with RoTR/DXVK.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
In case of any enabled VS members from: uses_firstvertex,
uses_baseinstance, uses_drawid, uses_is_indexed_draw
leaks may happens.
Call gen6_upload_push_constants allocates
stage_stat->push_const_bo. It than takes pointer from
push_const_bo to draw_params_bo (in the call
brw_prepare_shader_draw_parameters by brw_upload_data)
and do reference which finally haven't got unreferenced.
Fixes leak:
136 bytes in 1 blocks are definitely lost in loss record 6 of 13
at 0x4C31B25: calloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
by 0xC2B64B7: bo_alloc_internal (brw_bufmgr.c:596)
by 0xC2B6748: brw_bo_alloc (brw_bufmgr.c:672)
by 0xC314BB3: brw_upload_space (intel_upload.c:88)
by 0xC2EBBC5: gen6_upload_push_constants (gen6_constant_state.c:155)
by 0xC9E4FA6: gen9_upload_vs_push_constants (genX_state_upload.c:3300)
by 0xC2E0EDA: check_and_emit_atom (brw_state_upload.c:540)
by 0xC2E0EDA: brw_upload_pipeline_state (brw_state_upload.c:659)
by 0xC2E0FF1: brw_upload_render_state (brw_state_upload.c:681)
by 0xC2C5D2D: brw_draw_single_prim (brw_draw.c:1052)
by 0xC2C62CB: brw_draw_prims (brw_draw.c:1175)
by 0xC488AD1: vbo_exec_vtx_flush (vbo_exec_draw.c:386)
by 0xC485270: vbo_exec_FlushVertices_internal (vbo_exec_api.c:652)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com>
Signed-off-by: Sergii Romantsov <sergii.romantsov@globallogic.com>
st/egl used to support eglCreatePbufferFromClientBuffer, but now that
it's gone, any call to it would segfault.
Let's return a nice error instead.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Nobody ever uses these, so let's just hard code them instead.
If an EGL driver ever comes around that needs them they're trivial to
re-add.
Suggested-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
The driver doesn't use these values and ac_rtld has assertions
expecting the value of 0.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We'll have to extend this at some point, and using a bitfield union in
this way makes it easier to get the right index without excessive
branching.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The initial prototype used a processor-specific symbol type, but
feedback suggests that an approach using processor-specific section
name that encodes the alignment analogous to SHN_COMMON symbols is
preferred.
This patch keeps both variants around for now to reduce problems
with LLVM compatibility as we switch branches around.
This also cleans up the error reporting in this function.
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Incrementing the iteration count was intended to fix an off-by-one error
when the first terminator was superseded by a later terminator. If
there is no first terminator or later terminator, there is no off-by-one
error. Incrementing the loop count creates one. This can be seen in
loops like:
do {
if (something) {
// No breaks or continues here.
}
} while (false);
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Tested-by: Abel Briggs <abelbriggs1@hotmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110953
Fixes: 646621c66d ("glsl: make loop unrolling more like the nir unrolling path")
We were pessimistically uploading all of it in case of indirection,
but we can just bump that when we encounter indirection.
total constlen in shared programs: 2529623 -> 2485933 (-1.73%)
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
ir3_nir_analyze_ubo_ranges() has already told us how much of cb0 we
need to upload (all of it, since it will lower indirect UBO 0 accesses
from load_ubo back to indirection on the constant buffer).
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@gmail.com>
If the NIR-level analysis decided to move UBO loads to the constant
file, but the backend decided not to load those constants, we could
upload past the end of constlen. This is particularly relevant for
pre-a6xx, where we emit a different constlen between bin and render
variants.
(Fix by Rob, commit message by anholt)
Reviewed-by: Eric Anholt <eric@anholt.net>
UBOs and uniforms now use a common code path with an explicit `index`
argument passed, enabling UBO reads.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Prevents an assert(0) later in this (not so edge) case. We still have to
have a dummy there.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We've known about this for a while, but it was never formally in the
machine header files / decoder, so let's add them in.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Now that all the counting is sorted, it's a matter of passing along a
GPU address and going.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We already uploaded UBOs, but only a fixed number (1) for uniforms;
let's upload as many as we compute we need.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We look at the highest set bit in the UBO enable mask to work out the
maximum indexable UBO, i.e. the UBO count as we need to report to the
hardware.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
We refactor panfrost_constant_buffer to mirror v3d's constant buffer
handling, to enable UBOs as well as a single set of uniforms.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
This doesn't handle Y-flipping, but it's good enough to render the stars
in Neverball.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
In preparation for lowering point sprites, track them like we track
alpha testing state.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Those either depend on information filled by the NIR linking steps OR
are restricted by those:
- gl_nir_lower_samplers: depends on UniformStorage being set by the
linker.
- brw_nir_lower_image_load_store: After 6981069fc8 "i965: Ignore
uniform storage for samplers or images, use binding info" we want
this pass to happen after gl_nir_lower_samplers.
- gl_nir_lower_buffers: depends on UniformBlocks and
SharedStorageBlocks being set by the linker.
For the regular GLSL code path, those datastructures are filled
earlier. For NIR linking code path we need to generate the nir_shader
first then process it -- and currently the processing works with all
shaders together. So move the passes out of brw_create_nir into its
own function, called by the brwProgramStringNotify and
brw_link_shader().
This patch prepares ground for ARB_gl_spirv, that will make use of NIR
linker.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The iterator `i` already walks the right amount now that is
incremented by `dmul`, so no need to `* 2`. Fixes invalid memory
access in upcoming ARB_gl_spirv tests.
Failure bisected by Arcady Goldmints-Orlov.
Fixes: b019fe8a5b "glsl/nir: Fix handling of 64-bit values in uniform storage"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
For n_columns == 1, we have a vector which is handled by the else
case. Fixes invalid memory access in upcoming ARB_gl_spirv tests.
Failure bisected by Arcady Goldmints-Orlov.
Fixes: 81e51b412e "nir: Make nir_constant a vector rather than a matrix"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>