nir->info.has_transform_feedback_varyings is set for all stages in the
pipeline when xfb is present, so it can't be used for this
harmless, but this is more correct
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17404>
this splits all the members of a struct into separate variables to
improve xfb inlining and reduce the number of locations consumed by
xfb outputs, reducing the chances of running out of shader outputs
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17404>
get_slot_components() returns the total number of output components
for arrays during the initial evaluation phase, but during the packed->inlined
conversion the arrayed size must be normalized to the slot's component count
in order to effectively catch and inline the array
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17404>
glsl_get_explicit_size can return sizes that are not 16-byte aligned.
Therefore, make sure the size is rounded up so that OOB access does not happen.
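
A minimal sketch of the rounding described here, assuming the goal is to pad
an explicit size to a whole number of 16-byte (vec4) units. Both helper names
are hypothetical and are not the functions used in the driver:

```c
#include <assert.h>
#include <stdint.h>

/* Round `size` up to the next multiple of `alignment`.
 * `alignment` must be a non-zero power of two. */
static inline uint32_t
align_up_pot(uint32_t size, uint32_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (size + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: pad an explicit type size (in bytes) to 16 bytes
 * so that a vec4-granular access never reads past the end of the bo. */
static inline uint32_t
bo_size_padded_to_vec4(uint32_t explicit_size)
{
   return align_up_pot(explicit_size, 16);
}
```

With this, a 12-byte block is sized as 16 bytes and a 17-byte block as 32.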
Fixes: ea8a0654f5 ("zink: further improve bo sizing")
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17460>
this eliminates (some) out-of-bounds bo access and will ensure that
bo sizing is always accurate by breaking all the cases where it isn't
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17239>
using the attribute slot size isn't sufficient in this case, as the layout
rules may have additional effects upon sizing
instead, just use the explicit size
cc: mesa-stable
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17239>
I had this in at one point to fix something, but now it somehow just
breaks fbfetch instead of fixing anything
cc: mesa-stable
fixes:
dEQP-GLES31.functional.blend_equation_advanced*
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17254>
this avoids the scenario where the full bo size isn't accounted for because
no variable for the block has been created
cc: mesa-stable
affects:
KHR-GL33.shaders.uniform_block.random.all_per_block_buffers.3
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17217>
deleting the generated shader on the first loop iteration like this was
broken if the shader was used in multiple programs, so delete at the end
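
A rough sketch of the ordering issue, assuming a generated shader that may be
referenced by several programs. All struct and function names below are
hypothetical and only illustrate "detach everywhere first, destroy last":

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for a generated shader and the programs using it. */
struct gen_shader { bool destroyed; };
struct program { struct gen_shader *generated; };

/* Walk every program that may reference `gs`, drop the reference, and only
 * destroy `gs` after the loop. Destroying it inside the first iteration
 * would leave later programs pointing at a dead shader. */
static void
detach_then_destroy(struct program **progs, size_t num_progs,
                    struct gen_shader *gs)
{
   for (size_t i = 0; i < num_progs; i++) {
      assert(!gs->destroyed); /* would fire if we deleted too early */
      if (progs[i]->generated == gs)
         progs[i]->generated = NULL;
   }
   gs->destroyed = true; /* stands in for the real destroy call */
}
```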
cc: mesa-stable
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17010>
usually inlining is optimal for cpu drivers since the majority of
time is spent in the shaders, and any reduction in shader code
is a win
if, however, the shaders are still really big after inlining, this improvement
will be negated by the insane amount of time spent doing stupid llvm optimizer
passes, so check post-inline size to see whether it exceeds a size threshold
lavapipe release build - 1700% improvement
* spec@arb_tessellation_shader@execution@variable-indexing@tcs-output-array-vec4-index-rd-after-barrier
before: 142.15s user 0.42s system 99% cpu 2:23.14 total
after: 8.60s user 0.07s system 99% cpu 8.677 total
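
The check described above could look roughly like the sketch below. The
function name, the use of an instruction count as the size metric, and the
idea of passing the threshold as a parameter are all assumptions made for
illustration; the commit message does not state the real metric or limit:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Decide whether to keep the inlined variant of a shader.
 * `post_inline_size` is measured after inlining has already run;
 * `threshold` is a caller-chosen limit (hypothetical, not from the source). */
static bool
keep_inlined_variant(size_t post_inline_size, size_t threshold)
{
   assert(threshold > 0);
   /* If the shader is still huge after inlining, compile time dominates
    * and the inlined variant is not worth it. */
   return post_inline_size <= threshold;
}
```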
fixes #6647
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16977>
this was correct for 64bit loads and manually converted 32bit loads (e.g., bindless),
but it was broken for the case where 64bit was not supported, as the offset wasn't
being correctly adjusted
break out the offset division to hopefully make this a little clearer
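
As an illustration of why the offset has to be adjusted, here is a plain C
sketch of reading one 64-bit value as two 32-bit words. It is not the NIR
pass itself; the function name is hypothetical and it assumes the low word is
stored first:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 64-bit value from an array of 32-bit words.
 * The byte offset is first divided down to a 32-bit word index; skipping
 * that division (or dividing by 8 instead of 4) addresses the wrong words. */
static uint64_t
load64_as_2x32(const uint32_t *words, size_t num_words, uint32_t byte_offset)
{
   assert(byte_offset % 4 == 0);
   uint32_t idx = byte_offset / 4; /* offset in 32-bit units */
   assert((size_t)idx + 1 < num_words);
   return (uint64_t)words[idx] | ((uint64_t)words[idx + 1] << 32);
}
```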
Fixes: 150d6ee97e ("zink: move all 64-32bit shader load rewriting to nir pass")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16669>
all of ntv requires scalarized io since the offsets are now array indices
instead of byte offsets, so enforce scalarization here to avoid breaking
the universe
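
A small sketch of what "scalarized" means once offsets are array indices:
each component of a vector load becomes its own single-element load at
consecutive indices. The function below is a hypothetical plain C model, not
ntv code:

```c
#include <assert.h>
#include <stdint.h>

/* Model a vector load as `num_components` scalar loads.
 * `first_index` is an array index (not a byte offset), so component c
 * simply lives at first_index + c. */
static void
load_vec_scalarized(const uint32_t *array, uint32_t first_index,
                    unsigned num_components, uint32_t *dest)
{
   assert(num_components >= 1 && num_components <= 4);
   for (unsigned c = 0; c < num_components; c++)
      dest[c] = array[first_index + c];
}
```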
Fixes: 150d6ee97e ("zink: move all 64-32bit shader load rewriting to nir pass")
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16669>
zink can't use lower_io, so this all has to be done manually and in
excruciating depth and detail
fixes (tu):
KHR-Single-GL46.arrays_of_arrays_gl.InteractionFunctionCalls2
KHR-GL46.gpu_shader_fp64.fp64.named_uniform_blocks
KHR-GL46.gpu_shader_fp64.fp64.varyings
KHR-GL46.vertex_attrib_binding.advanced-bindingUpdate
KHR-Single-GL46.enhanced_layouts.varying_array_components
KHR-Single-GL46.enhanced_layouts.varying_array_locations
KHR-Single-GL46.enhanced_layouts.varying_components
KHR-Single-GL46.enhanced_layouts.varying_locations
KHR-Single-GL46.enhanced_layouts.xfb_explicit_location
dEQP-GLES3.functional.transform_feedback.basic_types.interleaved.lines.highp_mat3x4
dEQP-GLES3.functional.transform_feedback.basic_types.interleaved.lines.mediump_vec3
dEQP-GLES3.functional.transform_feedback.basic_types.interleaved.points.mediump_vec4
dEQP-GLES3.functional.transform_feedback.basic_types.separate.points.highp_vec3
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16669>
VVL is great, but there are cases where it doesn't catch critical
spirv errors, so add in our own validation pass to make sure things are
okay
this is especially useful for running on nvidia, as their compiler will
either crash on or silently drop illegal instructions
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16462>
this compacts all buffers in the shader into an array that can be
used in a single descriptor, thus handling the case of indirect indexing
while also turning constant indexing into indirect (with const offsets)
since there's no sane way to distinguish
a "proper" implementation of this would be to skip gl_nir_lower_buffers
and nir_lower_explicit_io altogether and retain the derefs, but that would
require a ton of legwork of other nir passes which only operate on the
explicit io intrinsics
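
To illustrate the compaction, the sketch below models the buffers as one
array and routes every access through an index, so constant and indirect
indexing take the same path. The struct and function are hypothetical and
only model the idea in plain C:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of the compacted buffer array (hypothetical model). */
struct buffer_view {
   const uint32_t *data;
   size_t num_words;
};

/* Every load goes through the array, whether `buffer_index` was a
 * compile-time constant or a runtime value in the original shader. */
static uint32_t
load_from_buffer_array(const struct buffer_view *buffers, size_t num_buffers,
                       size_t buffer_index, size_t word_index)
{
   assert(buffer_index < num_buffers);
   assert(word_index < buffers[buffer_index].num_words);
   return buffers[buffer_index].data[word_index];
}
```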
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15906>
nvidia can't do this, but also nothing uses it, so I've gone ahead and
done the bare minimum here to make cts pass
I think the work to do the shader rewrites should be easy, but without a test
case, I see no point in spending the time for it
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16100>
This controls the whole lowering of "make tex ops with implicit
derivatives on non-implicit-derivative stages be tex ops with an explicit
lod of 0 instead", but it's really hard to describe that in a git commit
summary.
All existing callers get it added except:
- nir_to_tgsi which didn't want it.
- nouveau, which didn't want it (fixes regressions in shadowcube and
shadow2darray with NIR, since the shading languages don't expose txl of
those sampler types and thus it's not supported in HW)
- optional lowering passes in mesa/st (lower_rect, YUV lowering, etc)
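
A simplified model of the lowering described above, written as standalone C
rather than against the real NIR data structures. The enums, struct, and
function names are hypothetical, and treating the fragment stage as the only
stage with implicit derivatives is a simplification:

```c
#include <assert.h>
#include <stdbool.h>

enum stage { STAGE_VERTEX, STAGE_FRAGMENT, STAGE_COMPUTE };
enum tex_kind { TEX_IMPLICIT_LOD, TEX_EXPLICIT_LOD };

struct tex_instr {
   enum tex_kind kind;
   float lod; /* only meaningful for TEX_EXPLICIT_LOD */
};

/* Simplifying assumption: only fragment shaders have implicit derivatives. */
static bool
stage_has_implicit_derivatives(enum stage s)
{
   return s == STAGE_FRAGMENT;
}

/* Rewrite an implicit-derivative tex op into an explicit lod of 0 when the
 * stage cannot compute derivatives. Returns true if the op was changed. */
static bool
lower_implicit_lod_to_zero(struct tex_instr *tex, enum stage s)
{
   if (tex->kind != TEX_IMPLICIT_LOD || stage_has_implicit_derivatives(s))
      return false;
   tex->kind = TEX_EXPLICIT_LOD;
   tex->lod = 0.0f;
   return true;
}
```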
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16156>