the full-variable outputs can be skipped, leaving only the varyings which
actually need explicit emission due to packed layouts or whatever
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
if the number of explicit xfb outputs or new varyings added to the existing size
of the slot map would cause an overflow, we have to force a new slot map to
ensure that everything fits
this means iterating all the stages which can produce new varyings and calculating
all the slots required in order to compare against the max size available
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
if an entire variable is being dumped into an xfb buffer, there's no need
to create an explicit xfb variable to copy the value into, and instead
the xfb attributes can just be set normally on the variable
this doesn't work for geometry shaders because outputs are per-vertex
fixes all KHR-GL46.enhanced_layouts xfb tests
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
running nir_lower_io_arrays_to_elements_no_indirects for only some stages
breaks location-setting for the stages which don't run it when
e.g., dmat2x3 variables are sometimes split across locations and
sometimes jammed into a single location (TCS I'm looking at you)
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
if we get some crazy matrix types in here then we need to ensure that
we accurately unwrap them and copy the components
fixes KHR-GL46.enhanced_layouts.xfb_stride
Fixes: 1b130c42b8 ("zink: implement streamout and xfb handling in ntv")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
this was added during review, but it was never correct and just crashes
valid cases like streamout from a mat3x4 type
Fixes: b6f8f3a3ba ("zink: fix streamout for clipdistance")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
This will take effect in future patches when we are able to query the
kernel to set device->vram.size to a non-zero size.
Builds on Sagar's ("anv: Query memory region info") patch, and
re-organizes things as recommended by Lionel (and Jason).
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9324>
Just treat the llc and non-llc paths as separate cases. This will also
help when adding the local memory setup.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9324>
When tyding up this section in 1e5b09f42f ("spirv: Tidy some repeated
if checks by using a switch statement.") the break got lost. It is
not a real problem because the next case just break, but better to
have it explicitly here instead of a FALLTHROUGH.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9440>
In nouveau's PBO path with GS support and no VS layer export, we got:
intrinsic store_output (ssa_1, ssa_0) (0, 15, 0, 160, 128) /* base=0 */ /* wrmask=xyzw */ /* component=0 */ /* src_type=float32 */ /* location=0 slots=1 */ /* out_pos */
[...]
vec3 32 ssa_4 = mov ssa_3.xxx
intrinsic store_output (ssa_4, ssa_0) (0, 4, 0, 160, 128) /* base=0 */ /* wrmask=z */ /* component=0 */ /* src_type=float32 */ /* location=0 slots=1 *//* out_pos */
The mov's SSA value we would decide we could store directly to the output,
since nothing else used it. However, the store has a writemask, and the
ALU op was stomping over it instead of ANDing with the output decl's
existing writemask.
Fixes: f79f382c81 ("nir_to_tgsi: Store directly to TGSI outputs when possible.")
Closes: #4380
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9376>
This replaces the new_src parameter of nir_ssa_def_rewrite_uses_after()
with an SSA def, and rewrites all the users as needed.
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses()
with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites
all the users as needed.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
This is currently an alias for nir_ssa_def_rewrite_uses but we move all
the instances which used it to write a non-SSA source to the newly named
helper.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
While we're here, add __gen_get_batch_address declarations to more files
because we're about to start requiring it on all GFX 12.5+.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>
Add logical shift left and right operations support to mi_builder.
v1:
- Add GEN_GEN > 12 check (Jordan Justen)
- Add gen_mi_has_shift function (Jordan Justen)
- Fix commit title (Jordan Justen)
v2 (Jason Ekstrand):
- Add _imm versions of all of them
- Better handle corner-cases in _imm helpers
- Handle the power-of-two limitation for _imm versions
- Add tests
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>
The container is moved from before and hence returns size 0. To get the
correct value, the new instruction container must be used instead.
This was flagged by clang-tidy. The fixed call still triggers the
corresponding diagnostic, hence this change silences it by adding a
redundant clear() after move.
Fixes: 7f1b537304 ("aco: add new NOP insertion pass for GFX6-9")
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9432>
This pass needs to run on the last shader in a pipeline writing
gl_Position. In GLES2, that's always the vertex shader, but in ES3.2, it
can be a geometry or tessellation shader. The shared code works the same
in this case, just make the assert more generous.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9444>
Now the texture virtual memory usage is less of a problem,
we can use this workaround permanently.
In the spirit of the API it's certainly not the proper way
of implementing DYNAMIC textures (it seems they are ok
to have hidden copies in driver managed memory, but not have
virtual addressing space reduced), but it makes sense for us,
both performance wise, and to avoid bugs.
Signed-off-by: Axel Davy <davyaxel0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377>