This new mode will be only used for the actual payload variables and
not the number of launched mesh shader workgroups, which will still
be treated as an output.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14930>
On some systems, these macros are already defined, and being defined
slightly differently causes them to emit redefinition-warnings.
Let's wrap them in ifdefs to avoid the warnings.
Acked-by: Caio Oliveira <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15084>
The old calculation for mat3 was clever, but it turns out that a
straightforward application of subdeterminants similar to how mat4 is
handled is more efficient: on a scalar architecture with some sort of
combined multiply+add instruction with a negate modifier (both fairly
common), the new determinant is 9 instructions vs. 15 for the old one,
and without the multiply-add it's 14 instructions vs. 18 for the old
one. When used as a routine for inverse() the savings are compounded,
because we now use the same method as used to compute the adjucate
matrix and so CSE can combine most of the calculations with the adjucate
matrix ones.
Once mat3 and mat4 use the same method for computing determinants, we
can combine them into a single recursive function. I also pulled up the
mat_subdet() function because it was doing basically what we need, so
it's now shared between determinant and inverse. This shrinks the
implementation significantly, as can be seen from the diffstat.
The real reason I want to change this, though, is that it fixes
dEQP-VK.glsl.builtin.precision_fp16_storage16b.inverse.compute.mat3 with
turnip. Qualcomm uses round-to-zero for 16-bit frcp, which combined with
some inaccuracy in the old method of calculating the determinant led us
to fail. Qualcomm's driver uses something like the new method to
calculate the determinant in the inverse. We could argue that Mesa's
method should be allowed, because round-to-zero for floating-point
division is within spec and there are no precision guarantees given for
determinant() or inverse(). However we might as well use the more
efficient method.
Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14652>
The shader can have SpvOpWritePackedPrimitiveIndices4x8NV while the
output variable may not exist. This seems to be a defect in the
NV_mesh_shader SPIR-V spec, let's work around it by creating the
variable on-demand.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15005>
NOTE: This commit needs "nir: All set-on-comparison opcodes can take all
float types" or regressions will occur in other Vulkan SPIR-V tests.
No shader-db changes on any Intel platform.
NOTE: This commit depends on "nir: All set-on-comparison opcodes can
take all float types".
v2: Fix handling 16-bit (and presumably 64-bit) values.
About 280 shaders in Talos are hurt by a few instructions, and a couple
shaders in Doom 2016 are hurt by a few instructions.
Tiger Lake
Instructions in all programs: 159893290 -> 159895026 (+0.0%)
SENDs in all programs: 6936431 -> 6936431 (+0.0%)
Loops in all programs: 38385 -> 38385 (+0.0%)
Cycles in all programs: 7019260087 -> 7019254134 (-0.0%)
Spills in all programs: 101389 -> 101389 (+0.0%)
Fills in all programs: 131532 -> 131532 (+0.0%)
Ice Lake
Instructions in all programs: 143624235 -> 143625691 (+0.0%)
SENDs in all programs: 6980289 -> 6980289 (+0.0%)
Loops in all programs: 38383 -> 38383 (+0.0%)
Cycles in all programs: 8440083238 -> 8440090702 (+0.0%)
Spills in all programs: 102246 -> 102246 (+0.0%)
Fills in all programs: 131908 -> 131908 (+0.0%)
Skylake
Instructions in all programs: 134185495 -> 134186618 (+0.0%)
SENDs in all programs: 6938790 -> 6938790 (+0.0%)
Loops in all programs: 38356 -> 38356 (+0.0%)
Cycles in all programs: 8222366923 -> 8222365826 (-0.0%)
Spills in all programs: 98821 -> 98821 (+0.0%)
Fills in all programs: 125218 -> 125218 (+0.0%)
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Fixes: 1feeee9cf4 ("nir/spirv: Add initial support for GLSL 4.50 builtins")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13999>
This creates an internal shader_prim enum, I've fixed up most
users to use it instead of GL types.
don't store the enum in shader_info as it changes size, and confuses
other things.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
To avoid dragging gl.h into places it has no business being,
defined tessellation primitive mode to an enum.
This has a lot of fallout all over the place.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14605>
GL_LINE_STRIP_ADJACENCY is 0xB. 0xA is GL_LINES_ADJACENCY. While we're
here, drop the ARB prefixes.
Reviewed-by: Caio Oliveira <caio.oliveira@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14157>
spirv_to_nir sometimes wraps derefs in vec2 or mov instructions as part of
its texture handling. These get in the way of
nir_rematerialize_derefs_in_use_blocks_impl. Running copy propagation
should get rid of the extra move instructions and get us back to intact
deref chains for everything except variable pointer use-cases.
fossil-db (Sienna Cichlid):
Totals from 6 (0.00% of 134572) affected shaders:
CodeSize: 92656 -> 93088 (+0.47%)
Instrs: 17060 -> 17138 (+0.46%)
Latency: 224408 -> 227539 (+1.40%)
InvThroughput: 37402 -> 37924 (+1.40%)
VClause: 408 -> 402 (-1.47%)
Copies: 1065 -> 1107 (+3.94%)
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5668
Fixes: 14a12b771d ("spirv: Rework our handling of images and samplers")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13924>
This fixes CLs 1.2 1Darray and 2Darray images.
Fixes: 589d918a4f
("spirv: Add 0.5 to integer coordinates for OpImageSampleExplicitLod")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13401>
Instead of using gsamplerND types for sampled images, use the new
gtextureND types for sampled images and reserve gsamplerND for combined
image+samplers. Combined image+sampler bindings still get a gsamplerND
type.
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13389>
Use the new nir_var_mem_image mode for images that are not known to be
used with a sampler (i.e. storage images).
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4743>
Now that all drivers have been patched to convert sysvals to input
varyings when they have too, we can safely declare PointCoord as a sysval
too.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13017>
Now that all spirv_to_nir() users take care of converting sysvals to
varyings, we can unconditionally declare FragCoord as a sysval.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13017>
No shader-db or fossil-db changes on any Intel platform.
v2: Keep the flt <-> fge switcharoo local to the SpvOpFUnordLessThan,
etc. handling. Add a comment explaining why the suboptimal
SpvOpFUnordEqual implementation is used here. Suggested by Caio.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12320>
If an OpVariable's initializer is undef, there is no need to
initialize the variable.
v2: Comment the code (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13030>
The LLVM-SPIRV translator creates variables with initializers, but
most of those are actually undef initializers. We can just skip
composites that are entirely made of undefs, but for partially undefs,
we will still zero initialize.
v2: Rename wa_llvm_spirv_undef_initializer to wa_llvm_spirv_ignore_workgroup_initializer (Caio)
Limit workaround to OpenCL (Caio)
Make workaround clearer (Caio)
v3: Only apply workaround on workgroup storage (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13030>
Since only Anv uses the value, I'm only enabling this on anv.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 518693c3ec ("spirv: Handle the SubgroupSize execution mode")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13034>
v2 (Ivan): Add missing capability enum handling.
v3 (idr): Properly handle cases where dest_size != 32.
v4 (idr): Rewrite most of the error checking to use vtn_fail_if. Use
nir_ssa_def with vtn_push_nir_ssa instead of vtn_ssa_value with
vtn_push_ssa_value. All suggested by Jason. Massive rewrite of the
handling of packed 4x8 saturating opcodes. Based on some observations
made by Jason.
v5 (idr): Remove some debugging cruft accidentally added in v4. Noticed
by Jason.
v6: Emit packed versions of vectored instructions when possible.
Suggested by Jason.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12142>
This shouldn't do anything but will make testing a later patch easier.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8056>