Co-authored-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Daniel Schürmann <daniel.schuermann@campus.tu-berlin.de>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The SPIR-V extension wants us to be able to do an AllEqual on any vector
or scalar type. This has two implications:
1) We need to be able to handle vectors so we switch the vote_eq
intrinsics to be vectorized intrinsics.
2) We need to handle floats which have different behavior with respect
to +-0, NaN, etc. than the integer variant so we need two variants.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Someone can make the lowering optional later if they want something
different for their hardware.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
This will be required for SPIR-V subgroup support
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Our previous handling of barriers always used the big hammer and didn't
correctly emit memory barriers when specified along with a control
barrier. This commit completely reworks the way we emit barriers to
make things both more precise and more correct.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
When looking up known glsl_type instances in the various hash tables, we
end up leaking the key instances used for the lookup, as the glsl_type
constructor allocates memory on the global mem_ctx. This patch changes
glsl_type to manage its own memory, which fixes the leak and also allows
getting rid of the global mem_ctx and its mutex.
v2: remove lambda usage (Tapani)
(+keep ASSERT_BITFIELD_SIZE, modify dummy ctor to initialize mem_ctx)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104884
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Simon Hausmann <simon.hausmann@qt.io>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This capability allows gl_ViewportIndex and gl_Layer to also be used
as outputs in Vertex and Tesselation shaders.
v2: Make conditional to the capability, add gl_Layer, add tesselation
shaders. (Iago)
v3: Don't export to tesselation control shader.
v4: Add Reviewd-by tag.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
This reverts commit 2d36efdb7f.
This raised limit turns out to harmful for more complex shaders,
it causes excessive spilling in some Bioshock Infinite shaders.
The fps for the ssao demo on radv remains unchanged when reverting
this.
Reviewed-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
These transformations are inexact because section 4.7.1 (Range and
Precision) says:
Operations and built-in functions that operate on a NaN are not
required to return a NaN as the result.
The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This transformation is inexact because section 4.7.1 (Range and
Precision) says:
Operations and built-in functions that operate on a NaN are not
required to return a NaN as the result.
The fmin or fmax might not return NaN in cases where the original
expression would be required to return NaN.
v2: Reorder operands and mark as inexact. The latter suggested by
Jason.
shader-db results:
Haswell, Broadwell, and Skylake had similar results. (Skylake shown)
total instructions in shared programs: 14514817 -> 14514808 (<.01%)
instructions in affected programs: 229 -> 220 (-3.93%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4
helped stats (rel) min: 2.86% max: 4.12% x̄: 3.70% x̃: 4.12%
total cycles in shared programs: 533145211 -> 533144939 (<.01%)
cycles in affected programs: 37268 -> 36996 (-0.73%)
helped: 8
HURT: 0
helped stats (abs) min: 2 max: 134 x̄: 34.00 x̃: 2
helped stats (rel) min: 0.02% max: 14.22% x̄: 3.53% x̃: 0.05%
Sandy Bridge and Ivy Bridge had similar results. (Ivy Bridge shown)
total cycles in shared programs: 257618409 -> 257618403 (<.01%)
cycles in affected programs: 12582 -> 12576 (-0.05%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.05% max: 0.05% x̄: 0.05% x̃: 0.05%
No changes on Iron Lake or GM45.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reduces my build from 2075 warnings to 2023 warnings by silencing 52
instances of things like
src/compiler/nir/nir_constant_expressions.c: In function ‘evaluate_bfi’:
src/compiler/nir/nir_constant_expressions.c:1812:61: warning: unused parameter ‘bit_size’ [-Wunused-parameter]
evaluate_bfi(MAYBE_UNUSED unsigned num_components, unsigned bit_size,
^~~~~~~~
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
var->name could be NULL under ARB_gl_spirv for example. And in any
case, the code is already handing var name being NULL when reading a
variable, so it is consistent to do it writing a variable too.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
The introduction of 16-bit types with VK_KHR_16bit_storages implies that
push constant offsets could be multiple of 2-bytes. Some assertions are
updated so offsets should be just multiple of size of the base type but
in some cases we can not assume it as doubles aren't aligned to 8 bytes
in some cases.
For 16-bit types, the push constant offset takes into account the
internal offset in the 32-bit uniform bucket adding 2-bytes when we access
not 32-bit aligned elements. In all 32-bit aligned cases it just becomes 0.
v2: Assert offsets to be aligned to the dest type size. (Jason Ekstrand)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Range in 16-bit push constants load was being calculated
wrongly using 4-bytes per element instead of 2-bytes as it
should be.
v2: Use glsl_get_bit_size instead of if statement
(Jason Ekstrand)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
According to GLSL ES 3.2 spec, see table in 9.2.1 "Linked Shaders"
section, the precision qualifier should match for uniform variables.
This also applies to previous GLSL ES 3.x specs.
This 'if' checks the condition for uniform variables, while for UBOs
it is checked in link_interface_blocks.cpp.
Fixes: b50b82b8a5
("glsl/es31: precision qualifier doesn't need to match in shader interface block members")
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
From the GLSL 4.60 spec Section 5.9 (Expressions):
"Dividing by zero does not cause an exception but does result in
an unspecified value."
Fixes: 89285e4d47 "nir: add new constant folding infrastructure"
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105271
In order to fix a build failure on compilers not implementing
unrestricted unions, which is a C++11 feature.
v2: Provide signed integer comparison and assignment operators instead
of BITSET_WORD ones to avoid spurious ambiguity warnings on
comparisons with a signed integer literal.
Fixes: ba79a90fb5 "glsl: Switch ast_type_qualifier to a 128-bit bitset."
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105238
Tested-by: Roland Scheidegger <sroland@vmware.com>
Tested-By: George Kyriazis <george.kyriazis@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
This requires passing an extra argument to the lowering pass because
the KHR_blend_equation_advanced specification doesn't seem to define
any mechanism for the implementation to determine at compile-time
whether coherent blending can ever be used (not even an "#extension
KHR_blend_equation_advanced_coherent" directive seems to be required
in the shader source AFAICT).
In the long run we'll probably want to do state-dependent recompiles
based on the value of ctx->Color.BlendCoherent, but right now there
would be no benefit from that because the only driver that supports
coherent framebuffer fetch is i965 on SKL+ hardware, which are unable
to support the non-coherent path for the moment because of texture
layout issues, so framebuffer fetch coherency is always enabled for
them.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
This allows the application to request framebuffer fetch coherency
with per-fragment output granularity. Coherent framebuffer fetch
outputs (which is the default if no qualifier is present for
compatibility with older versions of the EXT_shader_framebuffer_fetch
extension) will have ir_variable_data::memory_coherent set to true.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
At the same point where it is initialized on GL(ES) 3.0+ so we can
implement some common layout qualifier handling in a future commit.
Until now the fb_fetch_output flag would be inherited from the
original implicit gl_LastFragData declaration at a later point in the
AST to GLSL IR translation.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
This should end the drought of bits in the ast_type_qualifier object.
The bitset_t type works pretty much as a drop-in replacement for the
current uint64_t bitset.
The only catch is that the bitset_t type as defined in the previous
commit doesn't have a trivial constructor (because it has a
user-defined constructor), so it cannot be used as union member
without providing a user-defined constructor for the union (which
causes it in turn to be non-trivially constructible). This annoyance
could be easily addressed in C++11 by declaring the default
constructor of bitset_t to be the implicitly defined one -- IMO one
more reason to drop support for GCC 4.2-4.3.
The other minor change was required because glsl_parser_extras.cpp was
hard-coding the type of bitset temporaries as uint64_t, which (unlike
would have been the case if the uint64_t had been replaced with
e.g. an __int128) would otherwise have caused a build failure, because
the boolean conversion operator of bitset_t is marked explicit (if
C++11 is available), so the bitset won't be silently truncated down to
1 bit in order to use it to initialize the uint64_t temporaries
(yikes).
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
Similar for the 4 case.
Suggested by Bas.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Otherwise the code size increases because the original fexp2()
instructions can't be deleted.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Both KHR_blend_equation_advanced and ARB_bindless_texture provide
layout qualifiers, and are exposed in compatibility contexts. We
need to parse the layout qualifier as a token in order for those
to work, but forgot to extend this check.
ARB_shader_image_load_store would need a similar treatment, but we
don't expose that in legacy OpenGL contexts.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105161
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
This was originally intended to make sure the remap location
was not -1. However the code has changed alot since then,
the location is now never set to -1 and we also handle
components meaning this old assert has been doing comparisions
with the pointer to the array of component data.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105183
Fixes: d32956935e ("glsl: Walk a list of ir_dereference_array to mark array elements as accessed")
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
They should probably get unit tests implemented, but this cleans up a
bunch of warnings in my build for now.
Fixes: 59f458cd87 ("glsl: Add 16-bit types")
Cc: Eduardo Lima Mitev <elima@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Doesn't really change anything to the test though ¯\_(ツ)_/¯
CID: 1429511
Fixes: e8495646af "glsl/tests: changes to test_disk_cache_create test"
Signed-off-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Let's use the new gl_state_index16 type everywhere and remove
the typecasts.
This helps reduce the size of gl_program_parameter.
Reviewed-by: Brian Paul <brianp@vmware.com>
This is already handled at link_uniform_blocks, specifically at
process_block_array_leaf.
Additionally, this code was not handling correctly arrays of
arrays. When creating the name of the block to set the binding, it
only took into account the first level, so any attempt to set a
explicit binding on a array of array ubo would trigger an assertion.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Add a build option to control building some of the misc tools we
have. Also set the executables to install, presumably you want
that if you're asking for the build.
v2: set 'install:' to the with_tools value, not true (Jordan)
handle 'all' in a the comma list (Dylan)
Add freedreno's tools (Dylan)
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Next patch will allow disk_cache instance to be created without
path set for it, modify some test cases that assume disk_cache
creation to fail with invalid path. Creation should succeed but
simple put/get test fail.
v2: leave tests as is but check that both cache struct exists
and try simple put/get that should fail with invalid path set
(Emil)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Patch moves functions higher so that we can utilize them from
test_disk_cache_create which is modified by next patch.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
LLVM can't shrink loads.
Polaris10:
Totals from affected shaders:
SGPRS: 62528 -> 59955 (-4.11 %)
VGPRS: 44708 -> 44616 (-0.21 %)
Spilled SGPRs: 16 -> 8 (-50.00 %)
Code Size: 1355504 -> 1355172 (-0.02 %) bytes
Max Waves: 11710 -> 11670 (-0.34 %)
Vega10:
Totals from affected shaders:
SGPRS: 51448 -> 50371 (-2.09 %)
VGPRS: 39140 -> 39048 (-0.24 %)
Spilled SGPRs: 16 -> 16 (0.00 %)
Code Size: 1307188 -> 1304296 (-0.22 %) bytes
Max Waves: 11312 -> 11292 (-0.18 %)
This reduces SGPRs spilling in MadMax, and it also reduces
number of SGPRs in DOW3 and F12017. The number of waves slightly
decreases in F1 but I don't see any performance changes after
benchmarking it. Talos and Serious Sam are not affected because
they don't use any push constants.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This is a very simple pass that just shrinks load_push_constant
intrinsics when some components are unused. For now, it can just
shrink vec4 to vec3, vec3 to vec2 and so on.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
The SPIR-V parser splits in/out struct variables and creates
a separate variable for each first-level member of the struct.
When the struct variable has an initializer this means that we also
need to split the initializer.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
According with OpenGL GLSL 3.20 spec, section 4.3.9:
"It is a link-time error if any particular shader interface
contains:
- two different blocks, each having no instance name, and each
having a member of the same name, or
- a variable outside a block, and a block with no instance name,
where the variable has the same name as a member in the block."
This fixes a previous commit 9b894c8 ("glsl/linker: link-error using the
same name in unnamed block and outside") that covered this case, but
did not take in account that precision qualifiers are ignored when
comparing blocks with no instance name.
With this commit, the original tests
KHR-GL*.shaders.uniform_block.common.name_matching keep fixed, and also
dEQP-GLES31.functional.shaders.linkage.uniform.block.differing_precision
regression is fixed, which was broken by previous commit.
v2: use helper varibles (Matteo Bruni)
Fixes: 9b894c8 ("glsl/linker: link-error using the same name in unnamed block and outside")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104668
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104777
CC: Mark Janes <mark.a.janes@intel.com>
CC: "18.0" <mesa-stable@lists.freedesktop.org>
Tested-by: Matteo Bruni <matteo.mystral@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
The materials are now moved to the end of the
generic attributes block to the range 4-15.
Before, the way the position and generic 0 attribute
is handled was dependent on the presence and kind of
the currently attached vertex program. With this
change the way the position attribute and the generic 0
attribute is treated only depends on the enabled
flag of those two arrays.
This will later help to untangle the update dependencies
between enabled arrays and shader inputs.
v2: s,VERT_ATTRIB_MAT_OFFSET,VERT_ATTRIB_MAT0,g
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
Instead of just assuming that the material attributes
just overlap with the generic attributes 0-12, give
them symbolic defines so that we can easier move them
to an other range.
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
Reviewed-by: Brian Paul <brianp@vmware.com>
min(a+b, c+d) >= 0 becomes (a+b >= 0 && c+d >= 0).
No shader-db changes, but it does prevent 6 to 12 instruction
regressions in the next patch on all measured Intel platforms.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
v2: Rebase on almost 2 years. Require that one of the arguments to fmin
or fmax be used only once. This prevents some regressions.
shader-db results:
Skylake and Broadwell had similar results. Skylake shown.
total instructions in shared programs: 14526021 -> 14525913 (<.01%)
instructions in affected programs: 4613 -> 4505 (-2.34%)
helped: 31
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 3.48 x̃: 4
helped stats (rel) min: 0.62% max: 6.67% x̄: 3.31% x̃: 2.42%
total cycles in shared programs: 533118710 -> 533118403 (<.01%)
cycles in affected programs: 34334 -> 34027 (-0.89%)
helped: 24
HURT: 0
helped stats (abs) min: 4 max: 24 x̄: 12.79 x̃: 14
helped stats (rel) min: 0.25% max: 2.40% x̄: 1.08% x̃: 1.03%
No changes on GM45, Iron Lake, Sandy Bridge, Ivy Bridge, or Haswell.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
We need this because we will always copy fs outputs to temps and
split the arrays, but do not want to do either of these with fs
inputs as it is unnessisary and makes handling interpolateAt
builtins difficult.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
c2acf97fcc changed the use of double_inputs_read to be
inconsitent with its previous meaning. Here we re-enable the
gather info code that was removed as the modified code from
c2acf97fcc now uses the double_inputs member rather than
double_inputs_read.
This change allows us to use double_inputs_read with gallium
drivers without impacting double_inputs which is used by i965.
We also make use of the compiler option vs_inputs_dual_locations
to allow for the difference in behaviour between drivers that handle
vs inputs as taking up two locations for doubles, versus those that
treat them as taking a single location.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Allows nir drivers to either use a single or dual locations for
vs double inputs.
i965 uses dual locations for both OpenGL and Vulkan drivers, for
now gallium OpenGL drivers only use a single location.
The following patch will also make use of this option when
calling nir_shader_gather_info().
Reviewed-by: Karol Herbst <kherbst@redhat.com>
First we move double_inputs_read into a vs struct in the union,
double_inputs_read is only used for vs inputs so this will
save space and also allows us to add a new double_inputs field.
We add the new field because c2acf97fcc changed the behaviour
of double_inputs_read, and while it's no longer used to track
actual reads in i965 we do still want to track this for gallium
drivers.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
This change cleans following scary warnings in valgrind output
when disk cache is being written:
==6532== Uninitialised byte(s) found during client check request
==6532== at 0x14423FAD: blob_write_bytes (blob.c:152)
==6532== by 0x144240FB: blob_write_uint32 (blob.c:194)
==6532== by 0x144001A5: write_tex (nir_serialize.c:613)
and later (loads of):
==6532== Use of uninitialised value of size 8
==6532== at 0x62FCD9E: crc32_z (in /usr/lib64/libz.so.1.2.11)
==6532== by 0x13F65014: util_hash_crc32 (crc32.c:127)
==6532== by 0x13F5DABA: cache_put (disk_cache.c:947)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Namely extend the EXTRA_DIST list, instead of re-assigning it and bring
back a file dropped by mistake.
Fixes: 436ed65d38 ("autotools: include meson build files in tarball")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Daniel Stone <daniels@collabora.com>
This adds the meson.build, meson_options.txt, and a few scripts that are
used exclusively by the meson build.
v2: - Remove accidentally included changes needed to test make dist with
LLVM > 3.9
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Acked-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
According with OpenGL GLSL 4.20 spec, section 4.3.9, page 57:
"It is a link-time error if any particular shader interface
contains:
- two different blocks, each having no instance name, and each
having a member of the same name, or
- a variable outside a block, and a block with no instance name,
where the variable has the same name as a member in the block."
This means that it is a link error if for example we have a vertex
shader with the following definition.
"layout(location=0) uniform Data { float a; float b; };"
and a fragment shader with:
"uniform float a;"
As in both cases we refer to both uniforms as "a", and thus using
glGetUniformLocation() wouldn't know which one we mean.
This fixes KHR-GL*.shaders.uniform_block.common.name_matching.
v2: add fixed tests (Tapani)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
To avoid compilation warnings and because this helper
shouldn't update anything.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This creates two new internal dependencies, idep_nir_headers and
idep_nir. The former encapsulates the generation of nir_opcodes.h and
nir_builder_opcodes.h and adding src/compiler/nir as an include path.
This ensures that any target that needs nir headers will have the
includes and that the generated headers will be generated before the
target is build. The second, idep_nir, includes the first and
additionally links to libnir.
This is intended to make it easier to avoid race conditions in the build
when using nir, since the number of consumers for libnir and it's
headers are quite high.
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Don't use intermediate variables, use consistent whitespace.
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
Currently the meosn build has a mix of two styles:
arg : [foo, ...
bar],
and
arg : [
foo, ...,
bar,
]
For consistency let's pick one. I've picked the later style, which I
think is more readable, and is more common in the mesa code base.
v2: - fix commit message
Acked-by: Eric Engestrom <eric.engestrom@imgtec.com>
Signed-off-by: Dylan Baker <dylan.c.baker@intel.com>
If MaxAttribs were ever raised to 32, undefined behavior would occur.
We had already gone to the effort (albeit incorrectly) handle this in
one case, so fix them all.
CID: 1369628
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
If max_index were ever 32, the linker would have marked all 32
locations as invalid instead of marking none of them as invalid. It's
a good thing the maximum value actually set by any driver for
MaxAttribs is 16.
Found by inspection while investigating CID 1369628.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
All cases where the result could be non-visit_continue would have
already returned.
CID: 401351, 1224465, 1224466
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
None of these are necessary because result->type is the only thing used
outside the giant switch-statement.
CID: 1230983, 1230984
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Intel was the only user and now NIR can do the lowering.
v2: do not try to handle it as a system value directly for the SPIR-V
path. In GL we rather handle it as a uniform like we do for the
GLSL path (Jason).
v3: drop LowerTESPatchVerticesIn as well (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
New generated files from:
bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
65fc16c974 ("autotools: set XA versions in configure.ac and configure header file")
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Technically, the GLSLang bug related to this can also affect SSBO writes
where the bool -> uint conversion is missing. However, the only known
shipping application with an old enough version of GLSLang to cause
issues with this is the new DOOM game so we keep the workaround as small
as possible.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104424
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Previously, we were storing a pointer to the vtn_value because we use it
to look up decorations when we create input/output variables. This
works, but it also may be useful to have the id itself so we may as well
store that instead.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Now that higher levels are enforcing decoration sanity, we don't need
the vtn_asserts here. This function *should* be safe but we still want
a few well-placed regular asserts in case something goes awry.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This reworks the error checking on our generic handling of decorations.
The objective is to validate all of the SPIR-V assumptions we make
up-front and convert redundant checks to compiled-out asserts. The most
important part of this is to ensure that member decorations only occur
on OpTypeStruct and that the member is never out-of-bounds. This way
later code can assume that the member is sane and not have to worry
about OOB array access due to a misplaced OpMemberDecorate.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This is a bit simpler since we have fewer enum values in the case. It's
also a bit more efficient because we're making fewer glsl_get_* calls.
While we're at it, add better type validation.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Now that vtn_base_type is a real and full base type, we can switch on
that instead of the GLSL base type which is a lot fewer cases in our
switch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Respect the std430 rules for determining offset and size of struct
members when using a std430 buffer. std140 rules lead to wrong buffer
offsets in that case.
Fixes my test case attached in Bugzilla. No piglit changes.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104492
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Using 4, as it is the default value on mesa. See mesa/main/config.h
and the following commit that introduced the value:
15ac66e331
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
ARB_transform_feedback3 sets a minimum of 1, ARB_gpu_shader5 a minimum
of 4. It shouldn't matter too much, so choosing the later.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Used to handle how many ubo you can define on the context. Minimimum
defined as 36 on ARB_uniform_buffer_object spec, up to 84 on OpenGL
4.6 (12 per stage at each moment).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Every now and then I execute the standalone compiler, get the
non-version error, and need to remember what I'm doing wrong
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
For GL_ARB_compute_variable_group_size
Reported-by: Karol Herbst <karolherbst@gmail.com>
Signed-off-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This reverts commit 9702fac68e, which
hangs vulkancts and crucible on all platforms.
The patch is being reverted because it disables continuous integration
testing. The patch from bug 104359 does not apply to master.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104359
This fixes a varying packing issue when using transform feedback in
GL_SEPARATE_ATTRIBS mode. By time we get to linking, we already
know that the number of feedback attributes is under the
GL_MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS limit so packing isn't
as critical. In fact, packing/splitting vec3 attributes can cause
trouble because splitting effectively creates another TFB output
which can exceed device limits. So, disable vec3 packing when it's
not needed to avoid that issue.
Fixes the Piglit ext_transform_feedback-separate test on VMware
driver.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
The mix of bitwise operators with * and + to compute the packing_class
values was a little weird. Just use bitwise ops instead.
v2: add assertion to make sure interpolation bits fit without collision,
per Timothy. Basically, rewrite function to be simpler.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Fixes: bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
When walking over all the cases in a OpSwitch, take in account the bitsize
of the literals to avoid getting wrong cases.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
If we don't remap and output this code would trample the outputs
read bits.
This fixes a regression in
dEQP-VK.tessellation.shader_input_output.barrier
Fixes: 1c9c42d16b (nir: add varying component packing helpers)
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
The Talos Principle contains shaders with an OpSelect between two
vectors where the condition is a scalar boolean. This is technically
against the spec bout nir_builder gracefully handles it by splatting
out the condition to all the channels. So long as the condition is a
boolean, just emit a warning instead of failing.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104246
We want the clamping of the coordinate to apply after the offset, so we
need to do math to lower the offset out of the instruction. Fixes texwrap
offset cases for GL_CLAMP with GL_NEAREST on vc5.
Note: I moved the get_texture_size() verbatim, so that it was defined
before use.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Commit bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
added generation of vtn_gather_types.c, but forgot to add it to the
Android build files.
Fixes: bb1e6ff161 ("spirv: Add a prepass to set types on vtn_values")
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Signed-off-by: Rob Herring <robh@kernel.org>
Pointers with no storage type are converted to inout variables but SSA
values and pointers with a storage type (which turns into a uint or
uvec2) are just input variables.
Previously, we just gave them exactly the same type as the respective
image (which already had a sampler2D or similar type). Now they have
their own base type and a pointer to the vtn_type for the image.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Instead of calling vtn_add_case for the default case and then looping,
add an is_default variable and do everything inside the loop. This will
make the next commit easier.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This autogenerated pass will automatically find and set the type field
on all vtn_values. This way we always have the type and can use it for
validation and other checks.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
At the moment, this just lets us drop the const_type for constants and
unify things a bit. Eventually, we will use this to store the types of
all SPIR-V SSA values.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Since we switched over to lowering SLM access directly in SPIR-V -> NIR,
we no longer have vtn_variables for SLM. It's all safe as with UBOs and
SSBOs but we need to let it through in the assert.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104213
Fixes: 8761a04d0d
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
There is no chain, so checking the length ends with a SEGFAULT.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103579
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The ARB_get_program_binary extension requires that uniform values in a
program be restored to their initial value just after linking.
This patch saves off the initial values just after linking. When the
program is restored by glProgramBinary, we can use this to copy the
initial value of uniforms into UniformDataSlots.
V2 (Timothy Arceri):
- Store UniformDataDefaults only when serializing GLSL as this
is what we want for both disk cache and ARB_get_program_binary.
This saves us having to come back later and reset the Uniforms
on program binary restores.
Signed-off-by: Timothy Arceri <tarceri@itsqueeze.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> (v1)
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
This will allow us to use the program serialization to implement
ARB_get_program_binary.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
To avoid any vulkan driver to include the GL mtypes.h. Renamed as
eventually this could be used by drivers not using nir.
v2: remove compiler/spirv/spirv.h from mtypes (Alejandro)
v3: added the definition at compiler/shader_info.h (Jason Ekstrand)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In that case, nir_eval_const_opcode() will evaluate the conversion
but as it was using destination's bit_size, the resulting
value was just a cast of the source constant value. By passing the
source's bit size, it does the conversion properly.
Fixes:
dEQP-VK.spirv_assembly.instruction.*.opspecconstantop.*convert*
v2:
- Remove invalid conversion op cases.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This fixes a crash in:
KHR-GL45.enhanced_layouts.xfb_block_stride
Fixes: 0822517936 "glsl: add helper to process xfb qualifiers during linking"
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Until now it was part of spirv_to_nir_options. But it will be used on
the implementation of ARB_gl_spirv and ARB_spirv_extensions, and added
to the OpenGL context, as a way to save what SPIR-V capabilities the
current OpenGL implementation supports.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
SpvOpFConvert now manages the FPRoundingMode decorator for the
returning values enabling the nir_rounding_mode in the conversion
operation to fp16 values.
v2: Fixed breaking of specialization constants. (Jason Ekstrand)
v3: Avoid nir_rounding_mode * casting. (Jason Ekstrand)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2: Added more missing implementations of 16-bit types. (Jason Ekstrand)
v3: Store values in values[0].u16[i] (Jason Ekstrand)
Include switches based on bitsize for 16-bit types
(Chema Casanova)
v4: Coding style fixes (Jason Ekstrand)
Use vtn_u64_literal and u64[0] at 64-bit SpvOpConstant (Jason Ekstrand)
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
nir_type_conversion enables new operations to handle rounding modes to
convert to fp16 values. Two new opcodes are enabled nir_op_f2f16_rtne
and nir_op_f2f16_rtz.
The undefined behaviour doesn't has any effect and uses the original
nir_op_f2f16 operation.
v2: Indentation fixed (Jason Ekstrand)
v3: Use explicit case for undefined rounding and assert if
rounding mode is used for non 16-bit float conversions
(Jason Ekstrand)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This will include the following NIR ALU opcodes:
* nir_op_i2i16
* nir_op_i2f16
* nir_op_u2u16
* nir_op_u2f16
* nir_op_f2i16
* nir_op_f2u16
* nir_op_f2f16
v2: Remove "from" 16-bit in commit subject (Topi Pohjolainen)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
v2: Renamed glsl_half_float_type() to glsl_float16_t_type().
(Jason Ekstrand)
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Adds new INT16, UINT16 and FLOAT16 base types.
The corresponding GL types for half floats were reused from the
AMD_gpu_shader_half_float extension. The int16 and uint16 types come from
NV_gpu_shader_5 extension.
This adds the builtins and the lexer support.
To avoid a bunch of warnings due to cases not handled in switch, the
new types have been added to a few places using same behavior as
their 32-bit counterparts, except for a few trivial cases where they are
already handled properly. Subsequent patches in this set will provide
correct 16-bit implementations when needed.
v2: * Use FLOAT16 instead of HALF_FLOAT as name of the base type.
* Removed float16_t from builtin types.
* Don't copy 16-bit types as if they were 32-bit values in
copy_constant_to_storage().
* Use get_scalar_type() instead of adding a new custom switch
statement.
(Jason Ekstrand)
v3: Use GL_FLOAT16_NV instead of GL_HALF_FLOAT for consistency
(Ilia Mirkin)
v4: Add missing 16-bit base types support in glsl_to_nir (Eduardo Lima).
v5: Fix coding style (Topi Poholainen).
Signed-off-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Eduardo Lima <elima@igalia.com>
Signed-off-by: Alejandro Piñeiro <apinheiro@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
The SPIR-V spec is a bit underspecified when it comes to exactly how
you're allowed to use OpPtrAccessChain and what it means in certain edge
cases. In particular, what if the base pointer of the OpPtrAccessChain
points to the base struct of an SSBO instead of an element in that SSBO.
The original variable pointers implementation in mesa assumed that you
weren't allowed to do an OpPtrAccessChain that adjusted the block index
and asserted such. However, there are some CTS tests that do this and,
if the CTS does it, someone will do it in the wild so we should probably
handle it. With this commit, we significantly reduce our assumptions
and should be able to handle more-or-less anything.
The one assumption we still make for correctness is that if we see an
OpPtrAccessChain on a pointer to a struct decorated block that the block
index should be adjusted. In theory, someone could try to put an array
stride on such a pointer and try to make the SSBO an implicit array of
the base struct and we would not give them what they want. That said,
any index other than 0 would count as an out-of-bounds access which is
invalid.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
This is required for being able to handle OpPtrAccessChain in SPIR-V
where the base type of the incoming pointer requires us to add to the
block index instead of the byte offset.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Before, we always left workgroup variables as shared nir_variables and
let the driver call nir_lower_io. This adds an option to do the
lowering directly in spirv_to_nir. To do this, we implicitly assign the
variables a std430 layout and then treat them like a UBO or SSBO and
immediately lower all the way to an offset.
As a side-effect, the spirv_to_nir pass now handles variable pointers
for workgroup variables.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Up until now, all pointers have been ivec2s. We're about to add support
for pointers to workgroup storage and those are going to be uints.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
There is no good reason why we should have the same logic repeated in
get_vulkan_resource_index and vtn_ssa_offset_pointer_dereference. If
we're a bit more careful about how we do things, we can just use the one
function and get rid of the other entirely. This also makes the push
constant special case a lot more clear.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
This commit moves them both into vtn_variables.c towards the top, makes
them take a vtn_builder, and replaces a hand-rolled instance of
is_external_block with a function call.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
This makes us key off of !offset instead of !block_index. It also puts
the guts inside a switch statement so that we can handle more than just
UBOs and SSBOs.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
This parallels what we do for vtn_block_load except that we don't yet
support anything except SSBO loads through this path.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
This is equivalent and means we don't have resource index code scattered
about.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
These helpers are much nicer than just using assert because they don't
kill your process. Instead, it longjmps back to spirv_to_nir(), cleans
up all the temporary memory, and nicely returns NULL. While crashing is
completely OK in the Vulkan world, it's not considered to be quite so
nice in GL. This should help us to make SPIR-V parsing much more
robust. The one downside here is that vtn_assert is not compiled out in
release builds like assert() is so it isn't free.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>
We may as well log the source language and file name.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <idr@freedesktop.org>