Rather than hardcoding glcpp/other use `basename "$0"` which expands
appropriatelly.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Current definitions work fine for the manual invokation of the script,
although the whole script does not consider that one can run it OOT.
The latter will be handled with latter patches, although it will be
extensively using the two variables.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
The current "let's print any folder which exists" is simply confusing.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
The relative/absolute path brings little to no benefit in being printed
as testname. Trim it out.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
We don't want to lie ourselves that 'everything is fine' when no tests
were found/ran.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Rather than hardcoding the binary location (which ends up wrong in a
number of occasions) in the python script, pass it as argument.
This allows us to remove a couple of dirname/basename workarounds that
aimed to keep this working, and succeeded in the odd occasion.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
v2: use -eq over a string comparison (Eric)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
At the moment we look for generator script(s) in builddir while they
are in srcdir, and we proceed to generate the tests and expected output
in srcdir, which is not allowed.
To untangle:
- look for the generator script in the correct place
- generate the files in builddir, by extending create_test_cases.py to
use --outdir
With this in place the test passes `make check' for OOT builds - would
that be as standalone or part of `make distcheck'
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Bail out early if the script is not where we expect it to be.
v2: use -f instead of -e. latter returns true on folder(s)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Now that we have srcdir we can use it to correctly manage/point to the
script. Effectively fixing OOT invokation of `make check'.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
There is no robust way to detect either one, so simply hope for the best
and warn just in case.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Otherwise we'll fail when invoking the script outside of "make check"
v2: use -ne over a string comparison (Eric)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Spamming the log with the (in some cases extremely long) test location
is of limited use.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
We don't want to lie ourselves that 'everything is fine' when no tests
were found/ran.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Before this commit, we would effectively fail to run any of the test in
a OOT builds.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
There is no robust way to detect either one, so simply hope for the best
and warn just in case.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
... or non-executable, in particular.
v2: use test -x (Eric)
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
We're going to use them with the next commits to determine where to put
the generated tests and/or built binaries.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
With later commits we'll fix the generators to produce the files in the
correct location. That in itself will cause an issue since the files
will be left dangling and make distcheck will fail.
v2: Use -r only as needed (Eric)
Cc: Matt Turner <mattst88@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Previously we would just escape the loop and move everything
following the loop inside the if to the else branch of a new if
with a return flag conditional. However everything outside the
if the loop was nested in would still get executed.
Adding a new return to the then branch of the new if fixes this
and we just let a follow pass clean it up if needed.
Fixes:
tests/spec/glsl-1.10/execution/vs-nested-return-sibling-loop.shader_test
tests/spec/glsl-1.10/execution/vs-nested-return-sibling-loop2.shader_test
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
From page 45 (page 52 of the PDF) of the GLSL ES 3.00 v.6 spec:
" When instance names are present on matched block names, it is
allowed for the instance names to differ; they need not match for
the blocks to match.
From page 51 (page 57 of the PDF) of the GLSL 4.30 v.8 spec:
" When instance names are present on matched block names, it is
allowed for the instance names to differ; they need not match for
the blocks to match."
Therefore, no cross linking validation is needed for the instance name
of an Interface Block.
This patch will make that no link error will be reported on a program
like this:
"# VS
layout(binding = 1) Block1 {
vec4 color;
} uni_block;
...
# FS
layout(binding = 2) Block2 {
vec4 color;
} uni_block;
..."
Fixes GL45-CTS.enhanced_layouts.ssb_layout_qualifier_conflict
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
From page 140 (page 147 of the PDF) of the GLSL ES 3.10 v.4 spec:
" 9.2 Matching of Qualifiers
The following tables summarize the requirements for matching of
qualifiers. It applies whenever there are two or more matching
variables in a shader interface.
Notes:
1. Yes means the qualifiers must match.
...
9.2.1 Linked Shaders
| Qualifier | Qualifier | in/out | Default | uniform | buffer|
| Class | | | Uniforms | Block | Block |
...
| Layout | binding | N/A | Yes | Yes | Yes |"
From page 93 (page 110 of the PDF) of the GL 4.2 (Core Profile) spec:
" 2.11.7 Uniform Variables
...
Uniform Blocks
...
When a named uniform block is declared by multiple shaders in a
program, it must be declared identically in each shader. The
uniforms within the block must be declared with the same names and
types, and in the same order. If a program contains multiple
shaders with different declarations for the same named uniform
block differs between shader, the program will fail to link."
From page 129 (page 150 of the PDF) of the GL 4.3 (Core Profile) spec:
" 7.8 Shader Buffer Variables and Shader Storage Blocks
...
When a named shader storage block is declared by multiple shaders
in a program, it must be declared identically in each shader. The
buffer variables within the block must be declared with the same
names, types, qualification, and declaration order. If a program
contains multiple shaders with different declarations for the same
named shader storage block, the program will fail to link."
Therefore, if the binding qualifier differs between two linked Uniform
or Shader Storage Blocks of the same name, a link error should happen.
This patch will make that a link error will be reported on a program
like this:
"# VS
layout(binding = 1) Block {
vec4 color;
} uni_block1;
...
# FS
layout(binding = 2) Block {
vec4 color;
} uni_block2;
..."
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
While it's legal to have an active blocks count > 0 on link failure.
Unless we actually assign memory for the blocks array we can end up
segfaulting in calls such as glUniformBlockBinding().
To avoid having to NULL check these api calls we simply reset the
block count to 0 if the array was not created.
Signed-off-by: Andres Gomez <agomez@igalia.com>
Cc: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
The checks were only looking at the first byte, while the intention
seems to be to check if the whole sha1 is zero. This prevented all
shaders with first byte zero in their sha1 from being saved.
This shaves around a second from Deus Ex load time on a hot cache.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Even though the programs themselves stay in cache and are loaded, the
shader keys can be evicted separately. If that happens, unnecessary
compiles are caused that waste time, and no matter how many times the
program is re-run, performance never recovers to the levels of first
hot cache run. To deal with this, we need to refresh the shader keys
of shaders that were recompiled.
An easy way to currently observe this is running Deux Ex, then piglit
and Deux Ex again, or deleting just the cache index. The later is
causing over a minute of lost time on all later Deux Ex runs, with this
patch it returns to normal after 1 run.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Fix 'make check' linking errors with glibc < 2.17.
CXXLD glsl/glsl_test
glsl/.libs/libglsl.a(libmesautil_la-u_queue.o): In function `u_thread_get_time_nano':
src/util/../../src/util/u_thread.h:84: undefined reference to `clock_gettime'
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
While at it, also fix up a failure message to not reference timestamp
and gpu dirs as those are no longer being made.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
According to section 14.6 of the Vulkan specification:
"When sample shading is enabled, the x and y components of FragCoord
reflect the location of the sample corresponding to the shader
invocation."
So add a boolean parameter to the lowering pass to select this behavior
when we need it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This allows to get rid of the arch and gpu name directories.
v2: (Timothy Arceri) don't use an opaque data type to store
pointer size and gpu name.
v3: (Timothy Arceri) use blob to store driver keys just make sure
to store null terminator for strings, and make sure blob is
defined by disk_cache and not it's users.
v4: (Timothy Arceri) fix typo, and make ptr_size a uint8_t.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Instead of using a directory, hash the timestamps into the cache keys
themselves. Since there is no more timestamp directory, there is no more
need for deleting the cache of other mesa versions and we rely on
eviction to clean up the old cache entries. This solves the problem of
using several incarnations of disk_cache at the same time, where one
deletes a directory belonging to the other, like when both OpenGL and
gallium nine are used simultaneously (or several different mesa
installations).
v2: using additional blob instead of trying to clone sha1 state
v3: (Timothy Arceri) don't use an opaque data type to store
timestamp.
V4: (Timothy Arceri) use blob to store driver keys just make sure
to store null terminator for strings, and make sure blob is
defined by disk_cache and not it's users.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100091
Fix linking error on CentOS 6.
CXXLD glsl_compiler
glsl/.libs/libstandalone.a(lt16-libmesautil_la-u_queue.o): In function `u_thread_get_time_nano':
src/util/../../src/util/u_thread.h:84: undefined reference to `clock_gettime'
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Fixed with the following command:
perl -pe 'BEGIN{undef $/;} s/ \\\n\n/\n\n/smg' $(find . -name 'Android.*')
Signed-off-by: Rob Herring <robh@kernel.org>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Through the glsl headers we had an odd mix of guards be that
"ifndef", "pragma once" neither or both.
Simplify things by using the more common ones (ifndef) and annotating
all the sources, barring the generated builting header -
builtin_int64.h.
The final header - udivmod64.h - is [seemingly] unused and on its way
out (patch purge it is on the mailing list).
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Acked-by: Vedran Miletić <vedran@miletic.net>
Acked-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>
Fix build with Python < 2.7.
File "src/compiler/nir/nir_builder_opcodes_h.py", line 46, in <module>
from nir_opcodes import opcodes
File "src/compiler/nir/nir_opcodes.py", line 178, in <module>
unop_convert("{}2{}{}".format(src_t[0], dst_t[0], bit_size),
ValueError: zero length field name in format
Fixes: 762a6333f2 ("nir: Rework conversion opcodes")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
There is no need to hardcode it, we can just use blob_key[0].
This is needed because the next patches are going to change how cache
keys are computed.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
This will allow to hash additional data into the cache keys or even
change the hashing algorithm easily, should we decide to do so.
v2: don't try to compute key (and crash) if cache is disabled
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Since we already do fabs on the one source, we're guaranteed to get
positive infinity if we get any infinity at all. Since +inf only has
one IEEE 754 representation, we can use an integer comparison and avoid
all of the ordered/unordered issues.
Cc: Dave Airlie <airlied@redhat.com>
Reviewed-by: Elie Tournier <elie.tournier@collabora.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Previously each time we saw a variable we just created a duplicate
entry in the list. This is particularly bad for loops were we add
everything twice, and then throw nested loops into the mix and the
list was growing expoentially.
This stops the glsl-vs-unroll-explosion test which has 16 nested
loops from reaching the tests mem usage limit in this pass. The
test now hits the mem limit in opt_copy_propagation_elements()
instead.
I suspect this was also part of the reason this pass can be so
slow with some shaders.
Reviewed-by: Thomas Helland <thomashelland90@gmail.com>
Unused/unchecked by any of the callers.
v2: Fix the glsl cases that have crept in since v1
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Currently only a one in one out eviction so if at max_size and
cache files were to constantly increase in size then so would the
cache. Restrict to limit of 8 evictions per new cache entry.
V2: (Timothy Arceri) fix make check tests
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
This should help reduce any overhead added by the shader cache
when programs are not found in the cache.
To avoid creating any special function just for the sake of the
tests we add a one second delay whenever we call dick_cache_put()
to give it time to finish.
V2: poll for file when waiting for thread in test
V3: fix poll delay to really be 100ms, and simplify the wait function
Reviewed-by: Grazvydas Ignotas <notasas@gmail.com>
Apart from avoiding some unneeded size cases, this shouldn't have any
actual functional impact.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The NIR story on conversion opcodes is a mess. We've had way too many
of them, naming is inconsistent, and which ones have explicit sizes was
sort-of random. This commit re-organizes things and makes them all
consistent:
- All non-bool conversion opcodes now have the explicit size in the
destination and are named <src_type>2<dst_type><size>.
- Integer <-> integer conversion opcodes now only come in i2i and u2u
forms (i2u and u2i have been removed) since the only difference
between the different integer conversions is whether or not they
sign-extend when up-converting.
- Boolean conversion opcodes all have the explicit size on the bool and
are named <src_type>2<dst_type>.
Making things consistent also allows nir_type_conversion_op to be moved
to nir_opcodes.c and auto-generated using mako. This will make adding
int8, int16, and float16 versions much easier when the time comes.
Reviewed-by: Eric Anholt <eric@anholt.net>
The original version was very convoluted and tried way too hard to not
just have the nested switch statement that it needs. Let's just write
the obvious code and then we know it's correct. This fixes a bunch of
missing cases particularly with int64.
Reviewed-by: Plamena Manolova <plamena.manolova@intel.com>
The original bit-size validation wasn't capable of properly dealing with
instructions with variable bit sizes. An attempt was made to handle it
by looking at source and destinations but, because the validation was
done in validate_alu_(src|dest), it didn't really have the needed
information. The new validation code is much more straightforward and
should be more correct.
Reviewed-by: Eric Anholt <eric@anholt.net>
We've always required bit sizes to match but the rules for number of
components have been a bit loose. You've never been allowed to source
from something with less components than you consume, but more has
always been fine. This changes the validator to require that they match
exactly. The fact that they don't always match has been a source of
confusion in NIR for quite some time and it's time we got rid of it.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Using coord_components of the source texture is correct for everything
except cube maps where it's off by one.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Some SPIR-V texturing instructions pack more than the texture coordinate
into the coordinate source. We need to mask off the unused channels.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
In the near future we are going to require that the num_components in a
src dereference match the num_components of the SSA value being
dereferenced. To do that, we need copy_prop to not remove our MOVs from
a larger SSA value into an instruction that uses fewer channels.
Because we suddenly have to know how many components each source has,
this makes the pass a bit more complicated. Fortunately, copy
propagation is the only pass that cares about the number of components
are read by any given source so it's fairly contained.
Shader-db results on Sky Lake:
total instructions in shared programs: 13318947 -> 13320265 (0.01%)
instructions in affected programs: 260633 -> 261951 (0.51%)
helped: 324
HURT: 1027
Looking through the hurt programs, about a dozen are hurt by 3
instructions and the rest are all hurt by 2 instructions. From a
spot-check of the shaders, the story is always the same: They get a
vec4 from somewhere (frequently an input) and use the first two or three
components as a texture coordinate. Because of the vector component
mismatch, we have a mov or, more likely, a vecN sitting between the
texture instruction and the input. This means that the back-end inserts
a bunch of MOVs and split_virtual_grfs() goes to town. Because the
texture coordinate is also used by some other calculation, register
coalesce can't combine them back together and we end up with an extra 2
MOV instructions in our shader.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Because we optimistically skip compiling shaders if we have seen them
before we may need to compile them later at link time if they haven't
yet been use in a specific combination to create a program.
Rather than always recompiling we take advantage of the
gl_compile_status enum introduced in the previous patch to only
compile when we have previously skipped compilation.
This helps with regressions in app start-up times on cold cache
runs, compared with no cache.
Deus Ex: Mankind Divided start-up times:
cache disabled: ~3m15s
cold cache master: ~4m23s
cold cache with this patch: ~3m33s
Acked-by: Marek Olšák <marek.olsak@amd.com>
This will allow us to tell if a shader really has been compiled or
if the shader cache has just seen it before.
Acked-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Andreas Boll <andreas.boll.dev@gmail.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
All of the scripts are [must be] executed via $PYTHON2 [or equivalent]
hence why they are missing the execute bit.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Nearly all the python scripts used in-tree are invoked via $PYTHON2 or
equivalent. As such having the execute bit not needed and generally
ill-advised.
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
This makes it easier/clearer as to:
- if the file should have the execute bit set (.py should not)
- do we need the shebang in the first place and if so what it should be
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
This allows iterating list nodes from a given start point instead of
necessarily the list head.
Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Since blob is intended for serializing data, it's not a good idea to
leave padding holes with uninitialized data, which may leak heap
contents and hurt compression if the blob is later compressed, like
done by shader cache. Clear it.
Signed-off-by: Grazvydas Ignotas <notasas@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Builtins are created once and allocated using their own private ralloc
context. When reparenting IR that includes builtins, we might be steal
bits of builtins. This is problematic because these builtins might now
be freed when the shader that includes then last is disposed. This
might also lead to inconsistent ralloc trees/lists if shaders are
created on multiple threads.
Rather than including builtins directly into a shader's IR, we should
include clones of them in the ralloc context of the shader that
requires them. This fixes double free issues we've been seeing when
running shader-db on a big multicore (72 threads) server.
v2: Also rename _mesa_glsl_find_builtin_function_by_name() to better
reflect how this function is used. (Ken)
v3: Rename ctx to mem_ctx (Ken)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reverts commit 0f60c6616e.
Piglit and all games tested so far seem to be working without
issue. This change will allow wide user testing and we can decided
before the next release if we need to turn it off again.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
The previous implementation was fine for GLSL which doesn't really have
a signed modulus/remainder. They just leave the behavior undefined
whenever either source is negative. However, in SPIR-V, there is a
defined behavior for negative arguments. This commit beefs up the pass
so that it handles both correctly. Tested using a hacked up version of
the Vulkan CTS test to get 64-bit support.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Previously, when q.subroutine was set to 1, a new subroutine
declaration was added to the AST, while 0 meant a subroutine
definition has been detected by the parser.
Thus, setting the q.subroutine flag in both situations is
obviously wrong because a new type identifier is added instead
of trying to match the declaration. To fix it up, introduce
ast_type_qualifier::is_subroutine_decl() to differentiate
declarations and definitions easily.
This fixes a regression with:
arb_shader_subroutine/compiler/direct-call.vert
Cc: Mark Janes <mark.a.janes@intel.com>
Fixes: be8aa76afd ("glsl: remove unecessary flags.q.subroutine_def")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100026
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
The algorithms used by this pass, especially for division, are heavily
based on the work Ian Romanick did for the similar int64 lowering pass
in the GLSL compiler.
v2: Properly handle vectors
v3: Get rid of log2_denom stuff. Since we're using bcsel, we do all the
calculations anyway and this is just extra instructions.
v4:
- Add back in the log2_denom stuff since it's needed for ensuring that
the shifts don't overflow.
- Rework the looping part of the pass to be easier to expand.
Reviewed-by: Matt Turner <mattst88@gmail.com>