This fixes the Android build after the move of builtin_stubs.cpp into
the builtin_compiler subdirectory. This patch is untested.
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Note this by itself is not enough to fix scons build -- it will fail
until you remove:
rm -rf build/*/glsl/builtin_compiler
because that node was a file before, but it will now be a directory.
This also means that bisecting across this change will require wiping
the build directory.
The builtin_compiler binary is used during the build process to generate
code for the builtin GLSL functions. Since this binary needs to be run
on the build host, it must not be cross-compiled.
This patch fixes the build system to compile a second version of the
source files and the builtin_compiler binary itself for the build
system. It does so by defining the CC_FOR_BUILD and CXX_FOR_BUILD
variables, which are searched for by the configure script and point to
the location of native C and C++ compilers.
In order for this to work properly, builtin_function.cpp is removed
from BUILT_SOURCES, otherwise the build system would try to generate it
before having had a chance to descend into the builtin_compiler
subdirectory. With the builtin_compiler and glsl_compiler now being
generated at different stages, the build instructions for glsl_compiler
can be simplified a bit.
Signed-off-by: Thierry Reding <thierry.reding@avionic-design.de>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
For Intel, expose it only if gen >= 4.
For Gallium, expose it only if PIPE_CAP_SM3 is advertised.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Global initializers using the ?: operator with at least one non-constant
operand generate ir_if statements. For example,
float foo = some_boolean ? 0.0 : 1.0;
becomes:
(declare (temporary) float conditional_tmp)
(if (var_ref some_boolean)
((assign (x) (var_ref conditional_tmp) (constant float (0.0))))
((assign (x) (var_ref conditional_tmp) (constant float (1.0)))))
This pattern is necessary because the second or third arguments could be
function calls, which create statements (not expressions).
The linker moves these global initializers into the main() function.
However, it incorrectly had an assertion that global initializer
statements were only assignments, calls, or temporary variable
declarations. As demonstrated above, they can be if statements too.
Other than the assertion, everything works fine. So remove it.
Fixes new Piglit test condition-08.vert, as well as an upcoming
game that will be released on Steam.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Previously, we used lookahead patterns to differentiate:
#define FOO(x) function macro
#define FOO (x) object macro
Unfortunately, our rule for function macros:
{HASH}define{HSPACE}+/{IDENTIFIER}"("
relies on infinite lookahead, and apparently triggers a Flex bug where
the generated code overflows a state buffer (see YY_STATE_BUF_SIZE).
There's no need to use infinite lookahead. We can simply change state,
match the identifier, and use a single character lookahead for the '('.
This apparently makes Flex not generate the giant state array, which
avoids the buffer overflow, and should be more efficient anyway.
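The single-character lookahead can be illustrated outside of Flex. What follows is a minimal sketch, not the glcpp lexer: classify_define and MacroKind are hypothetical names, and the input is assumed to be the text that starts at the macro name, just after "#define" and its horizontal whitespace.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

enum class MacroKind { Object, Function };

// Hypothetical helper (not part of glcpp). `rest` is assumed to start at
// the macro name, i.e. just after "#define" and its whitespace.
MacroKind classify_define(const std::string &rest)
{
    std::size_t i = 0;
    // Consume identifier characters. A real lexer would also reject a
    // leading digit; that check is omitted here.
    while (i < rest.size() &&
           (std::isalnum(static_cast<unsigned char>(rest[i])) || rest[i] == '_'))
        ++i;
    // One character of lookahead: '(' directly after the name means a
    // function-like macro, anything else means an object-like macro.
    if (i > 0 && i < rest.size() && rest[i] == '(')
        return MacroKind::Function;
    return MacroKind::Object;
}
```

The decision needs only the character at the end of the identifier, so no unbounded trailing context is required.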
Fixes piglit test 17000-consecutive-chars-identifier.frag.
NOTE: This is a candidate for every release branch ever.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Carl Worth <cworth@cworth.org>
When specifying per-target CFLAGS (e.g., ralloc_test_CFLAGS) AM_CFLAGS
are not used. AM_CPPFLAGS should be used for includes anyway.
Fixes a build problem since 41b14d125:
CC ralloc_test-ralloc.o
In file included from ../../../src/glsl/ralloc.c:42:0:
../../../src/glsl/ralloc.h:57:27: fatal error: main/compiler.h: No such file or directory
Acked-by: Paul Berry <stereotype441@gmail.com>
Catches problems such as (in the gles3 branch)
glcpp-parse.y: In function '_glcpp_parser_handle_version_declaration':
glcpp-parse.y:1990:39: warning: format '%lli' expects argument of type
'long long int', but argument 4 has type 'int' [-Wformat]
As a side-effect, remove ralloc.c's likely/unlikely macros and just use
the ones from main/compiler.h.
NOTE: This is a candidate for the release branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Port the 'glcpp: fix abuse of yylex' commit to Android.mk
Also, since the Android.*.mk are sourced in a global namespace,
the local-y-to-c-and-h is prefixed with the LOCAL_MODULE name.
The initial fix commit is 53d46bc787
There's also a bugzilla for this: 54947
Signed-off-by: Negreanu Marius Adrian <adrian.m.negreanu@intel.com>
Reviewed-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fixes this build error on Cygwin.
Explicit dependency `src/glsl/builtins/tools/texture_builtins.py' not
found, needed by target
`build/cygwin-x86-debug/glsl/builtin_function.cpp'.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
The anonymous namespace should keep these private classes to file scope,
preventing clashes with other symbols of the same name elsewhere.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
I can't see any reason this is global (unless for debugging)
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
No reason for this to be global from what I can see
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
With dricore, this symbol escapes into the namespace. It's too generic,
so we should prefix it with something just to be nice.
Should be applied to stable + 9.0
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
glcpp tried to work around yylex in its own way, but failed, so
do it properly.
This fixes another crash found after fixing the first crash.
This is a candidate for the 9.0 and stable branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
According to the GLSL 4.30 specification, this is a compile time error.
Earlier specifications don't specify a behavior, but since 0 and 1 are
the only valid indices for dual source blending, it makes sense to
generate the error.
Fixes (the fixed version of) piglit's layout-12.frag.
NOTE: This is a candidate for the 9.0 branch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
We don't fully process the builtin uniforms, but at least
num_uniform_components reflects reality now.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
When too many uniforms are used, the error will be caught in
check_resources (src/glsl/linker.cpp).
NOTE: This is a candidate for the 8.0 branch.
Signed-off-by: Stéphane Marchesin <marcheu@chromium.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: Benoit Jacob <bjacob@mozilla.com>
Commit 77a3efc6b9 broke the Android build, which sets its own value
for GLSL_SRCDIR before including Makefile.sources.
This patch moves the override to after the include. This works because
the GLSL_SRCDIR variable is only expanded later.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Like in src/mesa, use GLSL_BUILDDIR/GLSL_SRCDIR to unambiguously
distinguish between in-tree and generated files.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
We were only propagating it to the API when the variable was a matrix type,
but we were still tripping over it in lower_ubo_reference when it was set on a
vector.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We were getting the base offset of a vec2, not of a vec2[2] like the quoted
spec text says we should.
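The difference between the two base offsets can be sketched as follows. This assumes the layout in question is std140, which this message does not state explicitly, and the helper names are hypothetical rather than Mesa functions.

```cpp
#include <cassert>

// Assumed std140 rules: N = 4 bytes per scalar, vec2 aligns to 2N,
// vec3 and vec4 align to 4N, and an array rounds its element alignment
// up to that of a vec4 (16 bytes).
unsigned std140_base_alignment(unsigned vector_elements, bool is_array)
{
    const unsigned N = 4;
    unsigned align = (vector_elements == 3 ? 4u : vector_elements) * N;
    if (is_array && align < 16)
        align = 16;
    return align;
}

// Round `offset` up to the next multiple of `align` (a power of two).
unsigned align_offset(unsigned offset, unsigned align)
{
    return (offset + align - 1) & ~(align - 1);
}
```

Under these assumptions, a member that follows a single float lands at byte 8 if it is a vec2 but at byte 16 if it is a vec2[2].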
v2: Fix swapped then/else cases.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Previously, we were returning the index into the UniformBlocks of one of the
linked shaders, when it's supposed to be the program global index.
Fixes piglit getactiveuniformsiv-uniform_block_index.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
v2: Reduce the impenetrable code in emit_ubo_loads() by 23 lines by keeping
the ir_variable as the variable part of the offset from handle_rvalue(),
and track the constant offsets from that with a plain old integer value,
avoiding a bunch of temporary variables in the array and struct handling.
Also, fix file description doxygen.
v3: Fix a row vs col typo, and fix spelling in a comment.
Reviewed-by: Eric Anholt <eric@anholt.net>
For the UBO lowering pass, I want to see the whole dereference chain for
replacing, not the innermost ir_dereference_variable.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Drivers will probably want to be able to take UBO references in a
shader like:
uniform ubo1 {
float a;
float b;
float c;
float d;
}
void main() {
gl_FragColor = vec4(a, b, c, d);
}
and generate a single aligned vec4 load out of the UBO. For intel,
this involves recognizing the shared offset of the aligned loads and
CSEing them out. Obviously that involves breaking things down to
loads from an offset from a particular UBO first. Thus, the driver
doesn't want to see
variable_ref(ir_variable("a")),
and even more so does it not want to see
array_ref(record_ref(variable_ref(ir_variable("a")),
"field1"), variable_ref(ir_variable("i"))).
where a.field1[i] is a row_major matrix.
Instead, we're going to make a lowering pass to break UBO references
down to expressions that are obvious to codegen, and amenable to
merging through CSE.
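As a rough sketch of why the lowered form helps, assume the four floats of ubo1 sit at byte offsets 0, 4, 8 and 12 (an assumption about the layout, not something this message states). LoweredRef and both functions are hypothetical names, not the pass itself.

```cpp
#include <cassert>

// Hypothetical lowered form of a scalar UBO reference: which block, which
// 16-byte-aligned vec4 to load, and which component of it to read.
struct LoweredRef {
    unsigned block;
    unsigned vec4_byte_offset;
    unsigned component;
};

LoweredRef lower_float_ref(unsigned block, unsigned byte_offset)
{
    return LoweredRef{ block, byte_offset & ~15u, (byte_offset & 15u) / 4u };
}

// Two references can share one load when block and aligned offset match.
bool same_load(const LoweredRef &a, const LoweredRef &b)
{
    return a.block == b.block && a.vec4_byte_offset == b.vec4_byte_offset;
}
```

With references expressed this way, a, b, c and d all name the same aligned load and differ only in the component, which is the kind of redundancy CSE can remove.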
v2: Fix some partial thoughts in the ir_binop comment (review by Kenneth)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When converting var->location from pointing at the program's UniformBlocks to
pointing at the linked shader's UniformBlocks, I missed this change. It
usually worked out in the end because the two lists happen to be the same in
many testcases.
Fixes a valgrind complaint on
oglconform ubo-compile.cpp advanced.std140.2stage
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Before, the GLSL parser was getting rebuilt every time that scons was
run. The problem was scons was expecting a glsl_parser.hpp file but
we were generating a glsl_parser.h file.
Signed-off-by: Brian Paul <brianp@vmware.com>
Previously, we advertised the extension but the builtin functions
were enabled only for GLSL and not for ES.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=52003
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
I ended up having to add rallocing of the ast_type_qualifier in order
to avoid pulling in ast.h for glsl_parser_extras.h, because I wanted
to track an ast_type_qualifier in the state.
Fixes piglit ARB_uniform_buffer_object/row-major.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Yes, you get to say things like "layout(row_major, column_major)" and
get column major.
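The "last one wins" behaviour can be sketched like this. The function and enum names are hypothetical, and the column-major default is an assumption of the sketch.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class MatrixLayout { ColumnMajor, RowMajor };

// Hypothetical resolver: walk the layout qualifiers in source order and let
// each matrix-layout qualifier override whatever came before it.
MatrixLayout resolve_matrix_layout(const std::vector<std::string> &qualifiers)
{
    MatrixLayout layout = MatrixLayout::ColumnMajor; // assumed default
    for (const std::string &q : qualifiers) {
        if (q == "row_major")
            layout = MatrixLayout::RowMajor;
        else if (q == "column_major")
            layout = MatrixLayout::ColumnMajor;
    }
    return layout;
}
```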
Part of fixing piglit ARB_uniform_buffer_object/row_major.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The previous implementation required a flag in _mesa_glsl_parse_state
and line of code to initialize it for every version of the shading
language we intend to support. As we look to add 150, 330, 400, 410,
420, and beyond, this gets rather unwieldy.
This patch retains the switch statement (to reject, say, #version 111),
but removes all the bits. Code to check for ctx->API == API_OPENGL_CORE
could easily be added to the 110 and 120 cases to reject those.
v2: Use _mesa_is_desktop_gl to preserve the existing behavior in the
presence of the new API_OPENGL_CORE enumeration.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
It was using state->Const.GLSL_100ES, which is set if the driver
supports ARB_ES2_compatibility or we're in ES2 mode. Instead, it should
use state->language_version, as that represents the actual GLSL version
of the shader being compiled.
Since the correct logic is < 120 && !100, just make it == 110.
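A quick check of that simplification, assuming the only language versions below 120 are 100 and 110. The function names are hypothetical.

```cpp
#include <cassert>

// The condition as described in prose: older than 1.20 and not ES 1.00.
bool original_intent(unsigned version)
{
    return version < 120 && version != 100;
}

// The simplified condition.
bool simplified(unsigned version)
{
    return version == 110;
}

// Compare both forms over a few version numbers chosen for illustration.
bool checks_agree()
{
    const unsigned versions[] = { 100, 110, 120, 130 };
    for (unsigned v : versions) {
        if (original_intent(v) != simplified(v))
            return false;
    }
    return true;
}
```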
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Now that ir_quadop_vector exists, ir_last_binop and ir_last_opcode are
no longer the same. Only one place currently uses this enumeration, and
already handles ir_quadop_vector correctly.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Olivier Galibert <galibert@pobox.com>
It's more convenient to use shortcuts like glsl_type::bvec2_type than
the longwinded glsl_type::get_instance(GLSL_TYPE_BOOL, 2, 1).
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Olivier Galibert <galibert@pobox.com>
Otherwise, the preprocessor happily outputs
#line 2 4 <your next line of code>
and the main compiler gets horribly confused and fails to compile.
This is not the right solution (line numbers in error messages will
likely be off-by-one in certain circumstances), but until Carl comes
up with a proper fix, this gets programs running again.
Fixes regressions in Regnum Online, Overgrowth, Piglit, and others since
commit aac78ce823.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51802
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51506
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=41152
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
The OpenGL(R) ES Shading Language
Version 1.00 Revision 17 (12 May, 2009)
> 4.6.1 The Invariant Qualifier
> ... To force all output variables to be invariant, use the pragma
> #pragma STDGL invariant(all)
Signed-off-by: Oliver McFadden <oliver.mcfadden@linux.intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Mesa misses a few checks when compiling on a uclibc system
which cause it to fall back on glibc-ism. This patch
addresses those issues.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Anthony G. Basile <blueness@gentoo.org>
At this point in the linking, we've totally lost track of the struct
gl_uniform_buffer that this pointed to in the original unlinked
shader, so we do a nasty n^2 walk to find the new one based on the
variable name.
Note that these point into the shader's list of gl_uniform_buffers,
not the linked program's.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We're going to need this structure to cross-validate the uniform
blocks between shader stages, since unused ir_variables might get
dropped. It's also the place we store the RowMajor qualifier, which
is not part of the GLSL type (since that would cause a bunch of type
equality checks to fail).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Previously, the code for setting this flag for GLSL programs was
duplicated in three places: brw_link_shader(), glsl_to_tgsi_visitor,
and ir_to_mesa_visitor. In addition to the unnecessary duplication,
there was a performance problem on i965: brw_link_shader() set the
flag before doing its final round of optimizations, which meant that
if the optimizations managed to eliminate all the discard operations,
the flag would still be set, resulting (at least in theory) in slower
performance.
This patch consolidates all of the code that sets UsesKill for GLSL
programs into do_set_program_inouts(), which already is doing a
similar job for UsesDFdy, and which occurs after i965's final round of
optimizations.
Non-GLSL programs (ARB programs and the state tracker's glBitmap
program) are unaffected.
Reviewed-by: Eric Anholt <eric@anholt.net>
Presumably the function didn't exist when we wrote this code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This patch updates the ir_set_program_inouts_visitor so that it also
sets gl_fragment_program::UsesDFdy.
This is a bit of a hack (since dFdy() isn't an input or an output),
but there's no other obvious visitor to squeeze this functionality
into, and it would be silly to create a brand new visitor just for
this purpose.
v2: use local 'fprog' var to avoid repeated casting.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Helps spotting and removing the obsolete generated files, which otherwise break
the build.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Commit 68e04cc6 was tested using automake-1.11. Unfortunately, automake-1.12
made a "slightly backward-incompatible change" in the use of yacc with C++, and
for a .yy file, the generated header file is now named .hh, not .h.
To work with both, write our own rule for running yacc, which generates a
header file named .h, rather than using automake's rule.
Also, remove things from BUILT_SOURCES which don't need to be there.
Also, update EXCLUDE rules in doxygen/glsl.doxy, for change of generated files
from .cpp -> .cc, and glsl_lexer.h has never existed.
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
* "configure substitutions are not allowed in _SOURCES variables" in automake, so instead of
MESA_ASM_FILES, use some AM_CONDITIONALS to choose which architecture's asm sources are used
in libmesa_la_SOURCES. (Can't remove MESA_ASM_FILES autoconf variable as it's still used in
sources.mak)
* Update to link with the .la file in other Makefile.am files, and make a link to the
.a file for the convenience of other Makefiles which have not yet been converted to automake
v2: Remove stray -static from LDFLAGS
v3: Remove .a compatibility link on clean
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
v2: Use AM_V_GEN to silence generated code rules. Add BUILT_SOURCES to CLEANFILES
v3:
- Fix an accidental // in a path
- Use automake make rules for lex/yacc rather than writing our own
- Update .gitignore appropriately
- Build a libglcpp convenience library rather than awkwardly including
the files in libglsl and delegating the generation
- Remove libglsl.a compatibility link on clean
v4:
- Automake's rules for lex/yacc make .cc if source is .ll or .yy, and apparently we
must use those extensions "because of scons", so update everywhere glsl_parser.cpp
-> glsl_parser.cc and glsl_lexer.cpp -> glsl_lexer.cc. This fixes 'make tarballs'
and building with dricore enabled.
Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Matt Turner <mattst88@gmail.com>
This swizzles away unwanted components, while preserving the order of
the ones that remain.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
I needed to compute logs and square roots in a patch I was working on,
and wanted to use the convenient interface. We already have a similar
constructor for binops; adding one for unops seems reasonable.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
I ran into this while trying to create a TXS query, which doesn't have a
coordinate. Since it didn't get initialized to NULL, a bunch of
visitors tried to access it and crashed.
Most of the time, this won't be a problem, but it's just a good idea.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
This doesn't do anything with the uniform block declarations yet, so
usage of those uniforms finds them to be undeclared.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
I've been trying to derive from this for UBO support, and the slightly
obfuscated types were putting me over the edge.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The got_one variable was set iff one of the bits in flags.i was set.
v2: Fix incorrect dropping of the ARB_conservative_depth warning.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (v1)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Commit 0c005bd7 intended to make ir_loop_jump::mode public, but also
accidentally added a new pointer to the enclosing loop. Furthermore, it
tried to initialize the new field by adding "this->loop = loop;" to the
constructor, but since there is no loop parameter, this only initialized
the field to itself---so it will likely be a garbage pointer.
A lot of code, such as lower_jumps, allocates new loop jumps without
setting this field appropriately, so any uses would probably just crash.
Thankfully, there were none, so we can just delete the field.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=51574
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Previously, we were counting gl_FrontFacing, gl_FragCoord and gl_PointCoord
against the limit of varying variables. This prevented some valid shaders
from linking.
The other potential solution to this is to have the driver advertise
more varying vars or set the GLSLSkipStrictMaxVaryingLimitCheck flag.
But the above-mentioned variables aren't conventional varying attributes
so it doesn't seem right to count them.
Reviewed-by: Eric Anholt <eric@anholt.net>
The most recent commit adds support for comments and macro expansion
on #line directives. Add testing to verify the new features.
Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The GLSL specification requires that #line directives be interpreted
after macro expansion. Our existing implementation of #line macros in
the lexer prevents conformance on this point.
Moving the handling of #line from the lexer to the parser gives us the
macro expansion we need. An additional benefit is that the
preprocessor also now supports comments on the same line as #line
directives.
Finally, the preprocessor now emits the (fully-macro-expanded) #line
directives into the output. This allows the full GLSL compiler to also
see and interpret these directives so it can also generate correct
line numbers in error messages.
Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This function is currently used only in the expansion of #if lines,
but we will soon be using it more generally (for the expansion of
#line directives as well). Give it a more generic name
(_glcpp_parser_expand_and_lex_from) and some more documentation.
Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Commit b823b99ec0 switched from using
functions such as ralloc_asprintf and ralloc_strcat to
ralloc_asprintf_rewrite_tail. This change maintains the string's
length as a parameter that is updated by the ralloc functions (rather
than recomputing it with strlen over and over).
However, the change failed to update two locations (glcpp_error and
glcpp_warning), with the result that the string's length wasn't
updated by these calls. Then, subsequent calls to other
ralloc_asprintf_rewrite_tail would overwrite the text appended by
glcpp_error.
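The failure mode can be modelled with a plain std::string. This is only a sketch: rewrite_tail is a hypothetical stand-in for the tail-rewriting ralloc call, not its real signature.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a "rewrite tail" append: discard anything past
// `tail`, append `text`, and record the new length in `tail`.
void rewrite_tail(std::string &buf, std::size_t &tail, const std::string &text)
{
    buf.resize(tail);
    buf += text;
    tail = buf.size();
}

// Models the bug: text is appended but the tracked length is not updated.
void append_without_update(std::string &buf, const std::string &text)
{
    buf += text;
}

std::string log_with_stale_length()
{
    std::string buf;
    std::size_t tail = 0;
    rewrite_tail(buf, tail, "first ");
    append_without_update(buf, "0:1(1): error: ");
    rewrite_tail(buf, tail, "message"); // overwrites the error prefix
    return buf;
}
```

In this model the text written by the non-updating append disappears from the final log, which matches the missing line numbers described above.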
This commit fixes the two missing updates, and restores line numbers
to the output of glcpp error messages, (as noticed by a glcpp unit
test case that has been failing since the above-mentioned commit).
Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
A strict reading of the GLSL specification would have this be an
error, but we've received reports from users who expect the
preprocessor to interpret undefined macros as 0. This is the standard
behavior of the preprocessor for C, and according to these user
reports is also the behavior of other OpenGL implementations.
So here's one of those cases where we can make our users happier by
ignoring the specification. And it's hard to imagine users who really,
really want to see an error for this case.
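The C behaviour referred to above can be seen with the C/C++ preprocessor itself. In this small sketch the macro name is made up and is deliberately never defined.

```cpp
#include <cassert>

// GLCPP_SKETCH_UNDEFINED_MACRO is intentionally not defined anywhere.
// In a C or C++ #if expression an undefined identifier evaluates as 0,
// so the #else branch is the one that gets compiled.
int branch_taken()
{
#if GLCPP_SKETCH_UNDEFINED_MACRO
    return 1;
#else
    return 0;
#endif
}
```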
The two affected test cases are updated to reflect the new behavior.
Signed-off-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This bitfield tells the back-ends which of a fragment shader's inputs
require centroid interpolation. It is only set for GLSL fragment
shaders, since assembly fragment shaders don't support centroid
interpolation.
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes this build failure on Solaris.
Compiling build/sunos-debug/glsl/glcpp/glcpp-lex.c ...
"src/glsl/glcpp/glcpp-lex.l", line 30: cannot find include file: "glcpp-parse.h"
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Previously, we performed conversions from float->uint by a two step
process: float->int->uint. However, on platforms that use saturating
conversions (e.g. i965), this didn't work, because if the source value
was larger than the maximum representable int (0x7fffffff), then
converting it to an int would clamp it to 0x7fffffff.
This patch just adds the new opcode; further patches will adapt
optimization passes and back-ends to use it, and then finally the
ast_to_hir logic will be modified to emit the new opcode.
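The clamping problem can be reproduced in software. This is a sketch that models saturating conversions with hypothetical helper functions; treating NaN as 0 is an assumption of the sketch, not a statement about any particular hardware.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Software model of a saturating float->int conversion.
int32_t saturating_f2i(float f)
{
    if (std::isnan(f)) return 0;               // assumption: NaN becomes 0
    if (f >= 2147483648.0f) return INT32_MAX;  // clamp at 0x7fffffff
    if (f < -2147483648.0f) return INT32_MIN;
    return static_cast<int32_t>(f);
}

// Software model of a direct saturating float->uint conversion.
uint32_t saturating_f2u(float f)
{
    if (std::isnan(f) || f <= 0.0f) return 0;
    if (f >= 4294967296.0f) return UINT32_MAX;
    return static_cast<uint32_t>(f);
}

// The old two-step route: float -> int -> uint.
uint32_t two_step_f2u(float f)
{
    return static_cast<uint32_t>(saturating_f2i(f));
}
```

For an input of 3000000000.0f the two-step route is stuck at 0x7fffffff, while the direct conversion returns 3000000000.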
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
In single precision, 1.5707963 becomes 1.5707962513 which is too
small. However, 1.5707964 becomes 1.5707963705 which is just right.
The value 1.5707964 is already used in asin.ir.
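The rounding can be checked directly. A small sketch, where kHalfPi is pi/2 written out to double precision and the function names are hypothetical:

```cpp
#include <cassert>
#include <cmath>

const double kHalfPi = 1.5707963267948966; // pi/2 rounded to double

// The two single-precision literals discussed above, widened to double so
// they can be compared against kHalfPi without further rounding.
double low_literal()  { return static_cast<double>(1.5707963f); }
double high_literal() { return static_cast<double>(1.5707964f); }

// True when the two literals are neighbouring single-precision values.
bool literals_are_adjacent_floats()
{
    return std::nextafter(1.5707963f, 2.0f) == 1.5707964f;
}
```

The two literals round to neighbouring floats on either side of pi/2, and 1.5707964 is the one that pi/2 itself rounds to in single precision.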
NOTE: This is a candidate for stable release branches.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Determines whether it's a basis vector, i.e., a vector with one element
equal to 1 and all other elements equal to 0.
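A minimal sketch of such a check on a plain float vector; this is a hypothetical free function, not the IR method itself:

```cpp
#include <cassert>
#include <vector>

// Returns true when exactly one element is 1 and every other element is 0.
bool is_basis(const std::vector<float> &v)
{
    unsigned ones = 0;
    for (float c : v) {
        if (c == 1.0f)
            ++ones;
        else if (c != 0.0f)
            return false; // any other value disqualifies the vector
    }
    return ones == 1;
}
```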
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Because these classes are used entirely from their own source files
and not from separate DSOs, the linker gets to produce massively less
code. This cuts about 13k of text in the libdricore case. In the
non-libdricore case, the additional linkage information allows the
compiler to inline some code, so libglsl.a size actually increases by
about 300 bytes.
For a dricore build, improves shader_runner runtime on
glsl-fs-copy-propagation-texcoords-1 by 0.21% +/- 0.03% (n=353574,
outliers removed). No statistically significant difference with n=322
on glslparsertest on a yofrankie shader intended to test compiler
performance.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Now we have just one library of "all of Mesa core" instead of both
libdricore and libglsl that drivers link against.
I did this change in a sort of nonrecursive make fashion: the
generated files are still produced in the non-automake build, like the
rest of dricore, but the GLSL files are stuffed into libdricore
without building a convenience library in src/glsl (even though we
could now). This would make a bit more sense if glsl was just another
dir under src/mesa, because right now I had to contort the prefix
variable name to look another ../ level up.
That adds support for activating the extension. It doesn't actually
*do* anything yet, of course.
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
While ~loop_state() is already freeing the loop_variable_state objects
via ralloc_free(this->mem_ctx), the ~loop_variable_state() destructor
was never getting called, so the hash table inside loop_variable_state
was never getting destroyed.
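The underlying issue is that releasing raw memory does not run C++ destructors. A sketch using malloc/free and placement new as stand-ins for the ralloc context; State and the counter are hypothetical:

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Stand-in for an object that owns a resource (here just a counter bump).
struct State {
    int *live;
    explicit State(int *l) : live(l) { ++*live; } // "creates the hash table"
    ~State() { --*live; }                         // "destroys the hash table"
};

// Returns how many resources are still alive after the memory is released.
int tables_left_after_free(bool call_destructor)
{
    int live = 0;
    void *mem = std::malloc(sizeof(State));
    if (!mem)
        return -1;
    State *s = new (mem) State(&live);
    if (call_destructor)
        s->~State();   // must be invoked explicitly before freeing
    std::free(mem);    // frees the bytes only; runs no destructor
    return live;
}
```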
Fixes a memory leak in any shader with loops.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
We were incorrectly assuming that the coordinate's dimensionality is
equal to the gradient's dimensionality. For array types, the coordinate
has one more component.
Fixes 12 subcases of oglconform's glsl-bif-tex-grad test.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
These look like debug messages from the switch-statement development.
NOTE: This is a candidate for the 8.0 release branch.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Some distributions (like Arch Linux) make /usr/bin/python Python 3,
rather than Python 2. Since compare_ir uses /usr/bin/env python,
such systems will fail to run optimization-test, causing 'make check' to
always fail.
Automake's TESTS_ENVIRONMENT variable provides a mechanism to run
programs or set environment variables in the test environment.
Ideally, I think we would want to use AM_TESTS_ENVIRONMENT, since
TESTS_ENVIRONMENT is supposed to be user-overridable. However, it isn't
supported using the default/serial test runner.
Fixes 'make check' on Arch Linux and Gentoo.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Acked-by: Matt Turner <mattst88@gmail.com>
I started writing unit tests for a new piece of code, and discovered
they all failed due to a bug in ralloc. Clearly it needs a test suite.
v2: Rename to 'ralloc-test' and fix copyright date. (idr review)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
If an object is allocated out of the NULL context, info->parent will be
NULL. Using the PTR_FROM_HEADER macro would be incorrect: it would say
that ralloc_parent(ralloc_context(NULL)) == sizeof(ralloc_header).
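A toy model of the problem, assuming a header-prefixed allocation scheme. This is not ralloc's real layout or API, and all names are hypothetical:

```cpp
#include <cassert>
#include <cstdlib>

// Each allocation is preceded by a header that records its parent's header.
struct Header { Header *parent; };

void *ptr_from_header(Header *h)
{
    return reinterpret_cast<char *>(h) + sizeof(Header);
}

Header *header_from_ptr(void *p)
{
    return reinterpret_cast<Header *>(static_cast<char *>(p) - sizeof(Header));
}

void *toy_alloc(void *parent)
{
    Header *h = static_cast<Header *>(std::malloc(sizeof(Header) + 16));
    if (!h)
        return nullptr;
    h->parent = parent ? header_from_ptr(parent) : nullptr;
    return ptr_from_header(h);
}

void toy_free(void *p)
{
    if (p)
        std::free(header_from_ptr(p));
}

void *toy_parent(void *p)
{
    Header *info = header_from_ptr(p);
    // Guard the NULL case: converting a null header would yield a bogus
    // non-null address offset by sizeof(Header).
    return info->parent ? ptr_from_header(info->parent) : nullptr;
}

// Exercises both the NULL-parent case and the ordinary case.
bool null_parent_round_trip()
{
    void *root = toy_alloc(nullptr);
    void *child = toy_alloc(root);
    bool ok = root && child &&
              toy_parent(root) == nullptr && toy_parent(child) == root;
    toy_free(child);
    toy_free(root);
    return ok;
}
```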
Fixes the new "null_parent" unit test.
NOTE: This is a candidate for the 7.9, 7.10, 7.11, and 8.0 branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
When considering which components of a variable were killed by an
assignment, constant propagation would previously just use the write
mask of the assignment. This worked if the LHS of the assignment was
simple, e.g.:
v.xy = ...; // (assign (xy) (var_ref v) ...)
But it did the wrong thing if the LHS of the assignment involved an
array indexing operator, since in this case the write mask is always
(x):
v[i] = ...; // (assign (x) (deref_array (var_ref v) (var_ref i)) ...)
In general, we can't predict which vector component will be selected
by array indexing, so the only safe thing to do in this case is to
kill the entire variable.
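The rule can be sketched as a small helper. kill_mask is a hypothetical name, and the component mask uses bit 0 for x through bit 3 for w as an assumption of the sketch:

```cpp
#include <cassert>

// Returns the set of components of the variable that an assignment kills.
unsigned kill_mask(unsigned write_mask, bool lhs_is_array_deref,
                   unsigned vector_elements)
{
    // With a dynamically indexed LHS the written component is unknown, so
    // conservatively kill every component of the variable.
    if (lhs_is_array_deref)
        return (1u << vector_elements) - 1u;
    // Otherwise the write mask says exactly which components are written.
    return write_mask;
}
```

For a vec4, v.xy = ... kills 0x3, while v[i] = ... kills 0xf even though its write mask is only (x).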
Fixes piglit tests {fs,vs}-vector-indexing-kills-all-channels.shader_test.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
v2: Put unit tests in src/glsl/tests rather than tests/glsl.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
The spec requires that samplers be initialized to 0. Since this
differs from the 1-to-1 mapping of samplers to texture units assumed
by ARB assembly shaders (and the gl_program structure), be sure to
propagate this data from the gl_shader_program to the gl_program.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
CC: Vadim Girlin <vadimgirlin@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49088
v2: Fix handling of arrays-of-structure. Thanks to Eric Anholt for
pointing this out.
v3: Minor comment change based on feedback from Ken.
Fixes piglit glsl-1.20/execution/uniform-initializer/fs-structure-array
and glsl-1.20/execution/uniform-initializer/vs-structure-array.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
It's an implied argument, and I don't think being explicit about it
helps.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The comment quotes spec saying that only scalar integers are allowed,
but we only checked for integer.
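A minimal sketch of the stricter check, using a hypothetical ToyType rather than the compiler's real type class:

```cpp
#include <cassert>

// Hypothetical stand-ins for the compiler's type description.
enum class ToyBase { Int, Uint, Float, Bool };

struct ToyType {
    ToyBase base;
    unsigned vector_elements;   // 1 for scalars, 2..4 for vectors
};

// A switch expression must be an integer AND a scalar. Checking only
// the base type would wrongly accept something like ivec2.
bool is_valid_switch_type(const ToyType &type)
{
    assert(type.vector_elements >= 1 && type.vector_elements <= 4);
    const bool is_integer =
        type.base == ToyBase::Int || type.base == ToyBase::Uint;
    return is_integer && type.vector_elements == 1;
}
```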
Fixes piglit switch-expression-const-ivec2.vert
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
I managed to completely trash it in 22d81f15.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Total instructions: 261582 -> 261316
135/2147 programs affected (6.3%)
36752 -> 36486 instructions in affected programs (0.7% reduction)
This excludes a tropics shader that now gets 16-wide mode and throws
off the numbers. 5 shaders are hurt: two extra MOVs in 4 tropics
shaders it looks like because we don't split register names according
to independent webs, and one gstreamer shader where it looks like
try_rewrite_rhs_to_dst() is falling on its face.
This should also help avoid a regression in VSes from idr's ARB
programs to GLSL work.
Previously, I tried implementing this in the i965 driver, but did so
in a way that violated the intent of the spec, and broke Tropics.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This points to the object with the function body, allowing us to map
from a built-in prototype to the actual body with IR code to execute.
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
- copy_masked_offset copies part of a constant into another,
assign-like.
- copy_offset copies a constant into (a subset of) another,
funcall-return like.
These methods are to be used to trace through assignments and function
calls when computing a constant expression.
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
The method is used to get a reference to an ir_constant * within the
context of evaluating an assignment when calculating a
constant_expression_value.
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net> [v1]
We were looping over all the vector components, but only dealing with
the first one. This was masked by the fact that constant expression
handling on built-ins went through custom code for the lessThan()
/function/ rather than the ir_binop_less expression operator.
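The class of bug looks roughly like this. fold_less() is a hypothetical helper working on plain arrays, not the actual constant folding code:

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Component-wise less-than over the first `components` entries.
// The mistake described above corresponds to writing a[0] < b[0]
// inside the loop, which loops over every component but only ever
// looks at the first one.
std::array<bool, 4> fold_less(const std::array<float, 4> &a,
                              const std::array<float, 4> &b,
                              std::size_t components)
{
    assert(components <= 4);
    std::array<bool, 4> result{};   // unused components stay false
    for (std::size_t c = 0; c < components; c++)
        result[c] = a[c] < b[c];
    return result;
}
```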
NOTE: This is a candidate for all release branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Olivier Galibert <galibert@pobox.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
When doing the var->assigned change in
f2475ca424, I overzealously indented the
second block of code into the "if (var)" test. Revert these blocks to
the way they were before, just taking advantage of "var" to avoid
re-calling variable_referenced().
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49066
I only considered var->assigned for FragColor and FragData, but
ignored when it was false for out vars. Fixes piglit
write-gl_FragColor-and-not-user-output.frag
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=49068
The index is also used for GL_ARB_blend_func_extended. Cloning in
i965 was dropping a non-ARB_explicit_attrib_location index.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Fix uninitialized scalar field defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fix uninitialized pointer field defect reported by Coverity.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is taken from the ogl-math project, with Inverse renamed to adj
(since it's not actually the inverse), transposed, and our types
plugged in. There are potential CSE opportunities in this code
(particularly for hardware with RCP but not DIV), but we should be
doing CSE anyway, so don't hand-optimize.
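To show the adj/determinant relationship on the smallest case, here is a 2x2 sketch. Mat2 and the helper names are assumptions made for illustration; the actual built-ins are written in GLSL IR and cover larger matrices:

```cpp
#include <array>
#include <cassert>

using Mat2 = std::array<std::array<double, 2>, 2>;   // toy row-major type

// Adjugate: not the inverse by itself, only inverse * determinant.
Mat2 adj(const Mat2 &m)
{
    Mat2 a{};
    a[0][0] =  m[1][1];
    a[0][1] = -m[0][1];
    a[1][0] = -m[1][0];
    a[1][1] =  m[0][0];
    return a;
}

double det(const Mat2 &m)
{
    return m[0][0] * m[1][1] - m[0][1] * m[1][0];
}

// inverse(m) = adj(m) / det(m); no hand-written CSE, just the formula.
Mat2 inverse(const Mat2 &m)
{
    const double d = det(m);
    assert(d != 0.0);
    Mat2 result = adj(m);
    for (auto &row : result)
        for (double &value : row)
            value /= d;
    return result;
}
```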
Fixes piglit inverse tests.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
This takes advantage of the builtin compiler to generate IR into a
string, the same way we read GLSL for function prototypes for our
profiles.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
I keep getting lost in the Makefile trying to figure out what to edit
to work on builtin_compiler or glsl_compiler.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We were checking for these at link time previously, which is not as
early as mandated, and would actually fail to detect conflicting
writes if dead code removal removed some writes.
Fixes failures in piglit
glsl-*/compiler/fragment-outputs/write-gl_Frag*
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This will be used for some compile-and-link-time error checking, where
currently we've been doing error checking only at link time.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This runs optimization-test and produces the usual automake test
output, which may be interesting to automated build systems.
This doesn't convert the tests to be individually exposed to the
automake runner, because automake doesn't like wildcards (due to being
nonportable in make, not that we care).
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is the reason the declaration member existed in the reference
visitor, but I didn't copy the code from structure splitting that
avoided setting it.
This wasn't currently a problem, because we don't allow splitting of
in/out variables. But that would be nice to change some day.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This was carried over from structure splitting, without thinking about
whether the name still made sense in this context.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Vinson reported that we failed to initialize this, which would lead to
all kinds of crashes if we actually used it. Since we don't use it,
we may as well just delete the broken code.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Deletes a lot of pointless duplication, as well as some run-time effort.
Conveniently, GLSL 1.40 no longer needs a .vert variant, since it
doesn't define any built-ins specific to the vertex shader stage.
ARB_texture_rectangle and OES_EGL_image_external also only need a single
profile, since the .vert and .frag variants were identical.
I didn't bother with EXT_texture_array and OES_texture_3D because
they're so tiny that the savings would be minuscule.
Cuts the generated builtin_function.cpp from 1.7MB to 1.0MB (41%).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
The built-in subsystem uses "profiles," or GLSL shaders containing
prototypes for all built-ins supported within a particular language
version (or extension) and shader stage.
Since profiles were stage-specific, we had to cut and paste almost all
the prototypes between (e.g.) 110.vert and 110.frag. Naturally, this
led to sundry cut and paste bugs, where someone fixed an issue in .frag
but neglected to update .vert, or vice-versa. Geometry shaders would
have only made this worse.
This patch introduces support for a new '.glsl' profile suffix which
contains prototypes common to all shader stages. The existing '.frag'
and '.vert' profiles need only contain the few stage-specific built-ins.
Not only does this remove duplication, it makes built-in setup slightly
faster: we don't need to re-read the common prototypes and function
bodies for both the vertex and fragment shader stage.
Internally, this was trivial. We already create a list of gl_shader
objects to search through for built-ins: one for the core language
version/stage, and additional shaders for any extensions in use. This
patch simply adds another shader to the list: core/common, core/stage,
and extensions.
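The resulting search order can be sketched as below. builtin_search_list() and its string-based interface are hypothetical; the real code builds a list of gl_shader objects rather than file names:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the lookup order: core/common, core/stage, then extensions.
// `stage_suffix` is "vert" or "frag"; `extension_profiles` are passed
// through unchanged (an assumption made to keep the sketch short).
std::vector<std::string>
builtin_search_list(const std::string &version,
                    const std::string &stage_suffix,
                    const std::vector<std::string> &extension_profiles)
{
    assert(!version.empty() && !stage_suffix.empty());
    std::vector<std::string> list;
    list.push_back(version + ".glsl");              // shared by all stages
    list.push_back(version + "." + stage_suffix);   // stage-specific extras
    for (const std::string &profile : extension_profiles)
        list.push_back(profile);
    return list;
}
```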
The next patch will update the profiles to remove the duplication.
It's separated out purely to make review easier.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
These ought to be treated as 'any stage', but for now, they're just
treated as vertex shaders.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Acked-by: Ian Romanick <ian.d.romanick@intel.com>
The GLSL 1.30 -> 4.10 specs all erroneously say "vec2" for a few
overloads of textureProjGradOffset, while most overloads and all other
texturing functions use ivec types.
The GLSL 4.20 specification corrects these to "ivec2", but doesn't
mention this as being a conscious change in behavior. Nor does the
ARB_shading_language_420pack extension. So presumably it was a typo.
At any rate, our builtin functions all use ivec already, so the fact
that these prototypes use plain vecs will only lead to applications
dying in a fire when trying to use them.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
This reverts commit 4ec449a6ed.
I meant to not push this one. Review found that a link error is not
mandated: it should link, but you get undefined rendering if you rely
on a missing stage.
page 42/55 section 2.11 "Vertex Shaders":
"If the program object has no vertex shader, or no program object
is currently in use, the results of vertex shader execution are
undefined."
(and similar for page 160/173 section 3.9 "Fragment Shaders" for FS,
and page 45/58 section 2.11.2 "Program Objects" for program being 0)
It turns out the commit was broken anyway, because it was missing a
"goto done", so linkstatus got smashed back to true later and the
error just showed up as a warning in the infolog.
Fixes the new piglit texelFetch() tests on these. Note that the rest
of the new functions are not tested (same as the non-2DRect versions
of most of them).
The non-integer versions were already reserved in 1.30, but apparently
these were forgotten.
Fixes piglit glsl-1.40/compiler/reserved/
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cuts 8/1068 instructions from glyphy's fragment shaders on i965.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This lets us significantly shorten p->instructions->push_tail(ir), and
will be used in a few more places.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Now we can fold a bunch of our expression setup in ff_fragment_shader
into single-line, parseable commits.
v2: Make it actually work. I wasn't setting num_components in the
mask structure, and not setting up a mask structure is way easier.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Having to explicitly dereference is irritating and bloats the code,
when the compiler can detect and do the right thing.
v2: Use a little shim class to produce the automatic dereference
generation at compile time as opposed to runtime, while also
allowing compile-time type checking.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The C++ constructors with placement new, while functional, are
extremely verbose, leading to generation of simple GLSL IR expressions
like (a * b + c * d) expanding to many lines of code and using lots of
temporary variables. By creating a new ir_builder.h that puts simple
generators in our namespace and taking advantage of ralloc_parent(),
we can generate much more compact code, at a minor runtime cost.
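The flavor of the builder approach can be shown with a toy expression tree. Everything here (Expr, constant(), add(), mul(), evaluate()) is a hypothetical stand-in; the real helpers operate on GLSL IR nodes and allocate with ralloc rather than shared_ptr:

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Toy expression node, standing in for the real IR classes.
struct Expr {
    enum Op { Constant, Add, Mul };
    Op op;
    double value;
    std::shared_ptr<Expr> lhs;
    std::shared_ptr<Expr> rhs;
};
using ExprPtr = std::shared_ptr<Expr>;

// Small generator functions hide the verbose construction.
ExprPtr constant(double v)
{
    return std::make_shared<Expr>(Expr{Expr::Constant, v, nullptr, nullptr});
}

ExprPtr add(ExprPtr a, ExprPtr b)
{
    return std::make_shared<Expr>(Expr{Expr::Add, 0.0, std::move(a), std::move(b)});
}

ExprPtr mul(ExprPtr a, ExprPtr b)
{
    return std::make_shared<Expr>(Expr{Expr::Mul, 0.0, std::move(a), std::move(b)});
}

// Evaluator used only to check that the tree was built as intended.
double evaluate(const ExprPtr &e)
{
    assert(e != nullptr);
    switch (e->op) {
    case Expr::Constant: return e->value;
    case Expr::Add:      return evaluate(e->lhs) + evaluate(e->rhs);
    case Expr::Mul:      return evaluate(e->lhs) * evaluate(e->rhs);
    }
    return 0.0;
}
```

With such helpers, (a * b + c * d) becomes a single line: add(mul(a, b), mul(c, d)).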
v2: Replace ir_instruction usage with just ir_rvalue.
v3: Drop remaining missed as_rvalue() in v2.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This adds index support to the GLSL compiler.
I'm not 100% sure of my approach here, esp without how output ordering
happens wrt location, index pairs, in the "mark" function.
Since current hw doesn't ever have a location > 0 with an index > 0,
we don't have to work out if the output ordering the hw requires is
location, index, location, index or location, location, index, index.
But we have no hw to know, so punt on it for now.
v2: index requires layout - catch and error
setup explicit index properly.
v3: drop idx_offset stuff, assume index follow location
Signed-off-by: Dave Airlie <airlied@redhat.com>
Add implementations of the two API functions,
Add a new string-to-uint mapping for index bindings
Add the blending mode validation for SRC1 + SRC_ALPHA_SATURATE
Add get for MAX_DUAL_SOURCE_DRAW_BUFFERS
v2:
Add check in valid_to_render to address case in spec ERRORS.
v3:
Add index to ir.h so this patch compiles on its own
fixup comment
v4: fixup Brian's comments
The GLSL patch will setup the indices.
Signed-off-by: Dave Airlie <airlied@redhat.com>
This should fit in well with our lower_mat_op_to_vec code: now, in
addition to having expressions on each column of a matrix, we also
split the columns to separate variables so they can be tracked
individually by the copy propagation, dead code, and other passes.
This optimizes out some more code generation in unigine and gstreamer
shaders.
Total instructions: 269342 -> 269270
14/2148 programs affected (0.7%)
2226 -> 2154 instructions in affected programs (3.2% reduction)
I've had this code lying around almost done for a long time. The
idea is like opt_structure_splitting, that we've got a bunch of
transforms at the GLSL IR level that only understand scalars and
vectors, which just skip complicated dereferences. While driver
backends may manage some optimization after they split matrices up
themselves, it would be better to bring all of our optimization to
bear on the problem.
While I wasn't expecting changes quite yet, a few programs end up
winning: a gstreamer convolution shader, and the Humus dynamic
branching demo:
Total instructions: 269430 -> 269342
3/2148 programs affected (0.1%)
1498 -> 1410 instructions in affected programs (5.9% reduction)
Use the hash of the variable name instead of the pointer value.
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fix texelFetch(sampler2DRect) and textureSize(samplerBuffer)
generation to not reference a LOD at the same time because it's easier
than not fixing it.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The samplerBuffer type will be undefined in !glsl 1.40, and the
keyword is marked as reserved. The [iu]samplerBuffer types are not
marked as reserved pre-1.40, so they don't have separate tokens and
fall through to normal type handling.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We have lexer recognition of a bunch of our types based on the
handling. This code was mapping those recognized tokens to an enum
and then to a string of their name. Just drop the enums and provide
the string directly in the parser.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Nothing actually relied on them being mutable, and there was at least
one cast which discarded const qualifiers. The next patch would have
introduced many more.
Casting away const qualifiers should be avoided if at all possible.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Variables have types, expression trees have types, but statements don't.
Rather than have a nonsensical field that stays NULL in the base class,
just move it to where it makes sense.
Fix up a few places that lazily used ir_instruction even though they
actually knew the particular subclass.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Previously, set_callee() performed some assertions about the type of the
ir_call; protecting the bare pointer ensured these checks would be run.
However, ir_call no longer has a type, so the getter and setter methods
don't actually do anything useful. Remove them in favor of accessing
callee directly, as is done with most other fields in our IR.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Aside from ir_call, our IR is cleanly split into two classes:
- Statements (typeless; used for side effects, control flow)
- Values (deeply nestable, pure, typed expression trees)
Unfortunately, ir_call confused all this:
- For void functions, we placed ir_call directly in the instruction
stream, treating it as an untyped statement. Yet, it was a subclass
of ir_rvalue, and no other ir_rvalue could be used in this way.
- For functions with a return value, ir_call could be placed in
arbitrary expression trees. While this fit naturally with the source
language, it meant that expressions might not be pure, making it
difficult to transform and optimize them. To combat this, we always
emitted ir_call directly in the RHS of an ir_assignment, only using
a temporary variable in expression trees. Many passes relied on this
assumption; the acos and atan built-ins violated it.
This patch makes ir_call a statement (ir_instruction) rather than a
value (ir_rvalue). Non-void calls now take an ir_dereference of a
variable, and store the return value there---effectively a call and
assignment rolled into one. They cannot be embedded in expressions.
All expression trees are now pure, without exception.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Most of the time, we just want to read an ir_dereference, so there's no
need to have these in separate functions. However, the next patch will
want to read an ir_dereference_variable directly.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
When translating a call from AST to HIR, we need to decide whether it
can be evaluated to a constant before emitting any code (namely, the
temporary declaration, assignment, and call.)
Soon, ir_call will become a statement taking a dereference of where to
store the return value, rather than an rvalue to be used on the RHS of
an assignment. It will be more convenient to try evaluation before
creating a call. ir_function_signature seems like a reasonable place.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Currently, ir_call can be used as either a statement (for void
functions) or a value (for non-void functions). This is rather awkward,
as it's the only class that can be used in both forms.
A number of places use ir_call::get_error_instruction() to construct a
generic value of error_type. If ir_call is to become a statement, it
can no longer serve this purpose.
Unfortunately, none of our classes are particularly well suited for
this, and creating a new one would be rather aggrandizing. So, this
patch introduces ir_rvalue::error_value(), a static method that creates
an instance of the base class, ir_rvalue. This has the nice property
that you can't accidentally try and access uninitialized fields (as it
doesn't have any). The downside is that the base class is no longer
abstract.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
generate_call() and ast_function_expression::hir() both tried to verify
that 'out' and 'inout' parameters used l-values. Irritatingly, it
turned out that this was not redundant; both checks caught -some- cases.
This patch combines the two into a single "complete" function that does
all the parameter mode checking. It also adds a comment clarifying why
AST-level checking is necessary in the first place.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
We used to have one big function, match_signature_by_name, which found
a matching signature, performed out-parameter conversions, and generated
the ir_call. As the code for matching against built-in functions became
more complicated, I split it internally, creating generate_call().
However, I left the same awkward interface. This patch splits it into
three functions:
1. match_signature_by_name()
This now takes a name, a list of parameters, the symbol table, and
returns an ir_function_signature. Simple and one purpose: matching.
2. no_matching_function_error()
Generate the "no matching function" error and list of prototypes.
This was complex enough that I felt it deserved its own function.
3. generate_call()
Do the out-parameter conversion and generate the ir_call. This
could probably use more splitting.
The caller now has a more natural workflow: find a matching signature,
then either generate an error or a call.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Function calls may have side effects that alter variables used inside
the loop. In the fragment shader, they may even terminate the shader.
This means our analysis about loop-constant or induction variables may
be completely wrong.
In general it's impossible to determine whether they actually do or not
(due to the halting problem), so we'd need to perform conservative
static analysis. For now, it's not worth the complexity: most functions
will be inlined, at which point we can unroll them successfully.
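The conservative rule amounts to something like the following. InsnKind and loop_is_analyzable() are hypothetical names used only for this sketch:

```cpp
#include <cassert>
#include <vector>

// Hypothetical, flattened view of the instructions in a loop body.
enum class InsnKind { Assignment, Call, Other };

// A call may write variables used by the loop (or, in a fragment
// shader, terminate it), so give up on the loop as soon as one is seen
// instead of trying to prove the call harmless.
bool loop_is_analyzable(const std::vector<InsnKind> &body)
{
    for (InsnKind kind : body) {
        if (kind == InsnKind::Call)
            return false;
    }
    return true;
}
```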
Fixes Piglit tests:
- shaders/glsl-fs-unroll-out-param
- shaders/glsl-fs-unroll-side-effect
NOTE: This is a candidate for release branches.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Fixes a Coverity resource leak defect.
NOTE: This is a candidate for the 8.0 branch.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
A later error prints this properly; fix this case to do the same.
v2: remove attribute as per Ian's suggestion
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This also seems like a bad idea. There were too many instances for me
to thoroughly scan the code as I did with the last two patches, but a
quick scan indicated that most callers newly allocate a variable,
dereference it, or NULL-check. In some cases, it wasn't clear that the
value would be non-NULL, but they didn't check for error_type either.
At any rate, not checking for this is a bug, and assertions will trigger
it earlier and more reliably than returning error_type.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The constructor currently returns an ir_dereference_variable of error
type when provided NULL, but that's about to change in the next commit.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Providing a NULL pointer to the ir_dereference_record() constructor
seems like a bad idea. Currently, if provided NULL, it returns a
partially constructed value of error type. However, none of the callers
are prepared to handle that scenario.
Code inspection shows that all callers do one of the following:
- Already NULL-check the argument prior to creating the dereference
- Already dereference the argument (and thus would crash if it were NULL)
- Newly allocate the argument.
Thus, it should be safe to simply assert the value passed is not NULL.
This should also catch issues right away, rather than dying later.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Providing a NULL pointer to the ir_dereference_array() constructor seems
like a bad idea. Currently, if provided NULL, it returns a partially
constructed value of error type. However, none of the callers are
prepared to handle that scenario.
Code inspection shows that all callers do one of the following:
- Already NULL-check the argument prior to creating the dereference
- Already dereference the argument (and thus would crash if it were NULL)
- Newly allocate the argument.
Thus, it should be safe to simply assert the value passed is not NULL.
This should also catch issues right away, rather than dying later.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
ir_validate.cpp: In member function ‘virtual ir_visitor_status ir_validate::visit_leave(ir_swizzle*)’:
ir_validate.cpp:458:66: warning: narrowing conversion of ‘ir->ir_swizzle::mask.ir_swizzle_mask::x’ from ‘unsigned int’ to ‘int’ inside { } is ill-formed in C++11 [-Wnarrowing]
ir_validate.cpp:458:66: warning: narrowing conversion of ‘ir->ir_swizzle::mask.ir_swizzle_mask::y’ from ‘unsigned int’ to ‘int’ inside { } is ill-formed in C++11 [-Wnarrowing]
ir_validate.cpp:458:66: warning: narrowing conversion of ‘ir->ir_swizzle::mask.ir_swizzle_mask::z’ from ‘unsigned int’ to ‘int’ inside { } is ill-formed in C++11 [-Wnarrowing]
ir_validate.cpp:458:66: warning: narrowing conversion of ‘ir->ir_swizzle::mask.ir_swizzle_mask::w’ from ‘unsigned int’ to ‘int’ inside { } is ill-formed in C++11 [-Wnarrowing]
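One way to silence this class of warning is an explicit cast inside the braces. The sketch below uses a hypothetical toy_swizzle_mask and is not necessarily how the patch itself resolves it (changing the array's element type would work as well):

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the swizzle mask, with unsigned channels.
struct toy_swizzle_mask {
    unsigned x, y, z, w;
};

// Writing { mask.x, mask.y, ... } to initialize ints is a narrowing
// conversion in C++11 list-initialization; static_cast makes the
// conversion explicit and well-formed.
std::array<int, 4> swizzle_channels(const toy_swizzle_mask &mask)
{
    assert(mask.x < 4 && mask.y < 4 && mask.z < 4 && mask.w < 4);
    return {{ static_cast<int>(mask.x), static_cast<int>(mask.y),
              static_cast<int>(mask.z), static_cast<int>(mask.w) }};
}
```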
Signed-off-by: Dave Airlie <airlied@redhat.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
valgrind complained about an uninitialised value being used in
glsl_parser_extras.cpp, and this was the one it was giving out about.
Just initialise the value in the fakectx.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Instead of the hard-coded value of 32. Note that MaxUnrollIterations
defaults to 32 so there's no net change. But the gallium state tracker
can override this.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
To avoid redundancies, this patch also removes .deps, .libs, and *.la
from .gitignore files in subdirectories.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
By setting lod to 0 in the builtin function implementation, we avoid
needing to update all the visitors to ignore LOD in this case, when
the hardware drivers actually want to ask for LOD 0 for rectangular
textures.
Fixes piglit spec/GLSL-1.40/textureSize-*Rect.
v2: Change style of looking for substrings.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This is the one builtin function claimed to be dropped due to the
ARB_compatibility split.
Fixes piglit spec/GLSL-1.40/compiler/ftransform.vert
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This makes the process slightly more debuggable, though it would be
nice if the build just failed immediately instead.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Mostly this is a matter of removing variables that have been moved to
the compatibility profile. There's one addition: gl_InstanceID is
present in the core now.
This fixes the new piglit tests for GLSL 1.40 builtin variables.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This avoids extra if statements in the common case of just comparing
two expressions that don't involve assignments or function calls,
along with simplifying the handling of constant expressions. Reduces
i965 instructions generated in unigine tropics and sanctuary,
yofrankie, warsow, gstreamer shaders, and the weston compositor.
shader-db results:
Total instructions: 213052 -> 212752
38/1246 programs affected (3.0%)
14309 -> 14009 instructions in affected programs (2.1% reduction)
Before, we were only counting top-level instructions. But if we have
an assignment of a giant expression tree (such as the ones eventually
generated by glsl-fs-unroll), we were counting the same as an
assignment of a variable deref.
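The difference between the two counting schemes can be sketched with a toy tree. Node and count_nodes() are illustrative names, not the visitor used by the unroller:

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Toy IR node: an instruction or expression with nested operands.
struct Node {
    std::vector<std::unique_ptr<Node>> operands;
};

// Counts the node itself plus everything nested beneath it, so a giant
// expression tree no longer weighs the same as a bare variable deref.
unsigned count_nodes(const Node &node)
{
    unsigned total = 1;
    for (const std::unique_ptr<Node> &operand : node.operands) {
        assert(operand != nullptr);
        total += count_nodes(*operand);
    }
    return total;
}
```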
glsl-fs-unroll-explosion now fails in a reasonable amount of time on
i965 because the unrolling didn't go ridiculously far.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Originally ARB_draw_instanced only specified the ARB-decorated name.
Since no vendor actually implemented that behavior and some apps use
the undecorated name, the extension now specifies that both names are
available.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
The ralloc string appending functions were originally intended for
simple, non-hot-path uses like printing to an info log.
Cuts Unigine Tropics load time by around 20% (6 seconds).
v2: Avoid strlen() on every newline, too.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Acked-by: José Fonseca <jfonseca@vmware.com> [v1]
Both callers of rewrite_tail immediately compute the new total string
length by adding the (known) length of the existing string plus the
length of the newly appended text. Unfortunately, callers generally
won't know the length of the new text, as it's printf-formatted.
Since ralloc already computes this length, it makes sense to add it in
and save the caller the effort. This simplifies both existing callers,
but more importantly, will allow for cheap-appending in the next commit.
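The idea of having the append helper maintain the length can be sketched as follows. append_formatted() is a hypothetical stand-in built on realloc and vsnprintf; it is not the ralloc API and makes no claim about the real function signatures:

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Appends printf-formatted text at offset *len and updates *len, so the
// caller never has to strlen() the (possibly very long) existing string.
bool append_formatted(char **str, std::size_t *len, const char *fmt, ...)
{
    assert(str != nullptr && len != nullptr && fmt != nullptr);

    va_list args;
    va_start(args, fmt);

    // First pass: measure the new text only.
    va_list measure;
    va_copy(measure, args);
    const int added = std::vsnprintf(nullptr, 0, fmt, measure);
    va_end(measure);
    if (added < 0) {
        va_end(args);
        return false;
    }

    const std::size_t extra = static_cast<std::size_t>(added);
    char *grown = static_cast<char *>(std::realloc(*str, *len + extra + 1));
    if (grown == nullptr) {
        va_end(args);
        return false;
    }

    // Second pass: write directly at the known end of the old string.
    std::vsnprintf(grown + *len, extra + 1, fmt, args);
    va_end(args);

    *str = grown;
    *len += extra;
    return true;
}
```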
v2: The link_uniforms code needs both the old and new length.
Apply the obvious fix (which sadly makes it less of a cleanup).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Acked-by: José Fonseca <jfonseca@vmware.com> [v1]
Avoid unrolling loops that are either nested loops or
where the loop body times the unroll count is huge.
The change is far from being perfect but it extends the
loop unrolling decision heuristic by some additional
safeguard. In particular this cuts down compilation of
a shader precomputing atmospheric scattering integral
tables containing two nesting levels in a loop from
something way beyond some minutes (I never waited for
it to finish) to some fractions of a second.
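A heuristic of this shape might look like the sketch below. should_unroll() and both limits are assumptions chosen for illustration; the thresholds in the actual patch may differ:

```cpp
#include <cassert>

// Decides whether unrolling is worthwhile.
//   iterations       - known trip count of the loop
//   body_nodes       - size of the loop body
//   max_iterations   - cap on the trip count (hypothetical parameter)
//   max_total_nodes  - cap on body size times trip count (hypothetical)
bool should_unroll(unsigned iterations, unsigned body_nodes,
                   bool contains_nested_loop,
                   unsigned max_iterations,
                   unsigned long long max_total_nodes)
{
    assert(max_iterations > 0);

    // Nested loops multiply the cost at every level; leave them alone.
    if (contains_nested_loop)
        return false;

    if (iterations > max_iterations)
        return false;

    // Reject loops whose unrolled size would be huge.
    const unsigned long long total =
        static_cast<unsigned long long>(iterations) * body_nodes;
    return total <= max_total_nodes;
}
```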
This fixes piglit tests glsl-fs-unroll-explosion and
glsl-vs-unroll-explosion on r600g.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
The build was broken by the line below, added in commit 4f82fed4.
s_expression.cpp:26: #include <limits>
Mesa's half of the fix is to add 'external/astl/include' to the include
path. The other half of the fix requires implementing
numeric_limits<float>::infinity() in astl, for which I have patches
submitted upstream for review.
Signed-off-by: Chad Versace <chad.versace@linux.intel.com>
The error message I chose matches gcc's error. Fixes piglit
switch-case-duplicated.vert.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Otherwise, the upcoming error messages said the location was 0:0(0).
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
It's not quite spelled out in the spec text, but the grammar indicates
that only constant values are allowed as switch() case labels (and
only constant values make sense, anyway).
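A sketch of what such a check could look like. ToyCaseLabel and check_case_labels() are hypothetical; the duplicate test is included only because it becomes straightforward once every label is known to be constant:

```cpp
#include <cassert>
#include <set>
#include <vector>

enum class CaseCheck { Ok, NonConstantLabel, DuplicateLabel };

// Hypothetical summary of one case label after evaluation.
struct ToyCaseLabel {
    bool is_constant;   // false for e.g. a uniform used as a label
    int value;          // only meaningful when is_constant is true
};

CaseCheck check_case_labels(const std::vector<ToyCaseLabel> &labels)
{
    std::set<int> seen;
    for (const ToyCaseLabel &label : labels) {
        if (!label.is_constant)
            return CaseCheck::NonConstantLabel;
        // insert() reports whether the value was already present.
        if (!seen.insert(label.value).second)
            return CaseCheck::DuplicateLabel;
    }
    return CaseCheck::Ok;
}
```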
Fixes piglit glsl-1.30/compiler/switch-statement/switch-case-uniform-int.vert.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This stuffs them all in a struct for sanity. Fixes piglit
glsl-1.30/execution/switch/fs-uniform-nested.
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
In commit 6ecee54a9a a call to
talloc_reference was replaced with a call to talloc_steal. This was in
preparation for moving to ralloc which doesn't support reference
counting.
The justification for talloc_steal within token_list_append in that
commit is that the tokens are being copied already. But the copies are
shallow, so this does not work.
Fortunately, the lifetime of these tokens is easy to understand. A
token list for "replacements" is created and stored in a hash table
when a function-like macro is defined. This list will live until the
macro is #undefed (if ever).
Meanwhile, a shallow copy of the list is created when the macro is
used and the list expanded. This copy is short-lived, so is unsuitable
as a new parent.
So we can just let the original, longer-lived owner continue to own
the underlying objects and things will work.
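The ownership arrangement can be modelled loosely in C++, with
std::unique_ptr standing in for the ralloc parent and raw pointers for
the shallow copy. All type and function names here are hypothetical;
the real code uses ralloc contexts, not smart pointers.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

struct Token { std::string text; };

// Long-lived list stored with the macro definition; it owns the tokens.
struct ReplacementList {
    std::vector<std::unique_ptr<Token>> tokens;
};

// Short-lived list built while expanding a macro; it only borrows tokens.
struct ExpansionList {
    std::vector<const Token*> tokens;
};

ExpansionList shallow_copy(const ReplacementList& source)
{
    ExpansionList copy;
    for (const auto& token : source.tokens)
        copy.tokens.push_back(token.get());  // borrow, do not take ownership
    return copy;
}

// Expands the macro once. The temporary copy is destroyed on return, but
// the tokens stay alive because the replacement list still owns them.
std::string expand(const ReplacementList& macro)
{
    ExpansionList copy = shallow_copy(macro);
    std::string out;
    for (const Token* token : copy.tokens)
        out += token->text;
    return out;
}
```

Had the copy taken ownership instead, its destruction after the first
expansion would have freed the tokens, and the second expansion would
have read freed memory.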
This fixes bug #45082:
"ralloc.c:78: get_header: Assertion `info->canary == 0x5A1106'
failed." when using a macro in GLSL
https://bugs.freedesktop.org/show_bug.cgi?id=45082
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable release branches.
This test case exposes a bug as described in this bug report:
"ralloc.c:78: get_header: Assertion `info->canary == 0x5A1106'
failed." when using a macro in GLSL
https://bugs.freedesktop.org/show_bug.cgi?id=45082
Clearly, some memory is getting (incorrectly) freed on the first macro
invocation, leading to problems with the second macro invocation.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The trick here is that flex always chooses the rule that matches the most
text. So with an input text of "two:", which we want to be lexed as an
IDENTIFIER token "two" followed by an OTHER token ":", the previous OTHER
rule would produce a longer match, the single token "two:", which we
don't want.
We prevent this by forcing the OTHER pattern to never match any
characters that appear in other constructs (no letters, numbers, #,
_, whitespace, nor any punctuation that appears in CPP operators).
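The effect of the longest-match rule can be reproduced with a small
hand-written matcher. This is only a sketch: the two patterns are
simplified stand-ins for the real lexer rules (the restricted OTHER
here excludes letters, digits, '_', '#' and whitespace, but not
operator punctuation), and all function names are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Length of the IDENTIFIER match at the start of `s` (0 if none).
std::size_t match_identifier(const std::string& s)
{
    if (s.empty())
        return 0;
    const unsigned char first = static_cast<unsigned char>(s[0]);
    if (!(std::isalpha(first) || first == '_'))
        return 0;
    std::size_t n = 1;
    while (n < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[n]);
        if (!(std::isalnum(c) || c == '_'))
            break;
        ++n;
    }
    return n;
}

// Length of the OTHER match. The unrestricted variant accepts any run of
// non-whitespace; the restricted one also stops at identifier characters
// and '#'.
std::size_t match_other(const std::string& s, bool restricted)
{
    std::size_t n = 0;
    while (n < s.size()) {
        const unsigned char c = static_cast<unsigned char>(s[n]);
        if (std::isspace(c))
            break;
        if (restricted && (std::isalnum(c) || c == '_' || c == '#'))
            break;
        ++n;
    }
    return n;
}

// Splits `input` by always taking the longer of the two matches.
std::vector<std::string> lex(const std::string& input, bool restricted)
{
    std::vector<std::string> tokens;
    std::size_t pos = 0;
    while (pos < input.size()) {
        const std::string rest = input.substr(pos);
        const std::size_t id = match_identifier(rest);
        const std::size_t other = match_other(rest, restricted);
        const std::size_t len = (other > id) ? other : id;
        if (len == 0) {  // whitespace or a character neither rule accepts
            ++pos;
            continue;
        }
        tokens.push_back(rest.substr(0, len));
        pos += len;
    }
    return tokens;
}
```

With the unrestricted OTHER, lex("two:", false) yields the single token
"two:"; with the restricted one, lex("two:", true) yields "two" and ":".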
Fixes bug #44764:
GLSL preprocessor doesn't replace defines ending with ":"
https://bugs.freedesktop.org/show_bug.cgi?id=44764
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
NOTE: This is a candidate for stable release branches.
This demonstrates a bug that was recently triggered in piglit.
Here is the original bug report (containing a test case almost identical
to this one):
https://bugs.freedesktop.org/show_bug.cgi?id=44764
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Success was (tests-passed AND valgrind-tests-passed) but this meant that
if the valgrind tests weren't run it would be considered a failure.
The logic is now (tests-passed AND (!valgrind OR valgrind-tests-passed))
which lets us return success if the valgrind tests aren't run.
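The two conditions, transliterated into C++ for comparison. The function
and parameter names are hypothetical; this is not the test script itself.

```cpp
#include <cassert>

// Old rule: reports failure whenever the valgrind tests were not run,
// because valgrind_passed is then false.
bool old_success(bool tests_passed, bool valgrind_passed)
{
    return tests_passed && valgrind_passed;
}

// New rule: the valgrind result only matters if valgrind was requested.
bool new_success(bool tests_passed, bool valgrind_enabled,
                 bool valgrind_passed)
{
    return tests_passed && (!valgrind_enabled || valgrind_passed);
}
```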
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Matt Turner <mattst88@gmail.com>
automake uses variables named *_SOURCES.
Reviewed-by: Eric Anholt <eric@anholt.net>
Tested-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Matt Turner <mattst88@gmail.com>
Exporting a publicly visible class with a generic name like
"variable_entry" via ir_variable_refcount.h is kind of mean.
Many IR transformers would like to define their own "variable_entry"
class. If they accidentally include this header, the compiler/linker
may get confused and try to instantiate the wrong variable_entry class,
leading to bizarre runtime crashes.
The hope is that renaming this one will allow .cpp files to safely
declare and use their own file-scope "variable_entry" classes.
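A sketch of the arrangement this aims for, assuming a hypothetical
replacement name for the exported class. The anonymous namespace around
the local class is an extra precaution added for illustration; it is
not something this change introduces.

```cpp
#include <cassert>

// Stand-in for the class exported by the shared header, under a more
// specific (hypothetical) name so it no longer claims "variable_entry".
class refcount_variable_entry {
public:
    int referenced_count = 0;
};

namespace {
// File-local class. The anonymous namespace gives it internal linkage, so
// a "variable_entry" defined in another .cpp file is a distinct type.
class variable_entry {
public:
    explicit variable_entry(int slot) : slot(slot) {}
    int slot;
};
}  // namespace

// Uses both classes side by side without any ambiguity.
int describe(int slot)
{
    refcount_variable_entry shared;
    shared.referenced_count = 1;
    variable_entry local(slot);
    return local.slot + shared.referenced_count;
}
```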
This avoids crashes caused by converting src/glsl to automake.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fix this GCC warning on non-debug builds.
glsl_types.cpp: In member function 'gl_texture_index
glsl_type::sampler_index() const':
glsl_types.cpp:157: warning: control reaches end of non-void function
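The warning arises because assert() expands to nothing when NDEBUG is
defined, so a function whose last statement is an assertion has no
return on that path. A minimal sketch of the pattern and one way to
silence it follows; the enums, the function name, and the fallback
value are hypothetical, not the ones in glsl_types.cpp.

```cpp
#include <cassert>

enum sampler_dim { DIM_1D, DIM_2D, DIM_3D };
enum texture_index { TEX_1D, TEX_2D, TEX_3D };

texture_index index_for(sampler_dim dim)
{
    switch (dim) {
    case DIM_1D: return TEX_1D;
    case DIM_2D: return TEX_2D;
    case DIM_3D: return TEX_3D;
    }
    // In debug builds this traps an out-of-range value. In non-debug
    // builds it compiles away, so the explicit return below is what keeps
    // control from reaching the end of the function.
    assert(!"unexpected sampler dimension");
    return TEX_2D;
}
```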
NOTE: This is a candidate for the 8.0 branch.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The nvc0 gallium driver is advertising 128 MAX_INTERLEAVED_COMPS
which made it always assert in the linker when TFB was used since
the Outputs array was smaller than that maximum.
v2: added assertions
NOTE: This is a candidate for the 8.0 branch.
Reviewed-by: Paul Berry <stereotype441@gmail.com>