We've been allowing `centroid` and `sample` in all kinds of weird places
where they're not valid.
Insist that `sample` is combined with `in` or `out`;
and that `centroid` is combined with `in`, `out`, or the deprecated
`varying`.
V2: Validate this in a more sensible place. This does require an extra
case for uniform blocks members and struct members, though, since they
don't go through the normal path.
V3: Improve error message wording; eliminate redundant error generation
for inputs in VS or outputs in FS.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The ARB_gpu_shader5 spec says:
"To determine whether the conversion for a single argument in one match is
better than that for another match, the following rules are applied, in
order:
1. An exact match is better than a match involving any implicit
conversion.
2. A match involving an implicit conversion from float to double is
better than a match involving any other implicit conversion.
3. A match involving an implicit conversion from either int or uint to
float is better than a match involving an implicit conversion from
either int or uint to double.
If none of the rules above apply to a particular pair of conversions,
neither conversion is considered better than the other."
V3: Add spec citation, including oddball difference between gs5 and GLSL
4.0; comment a bit better as per Jordan's suggestions.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This will facilitate GLSL 4.0 / ARB_gpu_shader5's enhanced overload
resolution rules, and also possibly better error reporting for ambiguous
function calls.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
V2: Fix crashes during linking, where the parse state is NULL. In this
case, all required checks have already been done, so we assume the
extension is enabled.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The available implicit conversions depend on the GLSL version we're
compiling.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
We're about to add new implicit conversions, first for ARB_gpu_shader5,
and then later for ARB_gpu_shader_fp64. Pull out the opcode
determination into its own function, and get rid of the bool -> float
case that could never be hit anyway [since it fails the is_numeric()
check].
V2: Retain the vector width mangling. It turns out this is necessary for
the conversions done (and then thrown away) when determining the return
type of arithmetic operators.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This works like glsl-1.20+'s invariant redeclarations, but with fewer
restrictions, since `precise` is allowed on pretty much anything.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
There are several common ways to check whether an object is a particular
subclass: dynamic_cast<>, the as_subclass() pattern, or explicit enum
tags. We originally used the virtual as_subclass methods, but later
added enum tags as they are much nicer for debugging.
Since we have the enum tags, we don't necessarily need to use virtual
functions to implement the as_subclass() methods. We can just check the
tag and return the pointer or NULL.
This saves 18 entries in the vtable, and instead of two pointer
dereferences per as_subclass() call most are only three inline
instructions.
Compile time of sam3/112.frag (the longest compile in a recent shader-db
run) is reduced by 5% from 348 to 329 ms (n=500).
perf stat of this workload shows:
24.14% reduction in iTLB-loads: 285,543 -> 216,606
42.55% reduction in iTLB-load-misses: 18,785 -> 10,792
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Now that the constructors set a type, ir_type_unset is not very useful.
Move it to the end of the enum (specifically out of position 0) so that
enums checks for dereferences and rvalues can save an instruction.
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Makes checking whether an object is an ir_dereference, an ir_rvalue, or
an ir_jump simpler. Since ir_dereference is a subclass or ir_rvalue,
list its subtypes first so that they can both generate nice code.
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
This has the added perk that if you forget to set ir_type in the
constructor of a new subclass (or a new constructor of an existing
subclass) the compiler will tell you... instead of relying on
ir_validate or similar run-time detection.
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
to have _mesa_error_no_memory function available
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79440
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
So that prog_hash_table can use _mesa_error_no_memory function.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Check return value from hash_table_find before using it as a pointer
Signed-off-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
They were made unneccesary by the last commit.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This way, when someone modifies create_test_cases.py and forgets to
commit their changes again, people will notice.
v2: make sure we parse the right directories and check for existance the
right way.
v3 (Ken): Use $PYTHON2 instead of calling python directly.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
In 088494aa (as well as other commits in the series) Paul Berry modified
the tests for lower_jumps to account for the fact that the s-expression
for the loop IR instruction changed from
(loop () () () () (statements...)) to (loop (statements...)), but he
forgot to update create_test_cases.py which he used to create the tests.
Fix that, so that now create_test_cases.py is synced with the generated
tests.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Make sure that we print the same number of digits when printing 0.0 as
any other floating-point number. This will make generating expected
output files for tests easier. To avoid breaking "make check," update
the generated tests for lower_jumps before the next commit which will
bring create_test_cases.py in line with them.
Signed-off-by: Connor Abbott <cwabbott0@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The call to get_variable_being_redeclared() may delete 'var' so we
can't reference var->name afterward. We fix that by examining the
var's name before making that call.
Fixes valgrind warnings and possible crash when running the piglit
tests/spec/glsl-1.30/execution/clipping/vs-clip-distance-in-param.shader_test
test (and probably others).
Cc: "10.1 10.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The M_PI*f macros used a preprocessor paste to append 'f'
to M_PI defines, which works if the values are only numbers
but breaks on OpenBSD where M_PI definitions have casts
and brackets to meet requirements of a future version of POSIX,
http://austingroupbugs.net/view.php?id=801http://austingroupbugs.net/view.php?id=828
Simplify the M_PI*f macros by using casts directly in the defines
as suggested by Kenneth Graunke.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78665
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
That information misleads source code auditing tools to think that
ralloc itself is released under LGPL v3.
Instead, simply state talloc is not licensed under a permissive license.
v2: Use wording suggested by Kenneth.
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Both the ast->IR and linker have functions with this name, but different
behavior.
Rename the linker's version to var_counts_against_varying_limit to be
closer to what it is actually used for.
Suggested by Ian a while back.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
In an earlier incarnation of populate_consumer_input_sets and
get_matching_input, the consumer_inputs_with_locations array was indexed
using the user-specified location. In that version, only user-defined
varyings were included in the array.
In the current incarnation, the Mesa location is used to index the
array, and built-in varyings are included.
This change fixes the unit test to exepect gl_ClipDistance in the array,
and it resizes the arrays to actually be big enough. It's just dumb
luck that the existing piglit tests use small enough locations to not
stomp the stack. :(
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78258
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Cc: Vinson Lee <vlee@freedesktop.org>
This can be called from locations that don't have a context pointer
handy. This patch also adds enough infrastructure so that the unit
tests for the GLSL compiler and the stand-alone compiler will build and
function.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Juha-Pekka Heikkila <juhapekka.heikkila@gmail.com>
This allows them to be moved to .rodata, and allow us to be sure that they
will not be modified.
Signed-off-by: Chia-I Wu <olv@lunarg.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
This was a work-around to allow linking a program with only a fragment
shader in a GLES context. Now that we have GL_EXT_separate_shader_objects
in GLES contexts, we can just use that.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
ARB, OES, then everything else. If there's ever a KHR shading language
extension, it should go between ARB and OES.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Eric Anholt <eric@anholt.net>
I don't know of any applications that actually use it. Now that Mesa
supports GL_ARB_separate_shader_objects in all drivers, this extension
is just cruft.
The entrypoints for the extension remain in the XML. This is done so
that a new libGL will continue to provide dispatch support for old
drivers that try to expose this extension.
Future patches will add OpenGL ES GL_EXT_separate_shader_objects, but
that's a different thing.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
This will be used for GL_ARB_separate_shader_objects. That extension
not only allows separable shaders to rendezvous by location, but it also
allows traditionally linked shaders to rendezvous by location. The spec
says:
36. How does the behavior of input/output interface matching differ
between separable programs and non-separable programs?
RESOLVED: The rules for matching individual variables or block
members between stages are identical for separable and
non-separable programs, with one exception -- matching variables
of different type with the same location, as discussed in issue
34, applies only to separable programs.
However, the ability to enforce matching requirements differs
between program types. In non-separable programs, both sides of
an interface are contained in the same linked program. In this
case, if the linker detects a mismatch, it will generate a link
error.
v2: Make sure consumer_inputs_with_locations is initialized when
consumer is NULL. Noticed by Chia-I.
v3: Rebase on removal of ir_variable::user_location.
v4: Replace a (stale) FINISHME with some good explanation comments from
Eric.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
When linking a separable program that contains only a fragment shader,
the producer will be NULL. Similar cases will exist with geometry
shaders and, eventually, tessellation shaders.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
While writing the link_varyings::single_interface_input test, I
discovered that populate_consumer_input_sets assumes that all shader
interface blocks have been lowered to discrete variables. Since there
is a pass that does this, it is a reasonable assumption. It was,
however, non-obvious. Make the code fail when it encounters such a
thing, and add a test to verify that behavior.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Four initial tests:
* Create an IR list with a single input variable and verify that
variable is the only thing in the hash tables.
* Same as the previous test, but use a built-in variable
(gl_ClipDistance) with an explicit location set.
* Create an IR list with a single input variable from an interface block
and verify that variable is the only thing in the hash tables.
* Create an IR list with a single input variable and a single input
variable from an interface block. Verify that each is the only thing
in the proper hash tables.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
I want to make some changes to this code, but first I want to make some
unit tests for it... so that I can capture the pre- and
post-invariants. Pulling the code out into its own function in a
non-anonymous namespace enables that.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Don't do anything with variables that have explicitly assigned
locations. This is also how built-in varyings are handled.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
In February 2013 Paul unified the values used for shader stage outputs
and shader stage inputs. See commits 8a076c5f0^..eed6baf76. Since that
time, the location_base parameters are always VARYING_SLOT_VAR0.
Instead of passing that around, just hard code it.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Link error conditions added in previous patch are equally applicable
to GL_ARB_fragment_coord_conventions implementation. Extension's spec
says:
"If gl_FragCoord is redeclared in any fragment shader in a program,
it must be redeclared in all the fragment shaders in that program
that have a static use of gl_FragCoord. All redeclarations of
gl_FragCoord in all fragment shaders in a single program must have
the same set of qualifiers."
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
GLSL 1.50 spec says:
"If gl_FragCoord is redeclared in any fragment shader in a program,
it must be redeclared in all the fragment shaders in that
program that have a static use gl_FragCoord. All redeclarations of
gl_FragCoord in all fragment shaders in a single program must
have the same set of qualifiers."
This patch causes the shader link to fail if we have multiple fragment
shaders with conflicting layout qualifiers for gl_FragCoord.
V2: Restructure the code and add conditions to correctly handle the
following case:
fragment shader 1:
layout(origin_upper_left) in vec4 gl_FragCoord;
void main()
{
foo();
gl_FragColor = gl_FragData;
}
fragment shader 2:
layout(pixel_center_integer) in vec4 gl_FragCoord;
void foo()
{
}
V3:
Allow linking in the following case:
fragment shader 1:
void main()
{
foo();
gl_FragColor = gl_FragCoord;
}
fragment shader 2:
in vec4 gl_FragCoord;
void foo()
{
...
}
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Section 4.3.8.1, page 39 of GLSL 1.50 spec says:
"Within any shader, the first redeclarations of gl_FragCoord
must appear before any use of gl_FragCoord."
GLSL compiler should generate an error in following case:
vec4 p = gl_FragCoord;
layout(origin_upper_left) in vec4 gl_FragCoord;
void main()
{
}
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
GLSL 1.50 spec says:
"If gl_FragCoord is redeclared in any fragment shader in a program,
it must be redeclared in all the fragment shaders in that
program that have a static use gl_FragCoord. All redeclarations of
gl_FragCoord in all fragment shaders in a single program must
have the same set of qualifiers."
This patch makes the glsl compiler to generate an error if we have a
fragment shader defined with conflicting layout qualifier declarations
for gl_FragCoord. For example:
layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord;
layout(pixel_center_integer) in vec4 gl_FragCoord;
void main()
{
}
V2: Some code refactoring for better readability.
Add compiler error conditions for redeclarations like:
layout(origin_upper_left) in vec4 gl_FragCoord;
layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord;
and
in vec4 gl_FragCoord;
layout(origin_upper_left, pixel_center_integer) in vec4 gl_FragCoord;
V3: Simplify function is_conflicting_fragcoord_redeclaration()
V4: Check for null pointer before doing strcmp(var->name, "gl_FragCoord").
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Currently overlapping locations of input variables are not allowed for all
the shader types in OpenGL and OpenGL ES.
From OpenGL ES 3.0 spec, page 56:
"Binding more than one attribute name to the same location is referred
to as aliasing, and is not permitted in OpenGL ES Shading Language
3.00 vertex shaders. LinkProgram will fail when this condition exists.
However, aliasing is possible in OpenGL ES Shading Language 1.00 vertex
shaders."
Taking in to account what different versions of OpenGL and OpenGL ES specs
say about aliasing:
- It is allowed only on vertex shader input attributes in OpenGL (2.0 and
above) and OpenGL ES 2.0.
- It is explictly disallowed in OpenGL ES 3.0.
Fixes Khronos CTS failing test:
explicit_attrib_location_vertex_input_aliased.test
See more details about this at below mentioned khronos bug.
V2: Fix the case where location exceeds the maximum allowed attribute
location.
V3: Simplify the condition added in V2.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Bugzilla: Khronos #9609
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
bitfieldInsert takes scalar integers for its last two arguments. Since
bitfieldInsert is lowered on i965 to two instructions that have more
flexible arguments, I didn't notice when I wrote this.
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Previously this was special-cased for VS and FS; it never got updated
when geometry shaders came along. Generalize using is_varying_var() so
this won't be broken again with tessellation.
Note that there are two copies of the logic for `invariant`: It can be
present as part of a new declaration, and also as a redeclaration of an
existing variable or block member.
Fixes the four new piglits:
spec/glsl-1.50/compiler/invariant-qualifier-*.geom
Note for stable: This won't quite pick cleanly due to whitespace and
state->target -> state->stage renames. Should be straightforward
adjustments though.
Cc: "10.0 10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
As of 943b2d52bf, layout(binding) on an atomic would fail the assertion
here.
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Once the relevant branch has been identified do not iterate over the
instructions in the branch, do a linked list insertion instead to avoid the
loop.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Currently we can have name space collisions between blocks that define the same
fields. For example:
in block
{
vec4 Color;
} In[];
out block
{
vec4 Color;
} Out;
These two blocks will assign the same interface name (block.Color) to the Color
field in flatten_named_interface_blocks_declarations.cpp, leading to havoc.
This was breaking badly the gl-320-primitive-shading test from ogl-samples.
The patch uses the block instance name to avoid collisions, producing names
like block.In.Color and block.Out.Color to avoid the name clash.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76394
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Many shaders use a pattern such as:
for (int i = 0; i < NUM_LIGHTS; i++) {
...access a uniform array, or shader input/output array...
}
where NUM_LIGHTS is a small constant (such as 2, 4, or 8).
The expectation is that the compiler will unroll those loops, turning
the array access into constant indexing, which is more efficient, and
which may enable array splitting and other optimizations.
In many cases, our heuristic fails - either there's another tiny nested
loop inside, or the estimated number of instructions is just barely
beyond the threshold. So, we fail to unroll the loop, leaving the
variable indexing in place.
Drivers which don't support the particular flavor of variable indexing
will call lower_variable_index_to_cond_assign(), which generates piles
and piles of immensely inefficient code. We'd like to avoid generating
that.
This patch detects unsupported forms of variable-indexing in loops, where
the array index is a loop induction variable. In that case, it bypasses
the loop-too-large heuristic and forces unrolling.
Improves performance in various microbenchmarks: Gl32PSBump8 by 47%,
Gl32ShMapVsm by 80%, and Gl32ShMapPcf by 27%. No changes in shader-db.
v2: Check ir->array for being an array or matrix, rather than the
ir_dereference_array itself.
v3: Fix and expand statistics in commit message.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The "fail" flag is set if loop_unroll_count encounters a nested loop;
calling the flag "nested_loop" is a bit clearer.
The original reasoning was that count is inaccurate (too small) if there
are nested loops, as we don't do any sort of analysis on the inner loop.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Loop unrolling will need to know a few more options in the future.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
Now that we pass in gl_shader_compiler_options, it makes sense to just
use options->MaxUnrollIterations, rather than passing a separate
parameter.
Half of the invocations already passed options->MaxUnrollIterations,
while the other half passed in a hardcoded value of 32.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
When considering assignment expressions like:
v.x += u.x;
v.x += u.x;
the vectorizer would incorrectly keep going, attempting to find more
instructions to vectorize. It would overwrite the saved assignment
to point at the second one, and increment channels a second time,
resulting in try_vectorize thinking the expression was a vec2 instead of
a float.
Instead, if we see a repeated assignment to a channel, just try to
vectorize everything we've found so far. This clears the saved state
so it will start over.
Fixes Piglit's repeated-channel-assignments.vert.
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Information about the binding was not being properly communicated from
the front-end compiler to the linker. As a result, the linker never
knew that any UBOs had explicit bindings!
Fixes the piglit test arb_shading_language_420pack-binding-layout.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: github@socker.lepus.uberspace.de [v0]
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
Previously, a UBO like
layout(binding=2) uniform U {
...
} my_constants[4];
wouldn't get any bindings set. The code would try to set the binding of
U, but that would fail. It should instead set the bindings for U[0],
U[1], ...
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
For blocks, gl_shader_program::UniformStorage isn't very useful. The
names stored there are the names of the elements of the block, so
finding blocks with an instance name is hard. There is also only one
entry in ::UniformStorage for each element of a block array, and that is
a deal breaker.
Using ::UniformBlocks is what _mesa_GetUniformBlockIndex does. I
contemplated sharing code between set_block_binding and
_mesa_GetUniformBlockIndex, but building the stand-alone compiler and
the unit tests make this hard. I plan to return to this effort shortly.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
In the next patch, we'll see that using
gl_shader_program::UniformStorage is not correct for uniform blocks.
That means we can't use ::UniformStorage to select between the sampler
path and the block path. Instead we want to just use the type of the
variable. That's never passed to set_uniform_binding, and it's easier
to just remove the function (especially for later patches in the series)
than to add another parameter.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
- Remove the spurious block left from the previous commit and re-indent.
- Constify elements.
- Make the spec reference in the code look like other spec references in
the compiler.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
The two code paths are quite different, and there are some problems in
the handling of uniform blocks. Future changes will cause these paths
to diverge further. Ultimately, selecting between the two functions
will happen at the set_uniform_binding call site, and
set_uniform_binding will be deleted.
NOTE: This patch just moves code around.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76323
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Cc: github@socker.lepus.uberspace.de
While we wish our optimization passes could identify all the cases where
we can coalesce our variables, we miss out on a lot of opportunities.
total instructions in shared programs: 1673849 -> 1673166 (-0.04%)
instructions in affected programs: 299521 -> 298838 (-0.23%)
GAINED: 7
LOST: 0
Note that many programs are "hurt". The notable ones are where we produce
unrolling in cases we didn't before (presumably just because of the lower
instruction count). But there are also some cases where pushing things
right into the variables prevents copy propagation and tree grafting,
since we don't split our variable usage webs apart.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The next patch will introduce an optimization that only works when
integers are not represented as floating point values.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The next few patches will introduce an optimization that only works when
integers are not represented as floating point values.
v2: Re-word-wrap a line, as requested by Ian Romanick.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The IR is not supposed to support implicit type conversions; we just
failed to validate it.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
ir_binop_ubo_load takes unsigned integer operands. However, the array
index used to compute these offsets may be a signed integer. (For
example, see Piglit's spec/glsl-1.40/uniform_buffer/fs-bvec-array).
For some reason, we were missing an ir_binop_i2u cast, and ir_validator
was failing to catch that.
Without this change, ir_builder's type inference code broke for me when
writing a new optimization pass.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The i965 MUL instruction doesn't natively support 32-bit by 32-bit
integer multiplication; additional instructions (MACH/MOV) are required.
However, we can avoid those if we know one of the operands can be
represented in 16 bits or less. The vector backend's is_16bit_constant
static helper function checks for this.
We want to be able to use it in the scalar backend as well, which means
moving the function to a more generally-usable location. Since it isn't
i965 specific, I decided to make it an ir_constant method, in case it
ends up being useful to other people as well.
v2: Rename from is_16bit_integer_constant to is_uint16_constant, as
suggested by Ilia Mirkin. Update comments to clarify that it does
apply to both int and uint types, as long as the value is
non-negative and fits in 16-bits.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cuts a small handful of instructions in Serious Sam 3:
instructions in affected programs: 4692 -> 4666 (-0.55%)
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
This is the closing for the "\defgroup IR Intermediate representation
nodes" all the way at the top of the file.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
These could probably be squashed into one of the previous commits.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
All of the functionality is implemented in a private function in the one
file where it is used.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
In all uses of dotlike() we're writing generic code that operates on 1-4
component vectors. That our IR requires ir_binop_dot expressions'
operands to be 2+ component vectors is an implementation detail that's
not important when implementing built-in functions with dot(), which is
defined for scalar floats in GLSL.
Reviewed-by: Eric Anholt <eric@anholt.net>
ARB_gpu_shader5 and ES 3.0 expose different subsets of
ARB_shading_language_packing.
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Fixes the following build error on OpenBSD:
./.libs/libglsl.a(builtin_functions.o)(.text+0x973): In function `mtx_lock':
../../include/c11/threads_posix.h:195: undefined reference to `pthread_mutex_lock'
./.libs/libglsl.a(builtin_functions.o)(.text+0x9a5): In function `mtx_unlock':
../../include/c11/threads_posix.h:248: undefined reference to `pthread_mutex_unlock'
Signed-off-by: Jonathan Gray <jsg@jsg.id.au>
Reviewed-by: Brian Paul <brianp@vmware.com>
There is little gain in printing whenever a folder is created.
v2:
- Use $(AM_V_at) over @ to have control in verbose builds.
Suggested by Erik Faye-Lund.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Patch adds a remap table for uniforms that is used to provide a mapping
from application specified uniform location to actual location in the
UniformStorage. Existing UniformLocationBaseScale usage is removed as
table can be used to set sequential values for array uniform elements.
This mapping helps to implement GL_ARB_explicit_uniform_location so that
uniforms locations can be reorganized and handled in a more easy manner.
v2: small fixes + rename parameters for merge and split functions (Ian)
improve documentation, remove old check for location bounds (Eric)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
After preprocessing by glcpp all adjacent spaces were replaced by
single one and glsl parser received column-shifted shader source.
It negatively affected ast location set up and produced wrong error
messages for heavily-spaced shaders.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
With a non-debug build, gcc has two complaints:
1. 'found' var not used. Silence with '(void) found;'
2. 'id' not initialized. It's assigned by the UniformHash->get()
call, actually. But init it to zero to silence gcc.
Reviewed-by: Matt Turner <mattst88@gmail.com>
opt_algebraic was translating lrp(x, 0, a) into add(x, -mul(x, a)).
Unfortunately, this references "x" twice, which is invalid in the IR,
leading to assertion failures in the validator.
Normally, cloning IR solves this. However, "x" could actually be an
arbitrary expression tree, so copying it could result in huge piles
of wasted computation. This is why we avoid reusing subexpressions.
Instead, transform it into mul(x, add(1.0, -a)), which is equivalent
but doesn't need two references to "x".
Fixes a regression since d5fa8a9562, which isn't in any stable
branches. Fixes 18 shaders in shader-db (bastion and yofrankie).
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Tt's kind of a trap---calling do_common_optimization() after
lower_instructions() may cause opt_algebraic() to reintroduce
ir_triop_lrp expressions that were lowered, effectively defeating the
point. Because of this, nobody uses it.
v2: Delete more code (caught by Ian Romanick).
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Acked-by: Eric Anholt <eric@anholt.net>
This variable is no longer needed after the cleanup to the
code prior to the first arrays of array series
Signed-off-by: Timothy Arceri <t_arceri@yahoo.com.au>
Reviewed-by: Matt Turner <mattst88@gmail.com>
This lowering pass will be useful for gallium drivers as well, in order to support
the GL TG4 oddity that is textureGatherOffsets.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Signed-off-by: Dave Airlie <airlied@redhat.com>
While we want to be able to print to stdout for glsl_compiler, for
debugging drivers we want to be able to dump to stderr because that's
where other driver debug (like LIBGL_DEBUG) tends to go, and because some
apps actually close stdout to shut up their own messages (such as the X
Server, or NWN).
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
GL_ARB_separate_shader_objects adds the ability to specify location
layouts for interstage inputs and outputs.
In addition, this extension makes 'in' and 'out' generally available for
shader inputs and outputs. This mimics the behavior of
GL_ARB_explicit_attrib_location.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This adds the necessary bits for both the API and the GLSL compiler.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
v2:
* Make gl_InvocationID a system value
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Grab the parsed invocation count, check for consistency
during linking, and finally save the result in
gl_shader_program Geom.Invocations.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
_mesa_glsl_parse_state in_qualifier->invocations will store the
invocations count.
v3:
* Use in_qualifier to allow the primitive to be specied
separately from the invocations count (merge_qualifiers)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
We introduce a new merge_in_qualifier ast_type_qualifier
which allows specialized handling of merging input layout
qualifiers.
By merging layout qualifiers into state->in_qualifier, we
allow multiple input qualifiers. For example, the primitive
type can be specified specified separately from the
invocations count (ARB_gpu_shader5).
state->gs_input_prim_type is moved into state->in_qualifier->prim_type
state->gs_input_prim_type_specified is still processed separately
so we can determine when the input primitive is specified. This
is important since certain scenerios are not supported until after
the primitive type has been specified in the shader code.
v4:
* Merge with compute shader input layout qualifiers
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
The const in
const unsigned foo(void);
is meaningless. Removing it silences this warning:
src/glsl/ast_to_hir.cpp:1802:56: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
From page 14 (page 20 of the PDF) of the GLSL 1.10 spec:
"In addition, all identifiers containing two consecutive underscores
(__) are reserved as possible future keywords."
The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos. Names simply containing __ are dangerous to use, but should
be allowed.
Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
Section 3.3 (Preprocessor) of the GLSL 1.30 spec (and later) and the
GLSL ES spec (all versions) say:
"All macro names containing two consecutive underscores ( __ ) are
reserved for future use as predefined macro names. All macro names
prefixed with "GL_" ("GL" followed by a single underscore) are also
reserved."
The intention is that names containing __ are reserved for internal use
by the implementation, and names prefixed with GL_ are reserved for use
by Khronos. Since every extension adds a name prefixed with GL_ (i.e.,
the name of the extension), that should be an error. Names simply
containing __ are dangerous to use, but should be allowed. In similar
cases, the C++ preprocessor specification says, "no diagnostic is
required."
Per the Khronos bug mentioned below, a future version of the GLSL
specification will clarify this.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Cc: "9.2 10.0 10.1" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Tested-by: Darius Spitznagel <d.spitznagel@goodbytez.de>
Cc: Tapani Pälli <lemody@gmail.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=71870
Bugzilla: Khronos #11702
GL_ARB_ES2_compatibility doesn't say anything about shader linking
when one of the shaders (vertex or fragment shader) is absent. So,
the extension shouldn't change the behavior specified in GLSL
specification.
Tested the behavior on proprietary linux drivers of NVIDIA and AMD.
Both of them allow linking a version 100 shader program in OpenGL
context, when one of the shaders is absent.
Makes following Khronos CTS tests to pass:
successfulcompilevert_linkprogram.test
successfulcompilefrag_linkprogram.test
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
To fix MSVC compile breakage. Evidently, _restrict is an MSVC keyword,
though the docs only mention __restrict (with two underscores).
Note: we may want to also rename _volatile to volatile_flag to be
consistent.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74900
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Because of the combinatorial explosion of different image built-ins
with different image dimensionalities and base data types, enumerating
all the 242 possibilities would be annoying and a waste of .text
space. Instead use a special path in the built-in builder that loops
over all the known image types.
v2: Generate built-ins on GLSL version 4.20 too. Rename
'_has_float_data_type' to '_supports_float_data_type'. Avoid
duplicating enumeration of image built-ins in create_intrinsics()
and create_builtins().
v3: Use a more orthodox approach for passing image built-in generator
parameters.
v4: Cosmetic changes.
Acked-by: Paul Berry <stereotype441@gmail.com>
No opaque types may be statically initialized in the shader, all
opaque variables must be declared uniform or be part of an "in"
function parameter declaration, no opaque types may be used as the
return type of a function.
v2: Add explicit check for opaque types in interface blocks. Check
for opaque types in ir_dereference::is_lvalue().
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Aggregating images inside uniform blocks is explicitly disallowed by
the standard, aggregating them inside structures is not (as of GL
4.4), but there is a similar problem as with atomic counters: image
uniform declarations require either a "writeonly" memory qualifier or
an explicit format qualifier, which are explicitly forbidden in
structure member declarations. In the resolution of Khronos bug
#10903 the same wording applied to atomic counters was decided to mean
that they're not allowed inside structures -- Rejecting image member
declarations within structures seems the most reasonable option for
now.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
v2: Add comment next to the read_only and write_only qualifier flags.
Change temporary copies of the type qualifier mask to use uint64_t
too.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Add predicates to query if a GLSL type is or contains an image.
Rename sampler_coordinate_components() to coordinate_components().
v2: Use assert instead of unreachable.
v3: No need to use a separate code-path for images in
coordinate_components() after merging image and sampler fields in
the glsl_type structure.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
v2: Reuse the glsl_sampler_dim enum for images. Reuse the
glsl_type::sampler_* fields instead of creating new ones specific
to image types. Reuse the same constructor as for samplers adding
a new 'base_type' argument.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Array dereferences must have scalar indices, so we cannot vectorize
them.
Cc: "10.1" <mesa-stable@lists.freedesktop.org>
Reported-by: Andrew Guertin <lists@dolphinling.net>
Tested-by: Andrew Guertin <lists@dolphinling.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Consider a multithreaded program with two contexts A and B, and the
following scenario:
1. Context A calls initialize(), which allocates mem_ctx and starts
building built-ins.
2. Context B calls initialize(), which sees mem_ctx != NULL and assumes
everything is already set up. It returns.
3. Context B calls find(), which fails to find the built-in since it
hasn't been created yet.
4. Context A finally finishes initializing the built-ins.
This will break at step 3. Adding a lock ensures that subsequent
callers of initialize() will wait until initialization is actually
complete.
Similarly, if any thread calls release while another thread is still
initializing, or calling find(), the mem_ctx/shader would get free'd while
from under it, leading to corruption or use-after-free crashes.
Fixes sporadic failures in Piglit's glx-multithread-shader-compile.
Bugzilla: https://bugs.freedesktop.org/69200
Signed-off-by: Daniel Kurtz <djkurtz@chromium.org>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.1 10.0" <mesa-stable@lists.freedesktop.org>
v2: Fix pasteo of an extra abs being inserted (caught by many). Rewrite
to drop the silly switch statement.
Reviewed-by: Matt Turner <mattst88@gmail.com> (v1)
Mesa fails to retain the precision qualifier when parsing:
#version 300 es
centroid in mediump vec2 v;
Consider how the parser's type_qualifier production is applied.
First, the precision_qualifier rule creates a new ast_type_qualifier:
<precision: mediump>
Then the storage_qualifier rule creates a second one:
<flags: in>
and calls merge_qualifier() to fold in any previous qualifications,
returning:
<flags: in, precision: mediump>
Finally, the auxiliary_storage_qualifier creates one for "centroid":
<flags: centroid>
it then does $$ = $1 and $$.flags |= $2.flags, resulting in:
<flags: centroid, in>
Since precision isn't stored in the flags bitfield, it is lost. We need
to instead call merge_qualifier to combine all the fields.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reported-by: Kevin Rogovin <kevin.rogovin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
v2: Document that the 3-element array MaxComputeWorkGroupCount is
indexed by dimension.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
v2: Document that the 3-element array MaxComputeWorkGroupSize is
indexed by dimension.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This patch adds MESA_SHADER_COMPUTE to the gl_shader_stage enum.
Also, where it is trivial to do so, it adds a compute shader case to
switch statements that switch based on the type of shader. This
avoids "unhandled switch case" compiler warnings.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Linker loops that iterate through all the stages in the pipeline need
to use MESA_SHADER_FRAGMENT as a bound, so that we can add an
additional MESA_SHADER_COMPUTE stage, without it being erroneously
included in the pipeline.
Reviewed-by: Matt Turner <mattst88@gmail.com>
From the GLSL 4.40 spec, section 6.4 (Jumps):
The continue jump is used only in loops. It skips the remainder of
the body of the inner most loop of which it is inside. For while
and do-while loops, this jump is to the next evaluation of the
loop condition-expression from which the loop continues as
previously defined.
Previously, we incorrectly treated a "continue" statement as jumping
to the top of a do-while loop.
This patch fixes the problem by replicating the loop condition when
converting the "continue" statement to IR. (We already do a similar
thing in "for" loops, to ensure that "continue" causes the loop
expression to be executed).
Fixes piglit tests:
- glsl-fs-continue-inside-do-while.shader_test
- glsl-vs-continue-inside-do-while.shader_test
- glsl-fs-continue-in-switch-in-do-while.shader_test
- glsl-vs-continue-in-switch-in-do-while.shader_test
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
In addition to making it public, we also need to change its first
argument from an ir_loop * to an exec_list *, so that it can be used
to insert the condition anywhere in the IR (rather than just in the
body of the loop).
This will be necessary in order to make continue statements work
properly in do-while loops.
Cc: mesa-stable@lists.freedesktop.org
Acked-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
The -p option we now use when calling bison means that this variable will be
named glcpp_parser_debug not yydebug. This was not caught when the -p option
was added because this variable isn't used in the code as committed. (I prefer
the declaration to remain since it allows a developer to easily find this
variable name to enable debugging.)
This is the innocent-looking but killer test case to verify the bug fixed in
the preceding commit.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
In commit 6005e9cb28 a new start state of NEWLINE_CATCHUP was added to the
lexer. This start state is used whenever the lexer is emitting a NEWLINE token
to emit additional NEWLINE tokens for any newline characters that were skipped
by an immediately preceding multi-line comment.
However, that commit erroneously entered the NEWLINE_CATCHUP state for
single-line comments. This is not desired since in the case of a single-line
comment, the lexer is not emitting any NEWLINE token. The result is that the
lexer will remain in the NEWLINE_CATCHUP state and proceed to fail to emit a
NEWLINE token for the subsequent newline character, (since the case to match \n expects only the INITIAL start state).
The fix is quite simple, remove the "BEGIN NEWLINE_CATCHUP" code from the
single-line comment case, (preserving it only in exactly the cases where the
lexer is actually emitting a NEWLINE token).
Many thanks to Petri Latvala for reporting this bug and for providing the
minimal test case to exercise it. The bug showed up only with a multi-line
comment which was followed immediately by a single-line comment (without any
intervening newline), such as:
/*
*/ // Kablam!
Since 6005e9cb28, and before this commit, that very innocent-looking
combination of comments would yield a parse failure in the compiler.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72686
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
The former symbol is never defined within mesa. Based on the code
it seems that the original intent was to use NDEBUG.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Previously, for example if the x channel was missing from a series of
assignments we were attempting to vectorize, the wrong swizzle mask
would be applied.
a.y = b.y;
a.z = b.z;
a.w = b.w;
would be incorrectly transformed into
a.yzw = b.xyz;
Fixes two transform feedback tests in the ES3 conformance suite.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73978
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73954
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>