These functions don't modify the target instruction, so it makes sense
to make them const. This allows these functions to be called from IR
validation code (which uses const to ensure that it doesn't
accidentally modify the IR being validated).
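As a rough illustration of why the const qualifier matters here, the sketch below uses a hypothetical instruction class and query (not Mesa's actual names). The validator holds a const pointer, so the call only compiles because the query is a const member function:

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for an IR instruction; not Mesa's real class.
class instruction {
public:
   explicit instruction(std::string name) : name_(std::move(name)) {}

   // Read-only query: marked const because it never modifies the instruction.
   bool is_named(const std::string &name) const { return name_ == name; }

private:
   std::string name_;
};

// Validation code takes a const pointer so it cannot accidentally change
// the IR. Calling is_named() here is legal only because it is const.
bool validate(const instruction *ir)
{
   return ir->is_named("assign");
}
```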
Reviewed-by: Chad Versace <chad@chad-versace.us>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When an out parameter undergoes an implicit type conversion, we need
to store it in a temporary, and then after the call completes, convert
the resulting value. In other words, we convert code like the
following:
    void f(out int x);
    float value;
    f(value);
Into IR that's equivalent to this:
    void f(out int x);
    float value;
    int out_parameter_conversion;
    f(out_parameter_conversion);
    value = float(out_parameter_conversion);
This transformation needs to happen during ast-to-IR conversion (as
opposed to, say, a lowering pass), because it is invalid IR for formal
and actual parameters to have types that don't match.
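The same pattern can be sketched in C++, with a reference parameter standing in for the GLSL out qualifier. The function body and the value 7 are assumptions made only so the example has something to return:

```cpp
// Stand-in for a GLSL function declared as "void f(out int x)".
// The value 7 is arbitrary, chosen only for illustration.
void f(int &x)
{
   x = 7;
}

// Mirrors the converted form: call through a temporary of the formal
// parameter's type, then convert the result into the actual parameter.
float call_f_with_float()
{
   float value = 0.0f;
   int out_parameter_conversion;
   f(out_parameter_conversion);
   value = float(out_parameter_conversion);
   return value;
}
```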
Fixes piglit tests
spec/glsl-1.20/compiler/qualifiers/out-conversion-int-to-float.vert and
spec/glsl-1.20/execution/qualifiers/vs-out-conversion-*.shader_test,
and bug 39651.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39651
Reviewed-by: Chad Versace <chad@chad-versace.us>
libGLw is an old OpenGL widget library with optional Motif support.
It almost never changes and very few people actually still care about
it, so we've decided to ship it separately.
The new home for libGLw is: git://git.freedesktop.org/mesa/glw/
Reviewed-by: Brian Paul <brianp@vmware.com>
Previously if-statements were lowered from inner-most to outer-most
(i.e., bottom-up). All assignments within an if-statement would have
the condition of the if-statement appended to their existing conditions.
As a result the assignments from a deeply nested if-statement would
have a very long and complex condition.
Several shaders in the OpenGL ES2 conformance test suite contain
non-constant array indexing that has been lowered by the shader
writer. These tests usually look something like:
    if (i == 0) {
        value = array[0];
    } else if (i == 1) {
        value = array[1];
    } else ...
The IR for the last assignment ends up as:
    (assign (expression bool && (expression bool ! (var_ref if_to_cond_assign_condition) )
             (expression bool && (expression bool ! (var_ref if_to_cond_assign_condition@20) )
              (expression bool && (expression bool ! (var_ref if_to_cond_assign_condition@22) )
               (expression bool && (expression bool ! (var_ref if_to_cond_assign_condition@24) )
                (var_ref if_to_cond_assign_condition@26) ) ) ) )
            (x) (var_ref value) (array_ref (var_ref array) (constant int (5)))
The Mesa IR that is generated from this is just as awesome as you
might expect.
Three changes are made to the way if-statements are lowered.
1. Two condition variables, if_to_cond_assign_then and
if_to_cond_assign_else, are created for each if-then-else structure.
The former contains the "positive" condition, and the latter contains
the "negative" condition. This change was implemented in the previous
patch.
2. Each condition variable is added to a hash-table when it is created.
3. When lowering an if-statement, assignments to existing condition
variables get the current condition anded. This ensures that nested
condition variables are only set to true when the condition variable
for all outer if-statements is also true.
Changes #1 and #3 combine to ensure the correctness of the resulting
code.
4. When a condition assignment is encountered with a condition that is
a dereference of a previously added condition variable, the condition
is not modified.
Change #4 prevents the continuous accumulation of conditions on
assignments.
If the original if-statements were:
    if (x) {
        if (a && b && c && d && e) {
            ...
        } else {
            ...
        }
    } else {
        if (g && h && i && j && k) {
            ...
        } else {
            ...
        }
    }
The lowered code will be
    if_to_cond_assign_then@1 = x;
    if_to_cond_assign_then@2 = a && b && c && d && e
        && if_to_cond_assign_then@1;
    ...
    if_to_cond_assign_else@2 = !if_to_cond_assign_then@2
        && if_to_cond_assign_then@1;
    ...
    if_to_cond_assign_else@1 = !if_to_cond_assign_then@1;
    if_to_cond_assign_then@3 = g && h && i && j && k
        && if_to_cond_assign_else@1;
    ...
    if_to_cond_assign_else@3 = !if_to_cond_assign_then@3
        && if_to_cond_assign_else@1;
    ...
Depending on how instructions are emitted, there may be an extra
instruction due to the duplication of the '&&
if_to_cond_assign_{then,else}@1' on the nested else conditions. In
addition, this may cause some unnecessary register pressure since in
the simple case (where the nested conditions are not complex) the
nested then-condition variables are live longer than strictly
necessary.
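To make the lowered form concrete, here is a small C++ sketch (not the actual lowering pass). It shortens the inner conditions to single booleans and writes a distinct value in each branch; those simplifications and the function names are assumptions for illustration. Each assignment in the lowered version is guarded by exactly one condition variable:

```cpp
// Original nested if-statements; each branch writes a distinct value.
int original(bool x, bool a, bool g)
{
   int value = 0;
   if (x) {
      if (a) { value = 1; } else { value = 2; }
   } else {
      if (g) { value = 3; } else { value = 4; }
   }
   return value;
}

// Lowered form: nested condition variables are and-ed with the outer
// condition once, and every assignment is guarded by a single variable.
int lowered(bool x, bool a, bool g)
{
   int value = 0;
   const bool then_1 = x;
   const bool then_2 = a && then_1;
   value = then_2 ? 1 : value;
   const bool else_2 = !then_2 && then_1;
   value = else_2 ? 2 : value;
   const bool else_1 = !then_1;
   const bool then_3 = g && else_1;
   value = then_3 ? 3 : value;
   const bool else_3 = !then_3 && else_1;
   value = else_3 ? 4 : value;
   return value;
}
```

The two functions agree for all eight input combinations, which is the property that changes #1 and #3 are meant to preserve.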
Before this change, one of the shaders in the OpenGL ES2 conformance
test suite's acos_float_frag_xvary generated 348 Mesa IR instructions.
After this change it only generates 124. Many, but not all, of these
instructions would have also been eliminated by CSE.
Reviewed-by: Eric Anholt <eric@anholt.net>
Now the condition (for the then-clause) and the inverse condition (for
the else-clause) get written to separate temporary variables. In the
presence of complex conditions, this shouldn't result in more code
being generated. If the original if-statement was
    if (a && b && c && d && e) {
        ...
    } else {
        ...
    }
The lowered code will be
    if_to_cond_assign_then = a && b && c && d && e;
    ...
    if_to_cond_assign_else = !if_to_cond_assign_then;
    ...
Reviewed-by: Eric Anholt <eric@anholt.net>
EGL doesn't define how to manage different native platforms, so Mesa
has a build-time configurable default platform that can be overridden
with the non-standard EGL_PLATFORM environment variable. This caused
unnecessary bug reports whenever EGL_PLATFORM was forgotten.
Detection is grouped by the basic type of the NativeDisplay (which
itself needs to be detected). The final decision is based on
characteristics of these basic types:
File descriptor based platforms (fbdev):
 - fstat(2) to check whether the fd belongs to a character device
 - check kernel subsystem (todo)
Pointers to structures (x11, wayland, drm/gbm):
 - mincore(2) to check whether it's a valid pointer to some memory
 - magic elements (e.g. pointers to exported symbols):
   o wayland display stores interface type pointer (first elm.)
   o gbm stores pointer to its constructor (first elm.)
   o x11 as a fallback (FIXME?)
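The two basic checks can be sketched as follows. This is a minimal illustration that assumes Linux (where mincore(2) takes an unsigned char vector); the function names are hypothetical and the magic-element comparisons are not shown:

```cpp
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// True if fd refers to a character device (the fbdev-style check).
bool fd_is_character_device(int fd)
{
   struct stat st;
   if (fstat(fd, &st) != 0)
      return false;
   return S_ISCHR(st.st_mode);
}

// True if the page containing p is mapped in this process. mincore(2)
// fails with ENOMEM when the address range is not mapped.
bool pointer_is_mapped(const void *p)
{
   if (p == nullptr)
      return false;

   const long page_size = sysconf(_SC_PAGESIZE);
   if (page_size <= 0)
      return false;

   // mincore() requires a page-aligned start address.
   std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(p);
   addr &= ~static_cast<std::uintptr_t>(page_size - 1);

   unsigned char residency = 0;
   return mincore(reinterpret_cast<void *>(addr),
                  static_cast<std::size_t>(page_size), &residency) == 0;
}
```

Only once a value passes the pointer check is it safe to dereference it and compare its first element against a known exported symbol.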
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
GLESv1 and GLESv2 have their own dispatch.h and remap_helper.h. These
headers are only used by api_exec_es1.c and api_exec_es2.c in core mesa.
Move the rules to generate them from glapi to core mesa.
Reviewed-by: Brian Paul <brianp@vmware.com>
[olv: updated after reviewing to fix SCons build]
glapi_gen.mk is supposed to be included by glapi users to simplify
header generation. This commit also makes es1api, es2api, and
shared-glapi use it.
Reviewed-by: Brian Paul <brianp@vmware.com>
[olv: updated after reviewing to prefix all variables in glapi_gen.mk by
glapi_gen]
glapi/gen-es/ defines two sets of GLAPI XMLs for OpenGL ES 1.1
(es1_API.xml) and 2.0 (es2_API.xml) respectively. They are used to
generate dispatch.h and remap_helper.h for GLES. Together with
gl_and_es_API.xml, we have to maintain three sets of GLAPI XMLs.
This commit makes dispatch.h and remap_helper.h for GLES be generated
from gl_and_es_API.xml.
Reviewed-by: Brian Paul <brianp@vmware.com>
Add gl_api::filter_functions and gl_function::filter_entry_points to
filter out unwanted functions and entry points.
Reviewed-by: Brian Paul <brianp@vmware.com>
Move the list of entry points that belong to GLES from mapi_abi.py to a new
file.
Until we figure out how to describe the APIs an entry point belongs to
in the XML file, and how to handle the case where an entry point that
others alias is missing in some APIs, this is an easier solution than
maintaining another two sets of XMLs in glapi/gen-es/.
Reviewed-by: Brian Paul <brianp@vmware.com>
Remove the 'f' suffix from a float literal.
- .float 0.0f+1.0
+ .float 1.0
This fixes the following compile error with clang:
error: unexpected token in directive
.float 0.0f+1.0
^
Note: This is a candidate for the stable branches.
Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
Signed-off-by: Chad Versace <chad@chad-versace.us>
Optional parallel rendering of spans using OpenMP.
Initial implementation for aa triangles. A new option for scons is
also provided to activate the openmp support (off by default).
Signed-off-by: Brian Paul <brianp@vmware.com>
After a buffer copy on pre-GEN6 hardware, it is necessary to wait for the blit to
complete before returning data to the user.
This should fix the piglit test: copy_buffer_coherency (pre-GEN6).
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
"reg" was set in only one case, virtual GRFs pre register allocation,
and would be unset and have hw_reg set after allocation. Since we
never bothered with looking at virtual GRF number after allocation
anyway, just use the same storage and avoid confusion.
Besides separating out a logical step of the giant register allocator
function, this now communicates a bunch of the allocator information
through entries in brw_context, which will make this code partially
reusable for caching the expensive allocator setup.
It's fewer pointers to track, and when we start caching the register
set, should be algorithmically better in the cache hit case (lookup in
a byte-per-register array, instead of a linear walk through the
description of register classes to find how to translate that class).
This was a debugging aid at one point -- virtual grf 0 should never be
allocated, and it would be used if undefined register access occurred
in codegen. However, it made the confusing register allocation code
even more confusing by indexing things off of 1 all over.
At least one of the invariants verified by IR validation concerns the
relative ordering of toplevel constructs in the IR: references to
global variables must come after the declarations of those global
variables.
Since linking affects the ordering of toplevel constructs in the IR,
it's possible that a bug in the linker will cause invalid IR to be
generated, even if all the pre-linked shaders are valid. (In fact,
such a bug was fixed by the previous commit.)
Bugs like this are easily masked by further optimization passes,
particularly inlining. So to make them easier to track down, this
patch adds an IR validation step right after linking, and before
final optimization occurs. The validation only occurs on debug
builds.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
When link_functions.cpp adds a new function to the final linked
program, it needs to add it after any global variable declarations
that the function refers to, otherwise the IR will be invalid (because
variable declarations must occur before variable accesses). The
easiest way to do that is to have the linker emit functions to the
tail of the final linked program.
The linker used to emit functions to the head of the final linked
program, in an effort to keep callees sorted before their callers.
However, this was not reliable: it didn't work for functions declared
or defined in the same compilation unit as main, for diamond-shaped
patterns in the call graph, or for some obscure cases involving
overloaded functions. And no code currently relies on this sort
order.
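The ordering problem can be modelled with a simplified toplevel list. The struct and function names below are hypothetical and far simpler than the real IR; the sketch only shows that appending at the tail preserves the declaration-before-use invariant while inserting at the head can break it:

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified toplevel IR node: either a global variable
// declaration or a function that references some globals.
struct toplevel {
   bool is_declaration;
   std::string name;                     // variable or function name
   std::vector<std::string> references;  // globals used (functions only)
};

// The ordering invariant: every referenced global is declared earlier.
bool declarations_precede_uses(const std::vector<toplevel> &program)
{
   std::set<std::string> declared;
   for (const toplevel &node : program) {
      if (node.is_declaration) {
         declared.insert(node.name);
         continue;
      }
      for (const std::string &ref : node.references) {
         if (declared.count(ref) == 0)
            return false;
      }
   }
   return true;
}

// Linking a function in at the tail keeps the invariant, provided the
// globals it uses are already present in the program.
void link_function_at_tail(std::vector<toplevel> &program, const toplevel &fn)
{
   program.push_back(fn);
}

// Linking at the head (the old behaviour) can place the function before
// the declarations it refers to.
void link_function_at_head(std::vector<toplevel> &program, const toplevel &fn)
{
   program.insert(program.begin(), fn);
}
```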
No Piglit regressions with i965 Ironlake.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
process_array_type() contains an assertion to verify that no IR
instructions are generated while processing the expression that
specifies the size of the array. This assertion needs to happen
_after_ checking whether the expression is constant. Otherwise we may
crash on an illegal shader rather than reporting an error.
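A minimal sketch of the corrected ordering, using a hypothetical size_expression record and error list rather than the real process_array_type() internals (the error text is also only illustrative):

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical result of evaluating an array-size expression.
struct size_expression {
   bool is_constant;          // could the expression be constant folded?
   int value;                 // meaningful only when is_constant is true
   int instructions_emitted;  // IR emitted while processing the expression
};

// Returns the array size, or -1 after recording an error.
int process_array_size(const size_expression &expr,
                       std::vector<std::string> &errors)
{
   // Check constness first, so an illegal shader gets a diagnostic...
   if (!expr.is_constant) {
      errors.push_back("array size must be a constant valued expression");
      return -1;
   }

   // ...and only then assert that a constant needed no instructions.
   assert(expr.instructions_emitted == 0);
   return expr.value;
}
```

With the checks in the opposite order, the second call in a test like `process_array_size({false, 0, 3}, errors)` would trip the assertion instead of reporting an error.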
Fixes piglit tests array-size-non-builtin-function.vert and
array-size-with-side-effect.vert.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Rearranged the logic for converting the ast for a function call to
hir, so that we constant fold before emitting any IR. Previously we
would emit some IR, and then only later detect whether we could
constant fold. The unnecessary IR would usually get cleaned up by a
later optimization step, however in the case of a builtin function
being used to compute an array size, it was causing an assertion.
Fixes Piglit test array-size-constant-relational.vert.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38625
The ast-to-hir conversion needs to emit function signatures in two
circumstances: when a function declaration (or definition) is
encountered, and when a built-in function is encountered.
To avoid emitting a function signature in an illegal place (such as
inside a function), emit_function() checked whether we were inside a
function definition, and if so, emitted the signature before the
function definition.
However, this didn't cover the case of emitting function signatures
for built-in functions when those built-in functions are called from
inside the constant integer expression that specifies the length of a
global array. This failed because when processing an array length, we
are emitting IR into a dummy exec_list (see process_array_type() in
ast_to_hir.cpp). process_array_type() later checks (via an assertion)
that no instructions were emitted to the dummy exec_list, based on the
reasonable assumption that we shouldn't need to emit instructions to
calculate the value of a constant.
This patch changes emit_function() so that it emits function
signatures at toplevel in all cases.
This partially fixes bug 38625
(https://bugs.freedesktop.org/show_bug.cgi?id=38625). The remainder
of the fix is in the patch that follows.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
opt_dead_functions contained a shortcut to skip processing the first
function's body, based on the assumption that IR functions are
topologically sorted, with callees always coming before their callers
(therefore the first function cannot contain any calls).
This assumption turns out not to be true in general. For example, the
following code snippet gets translated to IR that violates this
assumption:
    void f();
    void g();
    void f() { g(); }
    void g() { ... }
In practice, the shortcut didn't cause bugs because of a coincidence
of the circumstances in which opt_dead_functions is called:
(a) we do inlining right before dead function elimination, and
inlining (when successful) eliminates all calls.
(b) for user-defined functions, inlining is always successful, because
previous optimization passes (during compilation) have reduced
them to a form that is eligible for inlining.
(c) the function that appears first in the IR can't possibly call a
built-in function, because built-in functions are always emitted
before the function that calls them.
It seems unnecessarily fragile to have opt_dead_functions depend on
these coincidences. And the next patch in this series will break (c).
So I'm reverting the shortcut. The consequence will be a slight
increase in link time for complex shaders.
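The effect of the shortcut can be shown with a much simplified model of dead function elimination. The record type, function name, and skip_first_body flag are assumptions for illustration; the real pass works on IR signatures rather than names:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified function record: a name plus the names it calls.
struct function_record {
   std::string name;
   std::vector<std::string> calls;
};

// Removes functions that are never called (main is always kept).
// With skip_first_body set, the first function's calls are ignored,
// which models the shortcut being reverted.
std::vector<function_record>
remove_dead_functions(const std::vector<function_record> &ir,
                      bool skip_first_body)
{
   std::set<std::string> used = {"main"};
   for (std::size_t i = 0; i < ir.size(); i++) {
      if (skip_first_body && i == 0)
         continue;
      used.insert(ir[i].calls.begin(), ir[i].calls.end());
   }

   std::vector<function_record> kept;
   for (const function_record &fn : ir) {
      if (used.count(fn.name) != 0)
         kept.push_back(fn);
   }
   return kept;
}
```

When f appears first and calls g, the shortcut never sees that call, so g is removed even though it is still needed. Without the shortcut, f, g and main are all kept.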
This reverts commit c75427f4c8.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
This reverts an unnecessary part of commit 4683529048 and fixes misrendering
and an assertion failure in Cogs.
Fixes freedesktop.org bug 39888.
Reviewed-by: Brian Paul <brianp@vmware.com>