It is still a work in progress at this point, but it produces working and
reasonably well-optimized code.
Originally based on ir_to_mesa and st_mesa_to_tgsi, but does not directly use
Mesa IR instructions in TGSI generation, instead generating TGSI from the
intermediate class glsl_to_tgsi_instruction. It also has new optimization
passes to replace _mesa_optimize_program.
The previous formula for atan(x,y) returned a value of +/- pi whenever
|x|<0.0001, and used a formula based on atan(y/x) otherwise. This
broke in cases where both x and y were small (e.g. atan(1e-5, 1e-5)).
This patch modifies the formula so that it returns a value of +/- pi
whenever |x|<1e-8*|y|, and uses the formula based on atan(y/x)
otherwise.
The previous formula for asin(x) was algebraically equivalent to:
sign(x)*(pi/2 - sqrt(1-|x|)*(A + B|x| + C|x|^2))
where A, B, and C were arbitrary constants determined by a curve fit.
This formula had a worst case absolute error of 0.00448, an unbounded
worst case relative error, and a discontinuity near x=0.
Changed the formula to:
sign(x)*(pi/2 - sqrt(1-|x|)*(pi/2 + (pi/4-1)|x| + A|x|^2 + B|x|^3))
where A and B are arbitrary constants determined by a curve fit. This
has a worst case absolute error of 0.00039, a worst case relative
error of 0.000405, and no discontinuities.
I don't expect a significant performance degradation, since the extra
multiply-accumulate should be fast compared to the sqrt() computation.
Fixes piglit tests {vs,fs}-asin-float and {vs,fs}-atan-*
The function used a variable named 'score', which was an outright lie.
A signature matches or it doesn't; there is no fuzzy scoring.
Change the return type of parameter_lists_match() to an enum, and
let ir_function::matching_sigature() switch on that enum.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
Array constructors obey narrower conversion rules than other constructors
[1] --- they use the implicit conversion rules [2] instead of the scalar
constructor conversions [3]. But process_array_constructor() was
incorrectly applying the broader rules.
[1] GLSL 1.50 spec, Section 5.4.4 Array Constructors, page 52 (58 of pdf)
[2] GLSL 1.50 spec, Section 4.1.10 Implicit Conversions, page 25 (31 of pdf)
[3] GLSL 1.50 spec, Section 5.4.1 Conversion, page 48 (54 of pdf)
To fix this, first check (with glsl_type::can_be_implicitly_converted_to)
if an implicit conversion is legal before performing the conversion.
Fixes:
piglit:spec/glsl-1.20/compiler/structure-and-array-operations/array-ctor-implicit-conversion-bool-float.vert
piglit:spec/glsl-1.20/compiler/structure-and-array-operations/array-ctor-implicit-conversion-bvec*-vec*.vert
Note: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
The function is no longer used and has been replaced by
glsl_type::can_implicitly_convert_to().
Note: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
Context
-------
In ast_function_expression::hir(), parameter_lists_match() checks if the
function call's actual parameter list matches the signature's parameter
list, where the match may require implicit conversion of some arguments.
To check if an implicit conversion exists between individual arguments,
type_compare() is used.
Problems
--------
type_compare() allowed the following illegal implicit conversions:
bool -> float
bvecN -> vecN
int -> uint
ivecN -> uvecN
uint -> int
uvecN -> ivecN
Change
------
type_compare() is buggy, so replace it with glsl_type::can_be_implicitly_converted_to().
This comprises a rewrite of parameter_lists_match().
Fixes piglit:spec/glsl-1.20/compiler/built-in-functions/outerProduct-bvec*.vert
Note: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
This method checks if a source type is identical to or can be implicitly
converted to a target type according to the GLSL 1.20 spec, Section 4.1.10
Implicit Conversions.
The following commits use the method for a bugfix:
glsl: Fix implicit conversions in non-constructor function calls
glsl: Fix implicit conversions in array constructors
Note: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
This is part of fixing a ~1% performance regression in OpenArena when
changing the fixed function fragment shader to using the new backend.
Right now this just avoids the LINTERP of the projector, not the math
using it.
Certain attributes (position, psize, etc.) don't
count as params; they are handled separately by the hw.
However, the VS is required to export at least one param
and r600_shader_from_tgsi() takes care of adding a dummy
export if there is none. Make sure the VS param export
count in the SPI properly accounts for this.
Note: This is a candidate for the 7.11 branch.
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
This was done in the old codegen path, but not the new one. Caught by
piglit fbo tests after the conversion to GLSL ff_fragment_shader.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The FF VS generation happens just after the FF FS generation in
state.c, so the ctx->VP._Current value is for the previous state
update's vertex shader, not the one that will be chosen as a result of
this state update. The vertexShader and vertexProgram variables
should be accurately telling us whether there's going to be a
ctx->VP._Current (except on _MaintainTnlProgram drivers, where it's
always true).
The glsl-vs-statechange-1 test was created to test for this, but it
turns out that the bug is hidden by the fact that we call
_mesa_update_state() twice per draw call -- once from
_mesa_valid_to_render() and once from vbo_draw_arrays(), and the
second one was fixing up the first one.
Reviewed-by: Brian Paul <brianp@vmware.com>
We have to make it through this loop processing the color multiple
times, so we can't go overwriting it on our first color buffer.
Reviewed-by: Brian Paul <brianp@vmware.com>
When we do a glReadPixels into the temporary buffer, we don't want to
use GL_LUMINANCE, GL_LUMINANCE_ALPHA or GL_INTENSITY since they will
compute L=R+G+B which is not what we want.
This bug has existed all along but was only exposed by the elimination
of the driver hook for glCopyTexImage() in
5874890c26.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=39604
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
The previous commit removed the last use of this field.
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
The purpose of the (irb->draw_offset & 4095) != 0 check was to ensure
that we don't have XYy offsets into a tile, since Gen4 hardware doesn't
support that. However, it's insufficient: there are cases where
draw_offset & 4095 is 0 but we still have a Y-offset. This leads to an
assertion failure in brw_update_renderbuffer_surface with tile_y != 0.
Instead, simply call intel_renderbuffer_tile_offsets to compute the
actual X/Y offsets and check if either are non-zero. This makes both
the workaround and the assertion check the same things.
Fixes piglit test fbo-generatemipmap-formats, and should also fix
bugs #34009 and #39487.
NOTE: This is a candidate for stable release branches.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34009
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39487
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad@chad-versace.us>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
We were neglecting to load dvdx and dvdy. v is not optional.
Fixes glslparsertests tex-grad-0[12345].frag on Broadwater/Crestline.
(We still need an execution test using sampler1D.)
NOTE: This is a candidate for the 7.11 branch.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
The constant used in the radians() function didn't have enough
precision, causing a relative error of 1.676e-5, which is far worse
than the precision of 32-bit floats. This patch reduces the relative
error to 1.14e-9, which is the best we can do in 32 bits.
Fixes piglit tests {fs,vs}-radians-{float,vec2,vec3,vec4}.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
What a beast.
r300g doesn't depend on files from r300c anymore, so r300c is now left
to its own fate. BTW 'make test' can be invoked from the gallium/r300
directory to run some compiler unit tests.
The implementation deviated slightly from the GL_EXT_texture_sRGB spec
and from other implementations. A giant comment block was added to
justify the somewhat odd behavior of this function.
In addition, the interface had unnecessary cruft. The 'all' parameter
was false at all callers, so it has been removed.
Reviewed-by: Brian Paul <brianp@vmware.com>
If an application requests a generic compressed format for a texture
and the driver does not pick a specific compressed format, return the
generic base format (e.g., GL_RGBA) for the GL_TEXTURE_INTERNAL_FORMAT
query.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=3165
Reviewed-by: Brian Paul <brianp@vmware.com>
lower_variable_index_to_cond_assign runs until it can't make any more
progress. It then returns the result of the last pass which will
always be false. This caused the lowering loop in
_mesa_ir_link_shader to end before doing one last round of
lower_if_to_cond_assign. This caused several if-statements (resulting
from lower_variable_index_to_cond_assign) to be left in the IR.
In addition to this change, lower_variable_index_to_cond_assign should
take a flag indicating whether or not it should even generate
if-statements. This is easily controlled by
switch_generator::linear_sequence_max_length. This would generate
much better code on architectures without any flow contol.
Fixes i915 piglit regressions glsl-texcoord-array and
glsl-fs-vec4-indexing-temp-src.
Reviewed-by: Eric Anholt <eric@anholt.net>
The index buffer state emit only occurred if there was an IB in place
and we were in either a new batch or a new IB state. But because we
only flagged new IB state if IB state changed from the last IB state
we calculated, we could simply never emit IB state after batchbuffer
wraps if the first draw didn't use the IB and we didn't actually
change the IB.
Fixes piglit glx-multi-context-ib-1.
Fixes user-clip on 965 with 3D clears enabled. I created a separate
flag because I wanted to avoid the overhead of the matrix operations
in this path.
Reviewed-by: Brian Paul <brianp@vmware.com>
It turns out that internally the texture cache gets flushed in a
couple of cases, particularly around 2D operations mixed with 3D. In
almost all cases one of those happens between rendering to an
FBO-attached texture and rendering from that texture. However, as of
the next patch, glean tfbo (and the new fbo-flushing-2 test) would
manage to get stale texture values because one of those flushes didn't
occur. The intention of this code was always to get the render cache
cleared and ready to be used from the sampler cache (and it does on <=
gen4), so this just catches gen5 up.
This patch was also tested to fix fbo-flushing on gen7.
When emitting a MAC instruction in a vertex shader, brw_vs_emit()
calls accumulator_contains() to determine whether the accumulator
already contains the appropriate addend; if it does, then we can avoid
emitting an unnecessary MOV instruction.
However, accumulator_contains() wasn't checking the val.negate or
val.abs flags. As a result, if the desired value was the negation, or
the absolute value, of what was already in the accumulator, we would
generate an incorrect shader.
Fixes piglit test vs-refract-vec4-vec4-float.
Tested on Gen5 and Gen6.
Reviewed-by: Eric Anholt <eric@anholt.net>
On Ivybridge, the shadow comparitor goes in the first slot, rather than
at the end. It's not necessary to send u, v, and r.
Fixes tests texturing/texdepth and glean/fbo.
NOTE: This is a candidate for the 7.11 branch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Commit 53c89c67f3 ("i965: Avoid generating
MOVs for assignments of expressions.") added the line "this->result =
reg_undef" all over the code. Unfortunately, since Eric developed his
patch before I landed Ivybridge support, he missed adding it to
fs_visitor::emit_texture_gen7() after rebasing.
Furthermore, since I developed TXD support before Eric's patch, I
neglected to add it to the gradient handling when I rebased.
Neglecting to set this causes the visitor to use this->result as storage
rather than generating a new temporary. These missing statements
resulted in the same register being used to store several different
values.
Fixes the following piglit tests on Ivybridge:
- glsl-fs-shadow2dproj.shader_test
- glsl-fs-shadow2dproj-bias.shader_test
NOTE: This is a candidate for the 7.11 branch.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
The blend_quad function clobbers the actual render target color/alpha
values while applying the destination blend factor, which results in
restoring the wrong value during the masking stage for write-disabled
channels.
Reviewed-by: Brian Paul <brianp@vmware.com>
Just like the non-constant array index lowering pass, compare all N
indices at once. For accesses to a vec4, this saves 3 comparison
instructions on a vector architecture.
Reviewed-by: Eric Anholt <eric@anholt.net>
Previously the code would just look at deref->array->type to see if it
was a constant. This isn't good enough because deref->array might be
another ir_dereference_array... of a constant. As a result,
deref->array->type wouldn't be a constant, but
deref->variable_referenced() would return NULL. The unchecked NULL
pointer would shortly lead to a segfault.
Instead just look at the return of deref->variable_referenced(). If
it's NULL, assume that either a constant or some other form of
anonymous temporary storage is being dereferenced.
This is a bit hinkey because most drivers treat constant arrays as
uniforms, but the lowering pass treats them as temporaries. This
keeps the behavior of the old code, so this change isn't making things
worse.
Fixes i965 piglit:
vs-temp-array-mat[234]-index-col-rd
vs-temp-array-mat[234]-index-col-row-rd
vs-uniform-array-mat[234]-index-col-rd
vs-uniform-array-mat[234]-index-col-row-rd
Reviewed-by: Eric Anholt <eric@anholt.net>
Leaving the unused registers with other values caused assertion
failures and other problems in places that blindly iterate over all
sources.
brw_vs_emit.c:1381: get_src_reg: Assertion `c->regs[file][index].nr !=
0' failed.
Fixes i965 piglit:
vs-uniform-array-mat[234]-col-row-rd
vs-uniform-array-mat[234]-index-col-row-rd
vs-uniform-array-mat[234]-index-row-rd
vs-uniform-mat[234]-col-row-rd
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
This fixes many cases of accessing arrays of matrices using
non-constant indices at each level.
Fixes i965 piglit:
vs-temp-array-mat[234]-index-col-rd
vs-temp-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-wr
vs-uniform-array-mat[234]-index-col-rd
Fixes swrast piglit:
fs-temp-array-mat[234]-index-col-rd
fs-temp-array-mat[234]-index-col-row-rd
fs-temp-array-mat[234]-index-col-wr
fs-uniform-array-mat[234]-index-col-rd
fs-uniform-array-mat[234]-index-col-row-rd
fs-varying-array-mat[234]-index-col-rd
fs-varying-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-rd
vs-temp-array-mat[234]-index-col-row-rd
vs-temp-array-mat[234]-index-col-wr
vs-uniform-array-mat[234]-index-col-rd
vs-uniform-array-mat[234]-index-col-row-rd
vs-varying-array-mat[234]-index-col-rd
vs-varying-array-mat[234]-index-col-row-rd
vs-varying-array-mat[234]-index-col-wr
Reviewed-by: Eric Anholt <eric@anholt.net>
If the non-constant index was in the LHS of an assignment, any
existing condititon on that assignment would be lost.
Reviewed-by: Eric Anholt <eric@anholt.net>
If the non-constant index was in the LHS of an assignment, any
existing condititon on that assignment would be lost.
Fixes i965 piglit:
fs-temp-array-mat[234]-col-row-wr
fs-temp-array-mat[234]-index-col-row-wr
fs-temp-array-mat[234]-index-col-wr
fs-temp-array-mat[234]-index-row-wr
vs-varying-array-mat[234]-index-col-wr
Reviewed-by: Eric Anholt <eric@anholt.net>
The previous implementation could easily get tricked if the LHS of an
assignment included a non-constant index that was "inside" another
dereference. For example:
mat4 m[2];
m[0][i] = vec4(0.0);
Due to the way it tracked whether the array was being assigned, it
would think that the non-constant index was in an r-value. The new
code fixes that by tracking l-values and r-values differently. The
index is also replaced by cloning the IR and replacing the index
variable instead of the odd way it was done before.
v2: Apply some simplifications suggested by Eric Anholt. Making
assignment_generator::rvalue be ir_dereference instead of ir_rvalue
simplified the code a bit.
Fixes i965 piglit fs-temp-array-mat[234]-index-wr and
vs-varying-array-mat[234]-index-wr.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34691
Reviewed-by: Eric Anholt <eric@anholt.net>
There's no reason for it to be there, and another class that may not
have access to the visitor will need it soon.
Reviewed-by: Eric Anholt <eric@anholt.net>
Not sure how I computed these, but they were wrong (which explains why
bumping the polynomial order before never improved precision).
This allows to pass the EXP test cases of PSPrecision/VSPrecision DCTs.
Add an iteration step, which makes rqsqrt precision go from 12bits to
24, and fixes RSQ/NRM test case of PSPrecision/VSPrevision DCTs.
There are no uses of this function outside shader translation.
These tests invoke do_lower_jumps() in isolation (using the glsl_test
executable) and verify that it transforms the IR in the expected way.
The unit tests may be run from the top level directory using "make
check".
For reference, I've also checked in the Python script
create_test_cases.py, which was used to generate these tests. It is
not necessary to run this script in order to run the tests.
Acked-by: Chad Versace <chad@chad-versace.us>
This patch adds a new build artifact, glsl_test, which can be used for
testing optimization passes in isolation.
I'm hoping that we will be able to add other useful standalone tests
to this executable in the future. Accordingly, it is built in a
modular fashion: the main() function uses its first argument to
determine which test function to invoke, removes that argument from
argv[], and then calls that function to interpret the rest of the
command line arguments and perform the test. Currently the only test
function is "optpass", which tests optimization passes.
This patch moves the following functions from main.cpp (the main cpp
file for the standalone executable that is used to create the built-in
functions) to standalone_scaffolding.cpp, so that they can be re-used
in other standalone executables:
- initialize_context()*
- _mesa_new_shader()
- _mesa_reference_shader()
*initialize_context contained some code that was specific to main.cpp,
so it was split into two functions: initialize_context() (which
remains in main.cpp), and initialize_context_from_defaults() (which is
in standalone_scaffolding.cpp).
Several Mesa headers redundantly define the INLINE macro. Adding this
guard prevents the compiler from complaining about macro redefinition.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad@chad-versace.us>
This is an alternative to the draw module's polygon stipple stage.
The softpipe implementation here is just a test. The advantange of
using the new polygon stipple utility module (with other drivers)
is we can avoid software vertex processing in the draw module and
get much better performance.
Polygon stipple doesn't require special vertex processing like
the other draw module stage.
u_vbuf_upload_buffers modifies the buffer offsets. If they are not
restored, and any of the vertex formats is not supported natively, the
next u_vbuf_mgr_draw_begin call will translate the vertex buffers with
incorrect buffer offsets.
ES 2.0.25 page 127 says:
If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then
querying any other pname will generate INVALID_ENUM.
See also:
b9e9df78a0
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The GLSL 1.20 and later specs say:
"Recursion is not allowed, not even statically. Static recursion is
present if the static function call graph of the program contains
cycles."
Recursion is detected and rejected both a compile-time and at
link-time. The complie-time check happens to detect some cases that
may be removed by various optimization passes. The spec doesn't seem
to allow this, but other vendors (e.g., NVIDIA) appear to only check
at link-time after all optimizations.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33885
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Also clarify the documentation for one of the parameters.
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
The behavior of flushes in the hardware is a maze of twisty passages,
and strangely the VS constants appear to be loaded during a pipeline
flush instead of at the time of the packet emit according to the
simulator. On moving the STATE_BASE_ADDRESS packet to where it really
needed to live (in order for data loads by other packets to be
correct), we sometimes no longer got a flush between those packets
where we apparently needed it. This replicates the flushes implied by
a STATE_BASE_ADDRESS update, fixing the GPU hangs in OGLC and the
"engine" demo.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36821
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39257
Tested-by: Keith Packard <keithp@keithp.com> (bzflag and etracer fixed)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
There's scary stuff going on in PIPE_CONTROL internals, and if the
BSpec says to do this to make PIPE_CONTROL work, I'll go ahead and do
it because we'll probably never be able to debug it after the fact.
v2: Use stall at scoreboard instead of depth stall, as noted by Ken.
For this and occlusion queries, we're trying to avoid setting
I915_GEM_DOMAIN_RENDER for the write domain, because the data written
is definitely not going through the render cache, but we do need to
tell the kernel that the object has been written. However, with using
I915_GEM_DOMAIN_GTT, the kernel on retiring the batchbuffer sees that
the w/a BO has a write domain of GTT, and puts it on the flushing
list. If something tries to wait for that BO to finish rendering
(such as the AUB dumper reading the contents of BOs), we get into
wait_request (since obj->active) but with a 0 seqno (since the object
is on the flushing list, not actually on a ringbuffer), and BUG_ONs.
To avoid the kernel bug (which I'm hoping to delete soon anyway), just
use I915_GEM_DOMAIN_INSTRUCTION like occlusion queries do. This
doesn't result in more flushing, because we invalidate INSTRUCTION on
every batchbuffer now that we're state streaming, anyway.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
This cuts out a large portion of the overhead of glClear() from
resetting the texenv state and recomputing the fixed function
programs. It also means less use of fixed function internally in our
GLES2 drivers, which is rather bogus.
Reviewed-by: Brian Paul <brianp@vmware.com>
When parsing S-Expressions, we need to store nul-terminated strings for
Symbol nodes. Prior to this patch, we called ralloc_strndup each time
we constructed a new s_symbol. It turns out that this is obscenely
expensive.
Instead, copy the whole buffer before parsing and overwrite it to
contain \0 bytes at the appropriate locations. Since atoms are
separated by whitespace, (), or ;, we can safely overwrite the character
after a Symbol. While much of the buffer may be unused, copying the
whole buffer is simple and guaranteed to provide enough space.
Prior to this, running piglit-run.py -t glsl tests/quick.tests with GLSL
1.30 enabled took just over 10 minutes on my machine. Now it takes 5.
NOTE: This is a candidate for stable release branches (because it will
make running comparison tests so much less irritating.)
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Commit 1a339b6c71 made
st_ChooseTextureFormat map GL_RGBA with type GL_UNSIGNED_BYTE
to PIPE_FORMAT_A8B8G8R8_UNORM.
The image format for ARGB pixmaps is PIPE_FORMAT_B8G8R8A8_UNORM
however. This mismatch caused the texture to be recreated in
st_finalize_texture.
NOTE: This is a candidate for the 7.11 branch.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39209
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Brian Paul <brianp@vmware.com>
This fixes a regression introduced by commit
a26121f375 (fd.o bug #39219).
Since the __glXInitialize() call should be unnecessary anyway, this is
probably a nicer fix for the original problem too.
NOTE: This is a candidate for the 7.10 and 7.11 branches.
Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: padfoot@exemail.com.au
Until now, the stencil buffer was allocated as a Y tiled buffer, because
in several locations the PRM states that it is. However, it is actually
W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
4.5.2.1 W-Major Format:
W-Major Tile Format is used for separate stencil.
The GTT is incapable of W fencing, so we allocate the stencil buffer with
I915_TILING_NONE and decode the tile's layout in software.
This fix touches the following portions of code:
- In intel_allocate_renderbuffer_storage(), allocate the stencil
buffer with I915_TILING_NONE.
- In intel_verify_dri2_has_hiz(), verify that the stencil buffer is
not tiled.
- In the stencil buffer's span functions, the tile's layout must be
decoded in software.
This commit mutually depends on the xf86-video-intel commit
dri: Do not tile stencil buffer
Author: Chad Versace <chad@chad-versace.us>
Date: Mon Jul 18 00:38:00 2011 -0700
On Gen6 with separate stencil enabled, fixes the following Piglit tests:
bugs/fdo23670-drawpix_stencil
general/stencil-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels
spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels
spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels
spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels
spec/EXT_packed_depth_stencil/readpixels-24_8
Note: This is a candidate for the 7.11 branch.
Signed-off-by: Chad Versace <chad@chad-versace.us>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
LLVM 3.0svn introduced a new type system. It defines a new way to create
named structs and removes the (now not needed) LLVMInvalidateStructLayout
function. See revision 134829 of LLVM.
Signed-off-by: Tobias Droste <tdroste@gmx.de>
Signed-off-by: Brian Paul <brianp@vmware.com>
Including the full "3DSTATE_VF_STATISTICS" should make it easier to
cross-reference the code and documentation.
Also, move the 965/GM45 suffix to the beginning for consistency with
newer #defines.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
The documentation uses 3DSTATE_DRAWING_RECTANGLE, and we already had it
defined in brw_defines.h; we were simply using an old #define from
intel_reg.h.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
This is useful for shadow map generation. Tested with glsl-bug-22603,
which rendered the depth textures with fallbacks before.
Acked-by: Chad Versace <chad@chad-versace.us>
We were updating our new viewport using the old buffers' _WindowMap.m.
We can do less math and avoid using that deprecated matrix by just
folding the viewport calculation right in to the driver.
Fixes piglit fbo-depthtex.
i915_update_draw_buffers() already handles the fallback bit for
missing stencil region, so here we just need to handle whether the GL
thinks we have stencil data or not (and disable the test if so).
We were disabling it once at the moment we changed draw buffers, but
later enabling of depth test could turn it back on. Fixes
fbo-nodepth-test.
Note that ctx->DrawBuffer has to be checked because during context
create we get called while it's still unset. However, we know we'll
get an intel_draw_buffer() after that, so it's safe to make a silly
choice at this point.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30080
The illusion of shared code here wasn't fooling anybody. It was
tempting to keep i830 and i915 still shared, but I think I actually
want to make them diverge shortly.
Reviewed-by: Chad Versace <chad@chad-versace.us>
This brings us into compliance with page 17 (page 22 of the PDF) of
the GLSL 1.20 spec:
"[Sampler types] can only be declared as function parameters or
uniform variables (see Section 4.3.5 "Uniform"). ... [Samplers]
cannot be used as out or inout function parameters."
The spec isn't explicit about whether this rule applies to
structs/arrays containing shaders, but the intent seems to be to
ensure that it can always be determined at compile time which sampler
is being used in each texture lookup. So to avoid creating a
loophole, the rule needs to apply to structs/arrays containing shaders
as well.
Fixes piglit tests spec/glsl-1.10/compiler/samplers/*.frag, and fixes
bug 38987.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38987
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
The new location, as a member function of glsl_type, is more
consistent with queries like is_sampler(), is_boolean(), is_float(),
etc. Placing the function inside glsl_type also makes it available to
any code that uses glsl_types.
These happen to work because their values are the same as the equivalent
PIPE_TRANSFER_* flags, but it's still misleading.
Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
The GLSL spec says:
"If a built-in function is redeclared in a shader (i.e., a
prototype is visible) before a call to it, then the linker will
only attempt to resolve that call within the set of shaders that
are linked with it."
This patch enforces this behavior. When a function call is processed
a flag is set in the ir_call to indicate whether the previously seen
prototype is the built-in or not. At link time a call will only bind
to an instance of a function that matches the "want built-in" setting
in the ir_call.
This has the odd side effect that first call to abs() in the shader
below will call the built-in and the second will not:
float foo(float x) { return abs(x); }
float abs(float x) { return -x; }
float bar(float x) { return abs(x); }
This seems insane, but it matches what the spec says.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31744
The following resolves the build issues and missing symbols
Add "xvmc-nouveau/target.c" - missing symbol "driver_description"
Add "drivers/nvc0/libnvc0.a" - missing symbol "nvc0_screen_create"
Remove "drivers/softpipe/libsoftpipe.a" - unnessecary dependency
resolves build (when building without swrast)
Add "drivers/trace/libtrace.a" in Makefile
Note: With/without those patches xvmc-nouveau still segfaults
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Use all zpass data for predication instead of the last block only.
Use query buffer as a ring instead of reusing the same area
for each new BeginQuery. All query buffer offsets are in bytes
to simplify offsets math.
This patch add the support for 24bpp in the dri/swrast implementation.
See http://bugs.freedesktop.org/show_bug.cgi?id=23525
Signed-off-by: Marc Pignat <marc at pignat.org>
Signed-off-by: Brian Paul <brianp@vmware.com>
commit 1856230d9fa61710cce3e152b8d88b1269611a73
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date: Tue Jul 12 23:41:27 2011 +0100
make: Use better var names on packaging.
commit d1ae72d0bd14e820ecfe9f8f27b316f9566ceb0c
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date: Tue Jul 12 23:38:21 2011 +0100
make: Apply several of Dan Nicholson's suggestions.
commit f27cf8743ac9cbf4c0ad66aff0cd3f97efde97e4
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date: Sat Jul 9 14:18:20 2011 +0100
make: Put back the tar.bz2 creation rule.
Removed by accident.
commit 34983337f9d7db984e9f0117808274106d262110
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date: Sat Jul 9 11:59:29 2011 +0100
make: Determine tarballs contents via git ls-files.
The wildcards were a mess:
- lots of files for non Linux platforms missing
- several files listed and archived twice
Using git-ls-files ensures things are not loss when making the tarballs.
commit 34a28ccbf459ed5710aafba5e7149e8291cb808c
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date: Sat Jul 9 11:07:14 2011 +0100
glut: Remove GLUT source.
Most distros ship freeglut, and most people don't care one vs the other,
and it hasn't been really maintained.
So it is better to have Mesa GLUT be revisioned and built separately
from Mesa.
commit 5c26a2c3c0c7e95ef853e19d12d75c4f80137e7d
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date: Sat Jul 9 10:31:02 2011 +0100
Ignore the tarballs.
commit 26edecac589819f0d0efe2165ab748dbc4e53394
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date: Sat Jul 9 10:30:24 2011 +0100
make: Create the Mesa-xxx-devel symlink automatically.
Also actually remote the intermediate uncompressed tarballs.
this moves getting the context into the debug in this function,
just spotted it trawling callgrind traces for other things.
Signed-off-by: Dave Airlie <airlied@redhat.com>
The forward references to video enum types in p_context.h causes
a massive number of compiler warnings (ISO C forbids forward references
to ‘enum’ types).
By putting the new video enums in a separate header that can be included
by p_context.h and p_screen.h we can avoid this.
Acked-by Christian König <deathsimple@vodafone.de>
inline the hotpath of the reference remaining the same. This shouldn't
penalise the slow path at all but improve the hot path so we don't have
to jump to the function.
It also moves some assert checks under an #ifndef NDEBUG.
Minor clean-ups added by Brian.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
Because we don't support them.
For instance, R32G32B32 is not R32G32B32X32 as was assumed.
Add support for R8G8B8X8_UNORM instead of R8G8B8_UNORM surfaces.