Commit Graph

40229 Commits

Author SHA1 Message Date
Bryan Cain f379d8f730 st/mesa: Add a GLSL IR to TGSI translator.
It is still a work in progress at this point, but it produces working and
reasonably well-optimized code.

Originally based on ir_to_mesa and st_mesa_to_tgsi, but does not directly use
Mesa IR instructions in TGSI generation, instead generating TGSI from the
intermediate class glsl_to_tgsi_instruction.  It also has new optimization
passes to replace _mesa_optimize_program.
2011-08-01 17:59:07 -05:00
Paul Berry b1b4ea0b36 glsl: improve the accuracy of the atan(x,y) builtin function.
The previous formula for atan(x,y) returned a value of +/- pi whenever
|x|<0.0001, and used a formula based on atan(y/x) otherwise.  This
broke in cases where both x and y were small (e.g. atan(1e-5, 1e-5)).

This patch modifies the formula so that it returns a value of +/- pi
whenever |x|<1e-8*|y|, and uses the formula based on atan(y/x)
otherwise.
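
The change in branch selection can be sketched as follows. This is an illustration in Python, not the builtin itself: the function names are hypothetical, and only the two threshold tests quoted above are modeled.

```python
def old_takes_constant_branch(x, y):
    # Old test: absolute threshold on |x|, independent of y.
    return abs(x) < 1e-4


def new_takes_constant_branch(x, y):
    # New test: threshold on |x| relative to |y|.
    return abs(x) < 1e-8 * abs(y)
```

With x = y = 1e-5 the old test selects the constant branch even though y/x is well defined, while the new test falls through to the atan(y/x) formula.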
2011-08-01 14:37:38 -07:00
Paul Berry d4c80f5f85 glsl: improve the accuracy of the asin() builtin function.
The previous formula for asin(x) was algebraically equivalent to:

sign(x)*(pi/2 - sqrt(1-|x|)*(A + B|x| + C|x|^2))

where A, B, and C were arbitrary constants determined by a curve fit.

This formula had a worst case absolute error of 0.00448, an unbounded
worst case relative error, and a discontinuity near x=0.

Changed the formula to:

sign(x)*(pi/2 - sqrt(1-|x|)*(pi/2 + (pi/4-1)|x| + A|x|^2 + B|x|^3))

where A and B are arbitrary constants determined by a curve fit.  This
has a worst case absolute error of 0.00039, a worst case relative
error of 0.000405, and no discontinuities.
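
A rough Python sketch of the new formula is given here for readers who want to experiment. The fitted constants A and B are not stated in this message, so they are left as parameters `a` and `b`; the function name is hypothetical.

```python
import math


def asin_new_form(x, a, b):
    """Evaluate the new formula; a and b stand in for the fitted constants."""
    ax = abs(x)
    sign = (x > 0) - (x < 0)
    poly = math.pi / 2 + (math.pi / 4 - 1) * ax + a * ax**2 + b * ax**3
    return sign * (math.pi / 2 - math.sqrt(1 - ax) * poly)
```

The fixed terms pi/2 and (pi/4 - 1) appear to be chosen so that the result is 0 at x=0 with slope 1, and exactly pi/2 at x=1, whatever values the curve fit picks for the remaining two constants.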

I don't expect a significant performance degradation, since the extra
multiply-accumulate should be fast compared to the sqrt() computation.

Fixes piglit tests {vs,fs}-asin-float and {vs,fs}-atan-*
2011-08-01 14:37:38 -07:00
Chad Versace 5541920e0a glsl: Remove duplicate comment
Remove duplicate doxygen comment for
ir_function.cpp:parameter_lists_match().

Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-08-01 09:37:06 -07:00
Jeremy Huddleston 5b3c719983 darwin: Use machine/endian.h to determine endianness
Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2011-07-31 09:43:52 -07:00
Jeremy Huddleston e737a99a6f Fix PPC detection on darwin
Fixes regression introduced by 7004582c18

Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2011-07-31 09:24:35 -07:00
Chad Versace 5081d31a0e glsl: Clarify ir_function::matching_signature()
The function used a variable named 'score', which was an outright lie.
A signature matches or it doesn't; there is no fuzzy scoring.

Change the return type of parameter_lists_match() to an enum, and
let ir_function::matching_signature() switch on that enum.
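
The idea of returning an enum rather than a numeric score might look roughly like this in Python. The enum member names and the `can_convert` callback are hypothetical; the message only says that the return type became an enum, so the three-state split shown here is an assumption.

```python
from enum import Enum


class ParameterListMatch(Enum):
    # Hypothetical names; the commit does not list the enum values.
    NO_MATCH = 0
    EXACT_MATCH = 1
    INEXACT_MATCH = 2  # matches only after implicit conversions


def parameter_lists_match(formal, actual, can_convert):
    """Classify how a call's argument types match a signature's parameters."""
    if len(formal) != len(actual):
        return ParameterListMatch.NO_MATCH
    result = ParameterListMatch.EXACT_MATCH
    for want, have in zip(formal, actual):
        if want == have:
            continue
        if not can_convert(have, want):
            return ParameterListMatch.NO_MATCH
        result = ParameterListMatch.INEXACT_MATCH
    return result
```

A caller can then switch on the three outcomes directly instead of interpreting a number.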

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-07-30 07:27:38 -07:00
Chad Versace a5ab9398e3 glsl: Fix conversions in array constructors
Array constructors obey narrower conversion rules than other constructors
[1] --- they use the implicit conversion rules [2] instead of the scalar
constructor conversions [3].  But process_array_constructor() was
incorrectly applying the broader rules.

[1] GLSL 1.50 spec, Section 5.4.4 Array Constructors, page 52 (58 of pdf)
[2] GLSL 1.50 spec, Section 4.1.10 Implicit Conversions, page 25 (31 of pdf)
[3] GLSL 1.50 spec, Section 5.4.1 Conversion, page 48 (54 of pdf)

To fix this, first check (with glsl_type::can_implicitly_convert_to)
if an implicit conversion is legal before performing the conversion.

Fixes:
piglit:spec/glsl-1.20/compiler/structure-and-array-operations/array-ctor-implicit-conversion-bool-float.vert
piglit:spec/glsl-1.20/compiler/structure-and-array-operations/array-ctor-implicit-conversion-bvec*-vec*.vert

Note: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-07-30 07:27:30 -07:00
Chad Versace 6efe1a8495 glsl: Remove ir_function.cpp:type_compare()
The function is no longer used and has been replaced by
glsl_type::can_implicitly_convert_to().

Note: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-07-30 07:27:25 -07:00
Chad Versace 8b3627fd7b glsl: Fix implicit conversions in non-constructor function calls
Context
-------
In ast_function_expression::hir(), parameter_lists_match() checks if the
function call's actual parameter list matches the signature's parameter
list, where the match may require implicit conversion of some arguments.
To check if an implicit conversion exists between individual arguments,
type_compare() is used.

Problems
--------
type_compare() allowed the following illegal implicit conversions:
    bool -> float
    bvecN -> vecN

    int -> uint
    ivecN -> uvecN

    uint -> int
    uvecN -> ivecN

Change
------
type_compare() is buggy, so replace it with glsl_type::can_implicitly_convert_to().
This comprises a rewrite of parameter_lists_match().

Fixes piglit:spec/glsl-1.20/compiler/built-in-functions/outerProduct-bvec*.vert

Note: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-07-30 07:27:14 -07:00
Chad Versace 200e4972c1 glsl: Add method glsl_type::can_implicitly_convert_to()
This method checks if a source type is identical to or can be implicitly
converted to a target type according to the GLSL 1.20 spec, Section 4.1.10
Implicit Conversions.
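
As a rough illustration of the rule being checked, the GLSL 1.20 implicit conversions (int to float, and ivecN to vecN of the same size) can be modeled in Python with type names as strings. This is a sketch of the rule only, not the method's actual implementation.

```python
import re


def can_implicitly_convert_to(src, dst):
    """Return True if src is identical to dst or converts implicitly to it."""
    if src == dst:
        return True
    if src == "int" and dst == "float":
        return True
    # ivecN -> vecN, only when both vectors have the same size.
    m_src = re.fullmatch(r"ivec([234])", src)
    m_dst = re.fullmatch(r"vec([234])", dst)
    return bool(m_src and m_dst and m_src.group(1) == m_dst.group(1))
```

Under this model, conversions such as bool to float or int to uint are rejected.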

The following commits use the method for a bugfix:
    glsl: Fix implicit conversions in non-constructor function calls
    glsl: Fix implicit conversions in array constructors

Note: This is a candidate for the 7.10 and 7.11 branches.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-07-30 07:26:59 -07:00
Brian Paul 120d71a45c mesa: minor comment changes in teximage.c 2011-07-29 16:49:55 -06:00
Brian Paul dc1f32deae mesa: add missing breaks for GL_TEXTURE_CUBE_MAP_SEAMLESS queries
And fix indentation.

NOTE: This is a candidate for the 7.11 branch.
2011-07-29 16:49:55 -06:00
Eric Anholt f710b8c750 i965/fs: Allow register coalescing where the source is a uniform.
Removes 0.8% of the fragment shader instructions on Unigine Tropics.
2011-07-29 12:17:03 -07:00
Eric Anholt a8b86459a1 i965/fs: Optimize a * 1.0 -> a.
This appears in our instruction stream as a result of the
brw_vs_constval.c handling.
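
The shape of this peephole can be sketched in Python. The tuple-based instruction format below is hypothetical and is not the backend's real representation.

```python
def simplify_mul(instr):
    """instr is a hypothetical (opcode, dst, src0, src1) tuple."""
    op, dst, src0, src1 = instr
    # Multiplying by the constant 1.0 is a plain copy of the other operand.
    if op == "MUL" and src1 == 1.0:
        return ("MOV", dst, src0)
    if op == "MUL" and src0 == 1.0:
        return ("MOV", dst, src1)
    return instr
```

Rewriting to a copy leaves later passes, such as register coalescing, free to remove the instruction entirely.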
2011-07-29 12:17:03 -07:00
Eric Anholt 6d8d6b41b8 i965/fs: If we see a RCP of a constant, try to constant fold it. 2011-07-29 12:17:03 -07:00
Eric Anholt eb30820f26 i965/fs: Port texture projection avoidance optimization from the old backend.
This is part of fixing a ~1% performance regression in OpenArena when
changing the fixed function fragment shader to using the new backend.
Right now this just avoids the LINTERP of the projector, not the math
using it.
2011-07-29 12:17:03 -07:00
Eric Anholt 652ef8569c Revert "i965: Don't compute brw->wm.input_size_masks when it's unused."
This reverts commit 3412069e23.  We're
about to start using it in fragment shaders to handle avoiding
projection for fixed function.
2011-07-29 12:17:03 -07:00
Eric Anholt 44ffb4ae20 i965/fs: Stop using the exec_list iterator.
The old style has gone out of favor in the project, but I kept copying
and pasting from existing iterator code.
2011-07-29 12:17:03 -07:00
Alex Deucher dc1c0ca22a r600g: fix up vs export handling
Certain attributes (position, psize, etc.) don't
count as params; they are handled separately by the hw.
However, the VS is required to export at least one param
and r600_shader_from_tgsi() takes care of adding a dummy
export if there is none.  Make sure the VS param export
count in the SPI properly accounts for this.

Note: This is a candidate for the 7.11 branch.

Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2011-07-29 11:34:02 -04:00
Chia-I Wu 5c9e0ad5fd st/egl: create pbuffers with PIPE_BIND_SAMPLER_VIEW
So that eglBindTexImage works.
2011-07-29 14:16:51 +09:00
Eric Anholt 4fdd289805 i965/fs: Respect ARB_color_buffer_float clamping.
This was done in the old codegen path, but not the new one.  Caught by
piglit fbo tests after the conversion to GLSL ff_fragment_shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-28 20:58:18 -07:00
Eric Anholt ef1854d090 mesa: Fix ff fragment shader inputs calculation when enabling a VS.
The FF VS generation happens just after the FF FS generation in
state.c, so the ctx->VP._Current value is for the previous state
update's vertex shader, not the one that will be chosen as a result of
this state update.  The vertexShader and vertexProgram variables
should be accurately telling us whether there's going to be a
ctx->VP._Current (except on _MaintainTnlProgram drivers, where it's
always true).

The glsl-vs-statechange-1 test was created to test for this, but it
turns out that the bug is hidden by the fact that we call
_mesa_update_state() twice per draw call -- once from
_mesa_valid_to_render() and once from vbo_draw_arrays(), and the
second one was fixing up the first one.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-07-28 20:51:53 -07:00
Chia-I Wu 96ca6a6262 targets/{egl,gbm}: omit unneeded libdrm_radeon 2011-07-29 10:24:45 +09:00
Chia-I Wu d6a9564854 egl: EGL_MATCH_NATIVE_NATIVE_PIXMAP cannot be EGL_DONT_CARE 2011-07-29 10:24:45 +09:00
Chia-I Wu a5ab46909e egl: make pixmaps and pbuffers EGL_BUFFER_PRESERVED
eglSwapBuffers is no-op to these surface types anyway.
2011-07-29 10:24:39 +09:00
Eric Anholt 83f5d5e6aa Add dependency generation for Mesa and GLSL dricore objects.
Reviewed-By: Christopher James Halse Rogers <christopher.halse.rogers@canonical.com>
2011-07-28 17:32:42 -07:00
Eric Anholt f79e3518b4 softpipe: When doing write_all_cbufs, don't stomp over the color.
We have to make it through this loop processing the color multiple
times, so we can't go overwriting it on our first color buffer.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-07-28 17:32:42 -07:00
Brian Paul e4fdc95277 mesa: fix format selection for meta CopyTexSubImage()
When we do a glReadPixels into the temporary buffer, we don't want to
use GL_LUMINANCE, GL_LUMINANCE_ALPHA or GL_INTENSITY since they will
compute L=R+G+B which is not what we want.

This bug has existed all along but was only exposed by the elimination
of the driver hook for glCopyTexImage() in
5874890c26.

Fixes https://bugs.freedesktop.org/show_bug.cgi?id=39604
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2011-07-28 17:29:36 -06:00
Brian Paul 26684e0b1a mesa: test against MESA_FORMAT_NONE in _mesa_GetTexLevelParameteriv() 2011-07-28 17:24:57 -06:00
Brian Paul 58d6aa8287 st/mesa: fix comment language 2011-07-28 17:24:56 -06:00
Vadim Girlin 95ee961f77 r600g: fix vs export count
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=39572

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2011-07-28 18:58:40 -04:00
Kenneth Graunke f73caddd33 i965: Remove the now unused intel_renderbuffer::draw_offset field.
The previous commit removed the last use of this field.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-28 14:04:39 -07:00
Kenneth Graunke 15c0bc5eef i965: Check actual tile offsets in Gen4 miptree workaround.
The purpose of the (irb->draw_offset & 4095) != 0 check was to ensure
that we don't have X/Y offsets into a tile, since Gen4 hardware doesn't
support that.  However, it's insufficient: there are cases where
draw_offset & 4095 is 0 but we still have a Y-offset.  This leads to an
assertion failure in brw_update_renderbuffer_surface with tile_y != 0.

Instead, simply call intel_renderbuffer_tile_offsets to compute the
actual X/Y offsets and check if either are non-zero.  This makes both
the workaround and the assertion check the same things.

Fixes piglit test fbo-generatemipmap-formats, and should also fix
bugs #34009 and #39487.

NOTE: This is a candidate for stable release branches.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34009
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39487
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Chad Versace <chad@chad-versace.us>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-28 14:04:39 -07:00
Kenneth Graunke 3e1fd13f60 i965/gen4: Fix message parameter loading for 1D TXD sampling.
We were neglecting to load dvdx and dvdy.  v is not optional.

Fixes glslparsertests tex-grad-0[12345].frag on Broadwater/Crestline.
(We still need an execution test using sampler1D.)

NOTE: This is a candidate for the 7.11 branch.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-28 14:04:39 -07:00
Paul Berry fe33c886a7 glsl: improve the accuracy of the radians() builtin function
The constant used in the radians() function didn't have enough
precision, causing a relative error of 1.676e-5, which is far worse
than the precision of 32-bit floats.  This patch reduces the relative
error to 1.14e-9, which is the best we can do in 32 bits.
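
The quoted error figures can be reproduced with a short Python check. The two constants below are not given in this message; they are assumptions inferred from the quoted relative errors.

```python
import math

EXACT = math.pi / 180.0  # degrees-to-radians factor at double precision


def relative_error(constant):
    return abs(constant - EXACT) / EXACT


OLD_CONSTANT = 0.017453       # assumed; consistent with the quoted 1.676e-5
NEW_CONSTANT = 0.0174532925   # assumed; consistent with the quoted 1.14e-9

print(relative_error(OLD_CONSTANT), relative_error(NEW_CONSTANT))
```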

Fixes piglit tests {fs,vs}-radians-{float,vec2,vec3,vec4}.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-28 10:41:39 -07:00
Ian Romanick f622c6d7a2 glsl: Add source location tracking to TODO list 2011-07-27 11:41:14 -07:00
Ian Romanick 5e1b7097f3 glsl: Remove completed items from the TODO list 2011-07-27 11:41:14 -07:00
Christoph Bumiller 58c04435b1 mesa: don't forget about sampleBuffers in framebuffer visual update
Otherwise multisample will never be enabled for multisample
renderbuffers.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-07-27 12:13:37 +02:00
Benjamin Franzke 79dcfb266a wayland-drm: Add copyright notice to protocol
Fixes build since wayland 986703ac7365bc87a5501714adb9fc73157c62b7.
2011-07-27 10:07:14 +02:00
Tobias Droste d4d5e3a336 egl/gallium: fix build without softpipe and llvmpipe
Signed-off-by: Tobias Droste <tdroste@gmx.de>
Acked-by: Jakob Bornecrantz <wallbraker@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2011-07-27 09:35:46 +02:00
Marek Olšák 1c2c4ddbd1 r300g: copy the compiler from r300c
What a beast.

r300g doesn't depend on files from r300c anymore, so r300c is now left
to its own fate. BTW 'make test' can be invoked from the gallium/r300
directory to run some compiler unit tests.
2011-07-26 22:35:49 +02:00
Bryan Cain 860c51d827 util: enable S3TC support when the force_s3tc_enable env var is set to "true"
NOTE: This is a candidate for the 7.10 and 7.11 branches.
2011-07-26 12:54:42 -05:00
Bryan Cain 95739f19cc st/mesa: respect force_s3tc_enable environment variable
NOTE: This is a candidate for the 7.10 and 7.11 branches.
2011-07-26 12:54:40 -05:00
Ian Romanick b189d1635d mesa: Make _mesa_get_compressed_formats match the texture compression specs
The implementation deviated slightly from the GL_EXT_texture_sRGB spec
and from other implementations.  A giant comment block was added to
justify the somewhat odd behavior of this function.

In addition, the interface had unnecessary cruft.  The 'all' parameter
was false at all callers, so it has been removed.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-07-25 19:57:24 -07:00
Ian Romanick 143b65f761 mesa: Return the correct internal fmt when a generic compressed fmt was used
If an application requests a generic compressed format for a texture
and the driver does not pick a specific compressed format, return the
generic base format (e.g., GL_RGBA) for the GL_TEXTURE_INTERNAL_FORMAT
query.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=3165
Reviewed-by: Brian Paul <brianp@vmware.com>
2011-07-25 19:57:17 -07:00
Ian Romanick 09916e877f mesa: Add utility function to get base format from a GL compressed format
Reviewed-by: Brian Paul <brianp@vmware.com>
2011-07-25 19:57:14 -07:00
Eric Anholt 3daa2d97eb i965/fs: Fix MRT drawing since the m0->m2 move for shader debug.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-07-25 19:42:18 -07:00
Ian Romanick c1e591eed4 glsl: Correctly return progress from lower_variable_index_to_cond_assign
lower_variable_index_to_cond_assign runs until it can't make any more
progress.  It then returns the result of the last pass which will
always be false.  This caused the lowering loop in
_mesa_ir_link_shader to end before doing one last round of
lower_if_to_cond_assign.  This caused several if-statements (resulting
from lower_variable_index_to_cond_assign) to be left in the IR.
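
The bug pattern described above, returning the result of the last iteration of a run-until-stable loop, can be sketched in Python. The helper names are hypothetical, and `step` stands for one run of the pass, returning True when it changed something.

```python
def run_until_stable_buggy(step):
    progress = True
    while progress:
        progress = step()
    # The loop only exits when step() returned False, so this is always False.
    return progress


def run_until_stable_fixed(step):
    any_progress = False
    while step():
        any_progress = True
    # Reports whether any iteration made a change.
    return any_progress


def make_pass(results):
    """Build a fake pass that reports the given sequence of results."""
    it = iter(results)
    return lambda: next(it)
```

A caller that uses the return value to decide whether to run another round of lowering never sees progress from the buggy version.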

In addition to this change, lower_variable_index_to_cond_assign should
take a flag indicating whether or not it should even generate
if-statements.  This is easily controlled by
switch_generator::linear_sequence_max_length.  This would generate
much better code on architectures without any flow control.

Fixes i915 piglit regressions glsl-texcoord-array and
glsl-fs-vec4-indexing-temp-src.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-25 18:45:46 -07:00
Tobias Droste 84f8548dfc r300/compiler: simplify code in peephole_add_presub_add
Signed-off-by: Tobias Droste <tdroste@gmx.de>
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2011-07-25 23:47:07 +02:00
Marek Olšák f170555a18 winsys/radeon: fix typos in the driver interface 2011-07-25 23:10:40 +02:00
Marek Olšák 533e228923 winsys/radeon: manage constant buffers by the cache bufmgr too 2011-07-25 23:10:40 +02:00
Marek Olšák 7db148d3a5 winsys/radeon: remove usage parameter from buffer_create 2011-07-25 23:10:40 +02:00
Marek Olšák e22a1005c0 winsys/radeon: fix int->boolean conversion in radeon_bo_is_referenced_by_any_cs 2011-07-25 23:10:40 +02:00
Marek Olšák 67c995e0f1 winsys/radeon: little change in radeon_bo_is_referenced_by_cs 2011-07-25 23:10:40 +02:00
Marek Olšák ce9daf6f0b winsys/radeon: add R300 infix to winsys feature names 2011-07-25 23:10:39 +02:00
Marek Olšák 28a336dc38 winsys/radeon: simplify how value queries work
This drops the get_value query and adds a function query_info, which returns
all the values in one nice structure.
2011-07-25 23:10:39 +02:00
Eric Anholt 818db3848b i965: Fix many of the trivial WebGL demos that broke due to IB optimization.
The index buffer state emit only occurred if there was an IB in place
and we were in either a new batch or a new IB state.  But because we
only flagged new IB state if IB state changed from the last IB state
we calculated, we could simply never emit IB state after batchbuffer
wraps if the first draw didn't use the IB and we didn't actually
change the IB.

Fixes piglit glx-multi-context-ib-1.
2011-07-25 13:47:18 -07:00
Eric Anholt a0e5affb22 i965: Use 3D clears on gen6+ to avoid inter-ring synchronization.
Improves firefox-talos-gfx around 5%.
2011-07-25 13:47:18 -07:00
Eric Anholt 8080246892 meta: Also save/restore clip planes for GLSL.
Fixes user-clip on 965 with 3D clears enabled.  I created a separate
flag because I wanted to avoid the overhead of the matrix operations
in this path.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-07-25 13:47:18 -07:00
Eric Anholt 185868c9c2 i965: Emit texture cache flushes on gen6 along with render cache flushes.
It turns out that internally the texture cache gets flushed in a
couple of cases, particularly around 2D operations mixed with 3D.  In
almost all cases one of those happens between rendering to an
FBO-attached texture and rendering from that texture.  However, as of
the next patch, glean tfbo (and the new fbo-flushing-2 test) would
manage to get stale texture values because one of those flushes didn't
occur.  The intention of this code was always to get the render cache
cleared and ready to be used from the sampler cache (and it does on <=
gen4), so this just catches gen5 up.

This patch was also tested to fix fbo-flushing on gen7.
2011-07-25 13:47:01 -07:00
Paul Berry d92463d5dc i965: vs optimization fix: Check val.{negate,abs} in accumulator_contains()
When emitting a MAC instruction in a vertex shader, brw_vs_emit()
calls accumulator_contains() to determine whether the accumulator
already contains the appropriate addend; if it does, then we can avoid
emitting an unnecessary MOV instruction.

However, accumulator_contains() wasn't checking the val.negate or
val.abs flags.  As a result, if the desired value was the negation, or
the absolute value, of what was already in the accumulator, we would
generate an incorrect shader.
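
A minimal Python sketch of the missing check follows. The `Reg` type and both comparison helpers are hypothetical stand-ins for the driver's register description, used only to show why the source modifiers must take part in the comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    file: str
    nr: int
    negate: bool = False
    abs: bool = False


def same_value_buggy(a, b):
    # Ignores the negate/abs modifiers, so r3 and -r3 look identical.
    return (a.file, a.nr) == (b.file, b.nr)


def same_value_fixed(a, b):
    # Dataclass equality compares every field, modifiers included.
    return a == b
```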

Fixes piglit test vs-refract-vec4-vec4-float.

Tested on Gen5 and Gen6.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-25 11:40:53 -07:00
Kenneth Graunke 572f631895 i965/gen7: Fix shadow sampling in the old brw_wm_emit backend.
On Ivybridge, the shadow comparator goes in the first slot, rather than
at the end.  It's not necessary to send u, v, and r.

Fixes tests texturing/texdepth and glean/fbo.

NOTE: This is a candidate for the 7.11 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-25 10:45:56 -07:00
Kenneth Graunke 156cef0fba i965/fs: Clear result before visiting shadow comparitor and LOD info.
Commit 53c89c67f3 ("i965: Avoid generating
MOVs for assignments of expressions.") added the line "this->result =
reg_undef" all over the code.  Unfortunately, since Eric developed his
patch before I landed Ivybridge support, he missed adding it to
fs_visitor::emit_texture_gen7() after rebasing.

Furthermore, since I developed TXD support before Eric's patch, I
neglected to add it to the gradient handling when I rebased.

Neglecting to set this causes the visitor to use this->result as storage
rather than generating a new temporary.  These missing statements
resulted in the same register being used to store several different
values.

Fixes the following piglit tests on Ivybridge:
- glsl-fs-shadow2dproj.shader_test
- glsl-fs-shadow2dproj-bias.shader_test

NOTE: This is a candidate for the 7.11 branch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-25 10:45:56 -07:00
Emeric 7746b7d4bf vdpau: enable mpeg1 hw decoding, using the exact same code path as mpeg2
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=39307

Signed-off-by: Christian König <deathsimple@vodafone.de>
2011-07-25 19:22:35 +02:00
Christian König 4f90b89961 gallium: change formats merged with pipe-video to type "other"
Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=39276
2011-07-25 18:20:22 +02:00
Christian König 4c84acc86f g3dvl: remove unused vs output from create_ref_vert_shader
The position of the quad vertex is calculated in calc_position,
so we don't need the output here any more.
2011-07-25 01:32:39 +02:00
Christian König 4d23c6df81 r600g: use file_max instead of file_count to determine reg offset
Otherwise shaders with skipped inputs/outputs don't work correctly.

Signed-off-by: Christian König <deathsimple@vodafone.de>
2011-07-24 19:17:27 +02:00
Younes Manton ac6455e9a2 gallium/softpipe: Don't clobber dest color/alpha before masking.
The blend_quad function clobbers the actual render target color/alpha
values while applying the destination blend factor, which results in
restoring the wrong value during the masking stage for write-disabled
channels.
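
The failure mode can be sketched in Python with a deliberately simplified additive blend (source factor fixed at 1). Both functions are hypothetical models, not the softpipe code.

```python
def blend_buggy(src, dest, dst_factor, write_mask):
    # Scales dest in place, so the original value is gone by masking time.
    for i in range(4):
        dest[i] *= dst_factor
    blended = [s + d for s, d in zip(src, dest)]
    return [b if w else d for b, d, w in zip(blended, dest, write_mask)]


def blend_fixed(src, dest, dst_factor, write_mask):
    # Keeps dest untouched and applies the factor to a temporary copy.
    scaled = [d * dst_factor for d in dest]
    blended = [s + d for s, d in zip(src, scaled)]
    return [b if w else d for b, d, w in zip(blended, dest, write_mask)]
```

In the buggy version a write-disabled channel comes back scaled by the destination factor instead of keeping its original value.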

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-07-23 17:13:44 -04:00
Ian Romanick 6c8f1f483a glsl: Compare vector indices in blocks
Just like the non-constant array index lowering pass, compare all N
indices at once.  For accesses to a vec4, this saves 3 comparison
instructions on a vector architecture.
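
A small Python model of the block comparison follows. The helper name is hypothetical, and the list comprehension stands in for a single vector compare instruction.

```python
def compare_indices_block(index, base, count=4):
    # Models one vector comparison:
    # index == (base, base + 1, ..., base + count - 1).
    return [index == base + k for k in range(count)]
```

For a vec4 this yields all four condition flags from one comparison rather than four scalar ones.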

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick 90cc372400 glsl: Factor out code that generates block of index comparisons
Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick 156f85336f glsl: Treat ir_dereference_array of non-var as a constant for lowering
Previously the code would just look at deref->array->type to see if it
was a constant.  This isn't good enough because deref->array might be
another ir_dereference_array... of a constant.  As a result,
deref->array->type wouldn't be a constant, but
deref->variable_referenced() would return NULL.  The unchecked NULL
pointer would shortly lead to a segfault.

Instead just look at the return of deref->variable_referenced().  If
it's NULL, assume that either a constant or some other form of
anonymous temporary storage is being dereferenced.

This is a bit hinkey because most drivers treat constant arrays as
uniforms, but the lowering pass treats them as temporaries.  This
keeps the behavior of the old code, so this change isn't making things
worse.

Fixes i965 piglit:

    vs-temp-array-mat[234]-index-col-rd
    vs-temp-array-mat[234]-index-col-row-rd
    vs-uniform-array-mat[234]-index-col-rd
    vs-uniform-array-mat[234]-index-col-row-rd

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick 1d3f09f159 i965: When emitting a src/dst read of an output, keep the swizzle and neg
Fixes i965 piglit vs-varying-array-mat[234]-row-rd.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick 337e2dfad0 i965: When emitting a src/dst write of an output, keep the write mask
Fixes i965 piglit:

    vs-varying-array-mat[234]-col-row-wr
    vs-varying-array-mat[234]-index-col-row-wr
    vs-varying-array-mat[234]-index-row-wr
    vs-varying-array-mat[234]-row-wr
    vs-varying-mat[234]-col-row-wr
    vs-varying-mat[234]-row-wr

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-23 01:24:18 -07:00
Ian Romanick fbeb68e880 prog_optimize: Set unused regs to PROGRAM_UNDEFINED after CMP->MOV conversion
Leaving the unused registers with other values caused assertion
failures and other problems in places that blindly iterate over all
sources.

brw_vs_emit.c:1381: get_src_reg: Assertion `c->regs[file][index].nr !=
0' failed.

Fixes i965 piglit:

    vs-uniform-array-mat[234]-col-row-rd
    vs-uniform-array-mat[234]-index-col-row-rd
    vs-uniform-array-mat[234]-index-row-rd
    vs-uniform-mat[234]-col-row-rd

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick f7cd9a858c ir_to_mesa: Copy reladdr in src_reg(dst_reg) constructor
Fixes i965 piglit:

    vs-temp-array-mat[234]-col-row-wr
    vs-temp-array-mat[234]-index-col-row-wr
    vs-temp-array-mat[234]-index-row-wr
    vs-temp-mat[234]-col-row-wr

Fixes swrast piglit:

    fs-temp-array-mat[234]-col-row-wr
    fs-temp-array-mat[234]-index-col-row-wr
    fs-temp-array-mat[234]-index-row-wr
    fs-temp-mat[234]-col-row-wr
    vs-temp-array-mat[234]-col-row-wr
    vs-temp-array-mat[234]-index-col-row-wr
    vs-temp-array-mat[234]-index-row-wr
    vs-temp-mat[234]-col-row-wr

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick d6e1a8f714 ir_to_mesa: Add each relative address to the previous
This fixes many cases of accessing arrays of matrices using
non-constant indices at each level.

Fixes i965 piglit:

    vs-temp-array-mat[234]-index-col-rd
    vs-temp-array-mat[234]-index-col-row-rd
    vs-temp-array-mat[234]-index-col-wr
    vs-uniform-array-mat[234]-index-col-rd

Fixes swrast piglit:

    fs-temp-array-mat[234]-index-col-rd
    fs-temp-array-mat[234]-index-col-row-rd
    fs-temp-array-mat[234]-index-col-wr
    fs-uniform-array-mat[234]-index-col-rd
    fs-uniform-array-mat[234]-index-col-row-rd
    fs-varying-array-mat[234]-index-col-rd
    fs-varying-array-mat[234]-index-col-row-rd
    vs-temp-array-mat[234]-index-col-rd
    vs-temp-array-mat[234]-index-col-row-rd
    vs-temp-array-mat[234]-index-col-wr
    vs-uniform-array-mat[234]-index-col-rd
    vs-uniform-array-mat[234]-index-col-row-rd
    vs-varying-array-mat[234]-index-col-rd
    vs-varying-array-mat[234]-index-col-row-rd
    vs-varying-array-mat[234]-index-col-wr

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick 601428d2bb glsl: When lowering non-constant vector indexing, respect existing conditions
If the non-constant index was in the LHS of an assignment, any
existing condition on that assignment would be lost.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick 5f83dfe5b7 glsl: When lowering non-constant array indexing, respect existing conditions
If the non-constant index was in the LHS of an assignment, any
existing condition on that assignment would be lost.

Fixes i965 piglit:

    fs-temp-array-mat[234]-col-row-wr
    fs-temp-array-mat[234]-index-col-row-wr
    fs-temp-array-mat[234]-index-col-wr
    fs-temp-array-mat[234]-index-row-wr
    vs-varying-array-mat[234]-index-col-wr

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick 1731ac3086 glsl: Rework lowering of non-constant array indexing
The previous implementation could easily get tricked if the LHS of an
assignment included a non-constant index that was "inside" another
dereference.  For example:

    mat4 m[2];
    m[0][i] = vec4(0.0);

Due to the way it tracked whether the array was being assigned, it
would think that the non-constant index was in an r-value.  The new
code fixes that by tracking l-values and r-values differently.  The
index is also replaced by cloning the IR and replacing the index
variable instead of the odd way it was done before.
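
For orientation, the effect of the lowering on a write such as `m[0][i] = vec4(0.0)` can be modeled in Python. This sketch uses a hypothetical helper and plain lists, and shows only the resulting semantics, not the IR transformation.

```python
def lower_variable_index_write(column_count, index, value, columns):
    """Write value to columns[index] using only constant-index stores."""
    for k in range(column_count):
        # Conditional assignment: the store to the constant index k only
        # takes effect when the runtime index equals k.
        if index == k:
            columns[k] = value
    return columns
```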

v2: Apply some simplifications suggested by Eric Anholt.  Making
assignment_generator::rvalue be ir_dereference instead of ir_rvalue
simplified the code a bit.

Fixes i965 piglit fs-temp-array-mat[234]-index-wr and
vs-varying-array-mat[234]-index-wr.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34691
Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick d2296e784a glsl: Split out part of variable_index_to_cond_assign_visitor::needs_lowering
Other code will soon need to know if an array needs lowering based
exclusively on the storage mode.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
Ian Romanick 8d5f3cef79 glsl: Move is_array_or_matrix outside visitor class
There's no reason for it to be there, and another class that may not
have access to the visitor will need it soon.

Reviewed-by: Eric Anholt <eric@anholt.net>
2011-07-23 01:24:18 -07:00
José Fonseca 5161aff48a gallivm: Add a note about log2 computation and denormalized numbers. 2011-07-22 18:52:09 -07:00
José Fonseca af82ff556c gallivm: Fix lp_build_exp2 order 4-5 polynomial coefficients and bump order.
Not sure how I computed these, but they were wrong (which explains why
bumping the polynomial order before never improved precision).

This allows the EXP test cases of the PSPrecision/VSPrecision DCTs to pass.
2011-07-22 18:52:09 -07:00
José Fonseca 47d6d44a23 gallivm: Increase lp_build_rsqrt() precision.
Add an iteration step, which makes rsqrt precision go from 12 bits to
24, and fixes the RSQ/NRM test case of the PSPrecision/VSPrecision DCTs.
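
The message does not say which iteration is used. Assuming it is the usual Newton-Raphson step for 1/sqrt(x), the doubling of precision can be checked with a short Python sketch (function names are hypothetical):

```python
import math


def refine_rsqrt(x, estimate):
    # One Newton-Raphson step for y = 1/sqrt(x): y' = y * (1.5 - 0.5*x*y*y).
    return estimate * (1.5 - 0.5 * x * estimate * estimate)


def relative_error(x, estimate):
    exact = 1.0 / math.sqrt(x)
    return abs(estimate - exact) / exact


x = 2.0
rough = (1.0 / math.sqrt(x)) * (1.0 + 2.0**-12)  # roughly 12 good bits
refined = refine_rsqrt(x, rough)
print(relative_error(x, rough), relative_error(x, refined))
```

Each step roughly squares the relative error, which is why a single extra iteration is enough to go from about 12 bits to about 24.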

There are no uses of this function outside shader translation.
2011-07-22 18:52:09 -07:00
José Fonseca ef1a2765a4 gallivm: Update minimax comments. 2011-07-22 18:52:09 -07:00
José Fonseca 1ac86e249e gallivm: Fix lp_build_exp/lp_build_log.
Never used so far -- we only used the base 2 variants -- which is why
it went unnoticed so far.
2011-07-22 18:52:09 -07:00
José Fonseca 0a1d49504d llvmpipe: Unit tests for arithmetic functions.
Conflicts:

	src/gallium/drivers/llvmpipe/SConscript
2011-07-22 18:52:08 -07:00
José Fonseca eb7590f677 util: Store alpha value too. 2011-07-22 18:52:08 -07:00
Vinson Lee edaadd94cb glsl: Add standalone_scaffolding.cpp to SConscript. 2011-07-22 10:38:05 -07:00
Paul Berry 659cdedb53 glsl: Add unit tests for lower_jumps.cpp
These tests invoke do_lower_jumps() in isolation (using the glsl_test
executable) and verify that it transforms the IR in the expected way.

The unit tests may be run from the top level directory using "make
check".

For reference, I've also checked in the Python script
create_test_cases.py, which was used to generate these tests.  It is
not necessary to run this script in order to run the tests.

Acked-by: Chad Versace <chad@chad-versace.us>
2011-07-22 09:45:11 -07:00
Paul Berry f1f76e157e glsl: Create a standalone executable for testing optimization passes.
This patch adds a new build artifact, glsl_test, which can be used for
testing optimization passes in isolation.

I'm hoping that we will be able to add other useful standalone tests
to this executable in the future.  Accordingly, it is built in a
modular fashion: the main() function uses its first argument to
determine which test function to invoke, removes that argument from
argv[], and then calls that function to interpret the rest of the
command line arguments and perform the test.  Currently the only test
function is "optpass", which tests optimization passes.
2011-07-22 09:45:11 -07:00
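The modular main() described in the glsl_test commit above could look roughly like the following sketch. It is not the code from the patch: dispatch, optpass_stub, and the tests table are hypothetical names, and the stub only counts its arguments instead of running an optimization pass.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

typedef int (*test_func)(int argc, char **argv);

/* Hypothetical stand-in for the real "optpass" test function.  It just
 * reports how many arguments it received after the program name. */
static int optpass_stub(int argc, char **argv)
{
    (void) argv;
    return argc - 1;
}

/* Table of available tests; new standalone tests would be added here. */
static const struct {
    const char *name;
    test_func func;
} tests[] = {
    { "optpass", optpass_stub },
};

/* Pick the test named by argv[1], remove that argument from argv[],
 * and let the test interpret the rest of the command line.
 * Returns -1 if no test name was given or the name is unknown. */
static int dispatch(int argc, char **argv)
{
    assert(argv != NULL);
    if (argc < 2) {
        fprintf(stderr, "usage: %s <test> [args...]\n",
                argc > 0 ? argv[0] : "glsl_test");
        return -1;
    }

    for (size_t i = 0; i < sizeof(tests) / sizeof(tests[0]); i++) {
        if (strcmp(argv[1], tests[i].name) != 0)
            continue;

        /* Drop argv[1] by shifting argv[2..argc-1] down one slot. */
        memmove(&argv[1], &argv[2], (size_t)(argc - 2) * sizeof(argv[0]));
        argv[argc - 1] = NULL;
        return tests[i].func(argc - 1, argv);
    }

    fprintf(stderr, "unknown test: %s\n", argv[1]);
    return -1;
}
```

A real main() would simply return dispatch(argc, argv). Keeping the table separate from the dispatch loop is what makes adding further test functions cheap.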
Paul Berry f129f618fe glsl: Move functions into standalone_scaffolding.cpp for later reuse.
This patch moves the following functions from main.cpp (the main cpp
file for the standalone executable that is used to create the built-in
functions) to standalone_scaffolding.cpp, so that they can be re-used
in other standalone executables:

- initialize_context()*
- _mesa_new_shader()
- _mesa_reference_shader()

*initialize_context contained some code that was specific to main.cpp,
so it was split into two functions: initialize_context() (which
remains in main.cpp), and initialize_context_from_defaults() (which is
in standalone_scaffolding.cpp).
2011-07-22 09:45:11 -07:00
Paul Berry 12c22cab77 mesa: Add an ifndef guard around the definition of the INLINE macro
Several Mesa headers redundantly define the INLINE macro.  Adding this
guard prevents the compiler from complaining about macro redefinition.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad@chad-versace.us>
2011-07-22 09:45:11 -07:00
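The guard in the INLINE commit above follows the usual pattern sketched below. The definitions are simplified assumptions (Mesa's real INLINE selection covers more compilers), and add_one is a hypothetical function used only to show the macro in use.

```c
/* Guarded definition: safe to repeat in several headers. */
#ifndef INLINE
#  if defined(_MSC_VER)
#    define INLINE __inline
#  else
#    define INLINE inline
#  endif
#endif

/* A second header repeating a guarded definition is now a no-op
 * instead of a macro redefinition. */
#ifndef INLINE
#  define INLINE inline
#endif

static INLINE int add_one(int x)
{
    return x + 1;
}
```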
Younes Manton a87afba505 Revert "g3dvl: Preserve previously rendered components for MC output."
This reverts commit b56daf71d2.

The bug is actually in softpipe's blend and writemask interaction.
2011-07-21 20:09:11 -04:00
Brian Paul 636d01bd61 Merge branch 'gallium-polygon-stipple' 2011-07-21 10:38:21 -06:00
Brian Paul 57aa597b3d softpipe: use the polygon stipple utility module
This is an alternative to the draw module's polygon stipple stage.
The softpipe implementation here is just a test.  The advantage of
using the new polygon stipple utility module (with other drivers)
is we can avoid software vertex processing in the draw module and
get much better performance.
Polygon stipple doesn't require special vertex processing like
the other draw module stages.
2011-07-21 10:32:15 -06:00
Brian Paul c534f11164 softpipe: implement fragment shader variants
We'll need shader variants to accommodate the new polygon stipple utility.
2011-07-21 09:57:37 -06:00
Brian Paul 3dde6be908 util: assorted updates to polygon stipple helper 2011-07-21 09:57:37 -06:00
Brian Paul 4736c0ba86 softpipe: use tgsi_shader_info fields for fragcoord origin, center, etc. 2011-07-21 09:57:37 -06:00
Brian Paul 2253906da3 tgsi: add info fields for fragcoord origin, center, etc 2011-07-21 09:57:33 -06:00
Brian Paul 9c1319d31d softpipe: remove obsolete comment 2011-07-21 09:55:22 -06:00
Brian Paul f16d97feaa softpipe: rename a function 2011-07-21 09:55:22 -06:00
Brian Paul ecc6a26a3d Merge branch 'remove-copyteximage-hook' 2011-07-21 08:46:02 -06:00
Chia-I Wu afc160e1c8 u_vbuf_mgr: restore buffer offsets
u_vbuf_upload_buffers modifies the buffer offsets.  If they are not
restored, and any of the vertex formats is not supported natively, the
next u_vbuf_mgr_draw_begin call will translate the vertex buffers with
incorrect buffer offsets.
2011-07-21 21:20:37 +08:00
Marek Olšák 000896c0bb mesa: GLES2 should return different error enums for invalid fbo queries
ES 2.0.25 page 127 says:

  If the value of FRAMEBUFFER_ATTACHMENT_OBJECT_TYPE is NONE, then
  querying any other pname will generate INVALID_ENUM.

See also:
b9e9df78a0

NOTE: This is a candidate for the 7.10 and 7.11 branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-07-21 14:16:43 +02:00
Christoph Bumiller ea316c5e06 nouveau: hook up video decoding with nouveau_context
This doesn't include nvfx since its context struct is not derived
from common nouveau_context (yet).
2011-07-21 10:39:41 +02:00
Vinson Lee 76bccaff0c glsl: Add ir_function_detect_recursion.cpp to SConscript. 2011-07-20 20:16:27 -07:00
Ian Romanick 02c5ae1b3f glsl: Reject shaders that contain static recursion
The GLSL 1.20 and later specs say:

    "Recursion is not allowed, not even statically. Static recursion is
    present if the static function call graph of the program contains
    cycles."

Recursion is detected and rejected both at compile-time and at
link-time.  The compile-time check happens to detect some cases that
may be removed by various optimization passes.  The spec doesn't seem
to allow this, but other vendors (e.g., NVIDIA) appear to only check
at link-time after all optimizations.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=33885
Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-20 18:20:59 -07:00
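The spec text quoted in the static recursion commit above defines recursion as a cycle in the static call graph. The sketch below shows one generic way to find such a cycle, a depth-first search with three visit states. It is an illustration only and is not claimed to be the algorithm used in ir_function_detect_recursion.cpp; struct call_graph, MAX_FUNCS, and the function names are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FUNCS 8

/* calls[a][b] is true when function a contains a static call to b. */
struct call_graph {
    int num_funcs;
    bool calls[MAX_FUNCS][MAX_FUNCS];
};

enum visit_state { UNVISITED, IN_PROGRESS, DONE };

/* Depth-first search from f.  Reaching a function that is still
 * IN_PROGRESS means we followed a back edge, i.e. found a cycle. */
static bool visit(const struct call_graph *g, int f, enum visit_state *state)
{
    state[f] = IN_PROGRESS;
    for (int callee = 0; callee < g->num_funcs; callee++) {
        if (!g->calls[f][callee])
            continue;
        if (state[callee] == IN_PROGRESS)
            return true;
        if (state[callee] == UNVISITED && visit(g, callee, state))
            return true;
    }
    state[f] = DONE;
    return false;
}

/* Returns true if the static call graph contains any cycle. */
static bool has_static_recursion(const struct call_graph *g)
{
    enum visit_state state[MAX_FUNCS] = { UNVISITED };

    assert(g->num_funcs >= 0 && g->num_funcs <= MAX_FUNCS);
    for (int f = 0; f < g->num_funcs; f++) {
        if (state[f] == UNVISITED && visit(g, f, state))
            return true;
    }
    return false;
}
```

A function that calls itself is caught as well, since calls[f][f] is seen while f is still IN_PROGRESS.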
Ian Romanick 1ad3ba4ad9 glsl: Make prototype_string publicly available
Also clarify the documentation for one of the parameters.

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-20 18:20:59 -07:00
Marek Olšák 2d960d3f4e g3dvl: remove unused vertex shader inputs
See also comments in the code.
2011-07-20 22:55:24 +02:00
Eric Anholt 3e5d36267d i965: Apply a homebrew workaround for GPU hang in OGLC api-texcoord.
The behavior of flushes in the hardware is a maze of twisty passages,
and strangely the VS constants appear to be loaded during a pipeline
flush instead of at the time of the packet emit according to the
simulator.  On moving the STATE_BASE_ADDRESS packet to where it really
needed to live (in order for data loads by other packets to be
correct), we sometimes no longer got a flush between those packets
where we apparently needed it.  This replicates the flushes implied by
a STATE_BASE_ADDRESS update, fixing the GPU hangs in OGLC and the
"engine" demo.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=36821
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39257
Tested-by: Keith Packard <keithp@keithp.com> (bzflag and etracer fixed)
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-20 11:42:53 -07:00
Eric Anholt 407785d0e9 i965: Enable the PIPE_CONTROL workaround workaround out of paranoia.
There's scary stuff going on in PIPE_CONTROL internals, and if the
BSpec says to do this to make PIPE_CONTROL work, I'll go ahead and do
it because we'll probably never be able to debug it after the fact.

v2: Use stall at scoreboard instead of depth stall, as noted by Ken.
2011-07-20 11:12:38 -07:00
Eric Anholt dc7422405f i965: Avoid kernel BUG_ON if we happen to wait on the pipe_control w/a BO.
For this and occlusion queries, we're trying to avoid setting
I915_GEM_DOMAIN_RENDER for the write domain, because the data written
is definitely not going through the render cache, but we do need to
tell the kernel that the object has been written.  However, with using
I915_GEM_DOMAIN_GTT, the kernel on retiring the batchbuffer sees that
the w/a BO has a write domain of GTT, and puts it on the flushing
list.  If something tries to wait for that BO to finish rendering
(such as the AUB dumper reading the contents of BOs), we get into
wait_request (since obj->active) but with a 0 seqno (since the object
is on the flushing list, not actually on a ringbuffer), and BUG_ONs.

To avoid the kernel bug (which I'm hoping to delete soon anyway), just
use I915_GEM_DOMAIN_INSTRUCTION like occlusion queries do.  This
doesn't result in more flushing, because we invalidate INSTRUCTION on
every batchbuffer now that we're state streaming, anyway.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-20 11:10:00 -07:00
Eric Anholt 540e66b3be intel: Use the GLSL-based meta clear when available.
Improves firefox-talos-gfx performance under GL when 3D clears are
enabled:
[  0]       gl-before     firefox-talos-gfx   20.193   20.251   0.27%    3/3
[  0]       gl-after      firefox-talos-gfx   18.013   18.040   0.19%    3/3
2011-07-20 11:03:26 -07:00
Eric Anholt eee570290a meta: Add a GLSL-based _mesa_meta_Clear() variant.
This cuts out a large portion of the overhead of glClear() from
resetting the texenv state and recomputing the fixed function
programs.  It also means less use of fixed function internally in our
GLES2 drivers, which is rather bogus.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-07-20 11:03:20 -07:00
Younes Manton b56daf71d2 g3dvl: Preserve previously rendered components for MC output.
Fixes xvmc-softpipe MC entrypoint, amongst others.
2011-07-20 13:52:45 -04:00
Younes Manton 8082816e27 g3dvl: Init/clean pipe fully when a shader-based decoder isn't used.
Fixes VDPAU CSC-only mode.
2011-07-20 13:52:45 -04:00
Kenneth Graunke 3875526926 glsl: Avoid massive ralloc_strndup overhead in S-Expression parsing.
When parsing S-Expressions, we need to store nul-terminated strings for
Symbol nodes.  Prior to this patch, we called ralloc_strndup each time
we constructed a new s_symbol.  It turns out that this is obscenely
expensive.

Instead, copy the whole buffer before parsing and overwrite it to
contain \0 bytes at the appropriate locations.  Since atoms are
separated by whitespace, (), or ;, we can safely overwrite the character
after a Symbol.  While much of the buffer may be unused, copying the
whole buffer is simple and guaranteed to provide enough space.

Prior to this, running piglit-run.py -t glsl tests/quick.tests with GLSL
1.30 enabled took just over 10 minutes on my machine.  Now it takes 5.

NOTE: This is a candidate for stable release branches (because it will
      make running comparison tests so much less irritating.)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-20 10:42:43 -07:00
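The buffer-copy trick in the S-Expression commit above can be sketched as follows. This is an assumption-laden illustration, not the parser from the patch: collect_symbols is a hypothetical name, every atom is treated as a symbol, and ';' is treated as a plain delimiter rather than the start of a comment. The caller passes the original text in src and a same-length copy in copy; the structure is read from src, while the copy receives the \0 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Scan src for atoms.  For each one, terminate it inside copy by
 * overwriting the character that follows it, and record a pointer to
 * its start in out[].  No per-symbol allocation is needed.
 * Returns the number of atoms stored (at most max_out). */
static size_t collect_symbols(const char *src, char *copy,
                              const char **out, size_t max_out)
{
    static const char delims[] = " \t\r\n();";
    size_t n = 0;
    size_t pos = 0;

    assert(src != NULL && copy != NULL && out != NULL);

    while (src[pos] != '\0' && n < max_out) {
        pos += strspn(src + pos, delims);         /* skip delimiters */
        size_t len = strcspn(src + pos, delims);  /* length of next atom */
        if (len == 0)
            break;                                /* only the end is left */

        copy[pos + len] = '\0';   /* the delimiter after the atom is free */
        out[n++] = copy + pos;
        pos += len;
    }
    return n;
}
```

Because the character after an atom is always a delimiter or the terminating \0, overwriting it in the copy never clobbers another atom, and src stays intact for the structural parse.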
Brian Paul 4470ff2ebf glsl: silence warning in linker.cpp 2011-07-19 21:10:25 -06:00
Brian Paul d5e3239776 st/mesa: get rid of redundant clipping code in st_copy_texsubimage() 2011-07-19 20:03:05 -06:00
Brian Paul 1c1fc62e38 mesa: remove unused dd_function_table::CopyTexImage1D/2D() hooks 2011-07-19 20:03:05 -06:00
Brian Paul 774311fb54 meta: remove _mesa_meta_CopyTexImage1D/2D() 2011-07-19 20:03:05 -06:00
Brian Paul 0823ef84a5 st/mesa: remove st_CopyTexImage1D/2D() 2011-07-19 20:03:05 -06:00
Brian Paul 9ed87c4463 radeon: remove radeonCopyTexImage2D() 2011-07-19 20:03:05 -06:00
Brian Paul fbe6836043 intel: remove intelCopyTexImage1D/2D() 2011-07-19 20:03:05 -06:00
Brian Paul 1da28fa959 mesa: remove comments referring to Driver.TexImage1D/2D 2011-07-19 20:03:05 -06:00
Brian Paul 5874890c26 mesa: stop using ctx->Driver.CopyTexImage1D/2D() hooks 2011-07-19 20:03:05 -06:00
Jørgen Lind 496bf3822a Make it possible to use gbm with c++
NOTE: This is a candidate for 7.11
2011-07-19 16:30:07 -07:00
Fredrik Höglund d84791a72b st/mesa: fix the texture format in st_context_teximage
Commit 1a339b6c71 made
st_ChooseTextureFormat map GL_RGBA with type GL_UNSIGNED_BYTE
to PIPE_FORMAT_A8B8G8R8_UNORM.

The image format for ARGB pixmaps is PIPE_FORMAT_B8G8R8A8_UNORM
however. This mismatch caused the texture to be recreated in
st_finalize_texture.

NOTE: This is a candidate for the 7.11 branch.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=39209
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Stéphane Marchesin <marcheu@chromium.org>
Signed-off-by: Brian Paul <brianp@vmware.com>
2011-07-19 15:28:43 -06:00
Brian Paul f0e306c343 mesa: update, shorten some comments in dd.h 2011-07-19 15:28:43 -06:00
Henri Verbeet 0f20e2e18f glx: Avoid calling __glXInitialize() in driReleaseDrawables().
This fixes a regression introduced by commit
a26121f375 (fd.o bug #39219).

Since the __glXInitialize() call should be unnecessary anyway, this is
probably a nicer fix for the original problem too.

NOTE: This is a candidate for the 7.10 and 7.11 branches.

Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Tested-by: padfoot@exemail.com.au
2011-07-19 23:27:46 +02:00
Chad Versace f7dbcba280 intel: Fix stencil buffer to be W tiled
Until now, the stencil buffer was allocated as a Y tiled buffer, because
in several locations the PRM states that it is. However, it is actually
W tiled. From the PRM, 2011 Sandy Bridge, Volume 1, Part 2, Section
4.5.2.1 W-Major Format:
    W-Major Tile Format is used for separate stencil.

The GTT is incapable of W fencing, so we allocate the stencil buffer with
I915_TILING_NONE and decode the tile's layout in software.

This fix touches the following portions of code:
    - In intel_allocate_renderbuffer_storage(), allocate the stencil
      buffer with I915_TILING_NONE.
    - In intel_verify_dri2_has_hiz(), verify that the stencil buffer is
      not tiled.
    - In the stencil buffer's span functions, the tile's layout must be
      decoded in software.

This commit mutually depends on the xf86-video-intel commit
    dri: Do not tile stencil buffer
    Author: Chad Versace <chad@chad-versace.us>
    Date:   Mon Jul 18 00:38:00 2011 -0700

On Gen6 with separate stencil enabled, fixes the following Piglit tests:
    bugs/fdo23670-drawpix_stencil
    general/stencil-drawpixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-copypixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-drawpixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX16-readpixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-copypixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-drawpixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX1-readpixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-copypixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-drawpixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX4-readpixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-copypixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-drawpixels
    spec/EXT_framebuffer_object/fbo-stencil-GL_STENCIL_INDEX8-readpixels
    spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-copypixels
    spec/EXT_packed_depth_stencil/fbo-stencil-GL_DEPTH24_STENCIL8-readpixels
    spec/EXT_packed_depth_stencil/readpixels-24_8

Note: This is a candidate for the 7.11 branch.

Signed-off-by: Chad Versace <chad@chad-versace.us>
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-19 13:12:19 -07:00
Eric Anholt fb5ff51f42 i965: Fix regression in 29a911c50e.
The previous define was the full 32-bit header, while the new define
was just the top 16 bits.
2011-07-19 12:20:14 -07:00
Brian Paul b38c26f19f llvmpipe: include LLVM version number in name string 2011-07-19 08:42:46 -06:00
Tobias Droste 3143e95353 llvmpipe: fix build with LLVM 3.0svn
LLVM 3.0svn introduced a new type system. It defines a new way to create
named structs and removes the (now not needed) LLVMInvalidateStructLayout
function.  See revision 134829 of LLVM.

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Signed-off-by: Brian Paul <brianp@vmware.com>
2011-07-19 08:23:28 -06:00
Marek Olšák 8c47a5da9f xvmc-softpipe: remove LLVM_LIBS
This is added conditionally in Makefile.xvmc.

Spotted by Chris Rankin.
2011-07-18 23:41:45 +02:00
Kenneth Graunke 348bdaa529 i965: Rename CMD_VF_STATISTICS_(965|GM45) to include "3DSTATE".
Including the full "3DSTATE_VF_STATISTICS" should make it easier to
cross-reference the code and documentation.

Also, move the 965/GM45 suffix to the beginning for consistency with
newer #defines.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-18 14:14:36 -07:00
Kenneth Graunke 797522f1c9 i965: Rename CMD_VERTEX_(BUFFER|ELEMENT) to 3DSTATE_VERTEX_...S.
This makes our code use the same names as the documentation.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-18 14:14:36 -07:00
Kenneth Graunke 29a911c50e i965: Rename 3DSTATE_DRAWRECT_INFO_I965 to 3DSTATE_DRAWING_RECTANGLE.
The documentation uses 3DSTATE_DRAWING_RECTANGLE, and we already had it
defined in brw_defines.h; we were simply using an old #define from
intel_reg.h.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2011-07-18 14:14:36 -07:00
Eric Anholt cb5e0ba2aa i915: Simplify intel_wpos_* with a helper function. 2011-07-18 11:26:34 -07:00
Eric Anholt fceda4342c i915: Include gl_FragCoord.w data, not just xyz.
Fixes piglit fragcoord_w test.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34323
2011-07-18 11:26:33 -07:00
Eric Anholt af9548d335 i915: Add support for HW rendering with no color draw buffer.
This is useful for shadow map generation.  Tested with glsl-bug-22603,
which rendered the depth textures with fallbacks before.

Acked-by: Chad Versace <chad@chad-versace.us>
2011-07-18 11:26:33 -07:00
Eric Anholt debf751aea i915: Fix incorrect depth scaling when enabling/disabling depth buffers.
We were updating our new viewport using the old buffers' _WindowMap.m.
We can do less math and avoid using that deprecated matrix by just
folding the viewport calculation right in to the driver.

Fixes piglit fbo-depthtex.
2011-07-18 11:26:33 -07:00
Eric Anholt 79fee3a76b i915: Make stencil test for no-stencil handling match depth test.
i915_update_draw_buffers() already handles the fallback bit for
missing stencil region, so here we just need to handle whether the GL
thinks we have stencil data or not (and disable the test if so).
2011-07-18 11:26:33 -07:00
Eric Anholt fc4fba52cf i915: Disable the depth test whenever we don't have a depth buffer.
We were disabling it once at the moment we changed draw buffers, but
later enabling of depth test could turn it back on.  Fixes
fbo-nodepth-test.

Note that ctx->DrawBuffer has to be checked because during context
create we get called while it's still unset.  However, we know we'll
get an intel_draw_buffer() after that, so it's safe to make a silly
choice at this point.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=30080
2011-07-18 11:26:33 -07:00
Eric Anholt 4c47fce92e i915: Remove i965 paths from i915_update_drawbuffer() and i830's too.
Reviewed-by: Chad Versace <chad@chad-versace.us>
2011-07-18 11:26:33 -07:00
Eric Anholt 94efc350b4 i965: Remove i915 paths from brw_update_draw_buffers().
Reviewed-by: Chad Versace <chad@chad-versace.us>
2011-07-18 11:26:33 -07:00
Eric Anholt c68270a26b i965: Remove unused region calculations in brw_update_draw_buffer().
Reviewed-by: Chad Versace <chad@chad-versace.us>
2011-07-18 11:26:33 -07:00
Eric Anholt 15af0f54b8 i965: Remove empty brw_set_draw_region.
Reviewed-by: Chad Versace <chad@chad-versace.us>
2011-07-18 11:26:33 -07:00
Eric Anholt dd898c3e89 i965: Remove FALLBACK() from brw_update_draw_region().
The 965 driver doesn't use these for deciding on fallbacks.

Reviewed-by: Chad Versace <chad@chad-versace.us>
2011-07-18 11:26:33 -07:00
Eric Anholt f34ec6169d intel: Move intel_draw_buffers() code into each driver.
The illusion of shared code here wasn't fooling anybody.  It was
tempting to keep i830 and i915 still shared, but I think I actually
want to make them diverge shortly.

Reviewed-by: Chad Versace <chad@chad-versace.us>
2011-07-18 11:26:33 -07:00
Eric Anholt 8cf2741d2b intel: Clarify the depthRb == stencilRb logic.
Reviewed-by: Chad Versace <chad@chad-versace.us>
2011-07-18 11:26:33 -07:00
Eric Anholt 96cdbf4340 intel: Use the post-execution batchbuffer contents for dumping.
We were missing out on all the relocation changes by dumping what we
subdata()ed in instead of what's there after the kernel finished with
it.
2011-07-18 11:26:33 -07:00
Paul Berry f07221056e glsl: Ensure that sampler declarations are always uniform or "in" parameters.
This brings us into compliance with page 17 (page 22 of the PDF) of
the GLSL 1.20 spec:

    "[Sampler types] can only be declared as function parameters or
    uniform variables (see Section 4.3.5 "Uniform"). ... [Samplers]
    cannot be used as out or inout function parameters."

The spec isn't explicit about whether this rule applies to
structs/arrays containing samplers, but the intent seems to be to
ensure that it can always be determined at compile time which sampler
is being used in each texture lookup.  So to avoid creating a
loophole, the rule needs to apply to structs/arrays containing samplers
as well.

Fixes piglit tests spec/glsl-1.10/compiler/samplers/*.frag, and fixes
bug 38987.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38987
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2011-07-18 10:48:27 -07:00
Paul Berry ddc1c96390 glsl: Move type_contains_sampler() into glsl_type for later reuse.
The new location, as a member function of glsl_type, is more
consistent with queries like is_sampler(), is_boolean(), is_float(),
etc.  Placing the function inside glsl_type also makes it available to
any code that uses glsl_types.
2011-07-18 10:48:27 -07:00
Vadim Girlin 9b3ec69cf4 r600g: fix corner case checks for the queries 2011-07-18 08:53:47 -04:00
Henri Verbeet 3093cbaad9 r600g: Get rid of leftover PB_USAGE_* flags.
These happen to work because their values are the same as the equivalent
PIPE_TRANSFER_* flags, but it's still misleading.

Signed-off-by: Henri Verbeet <hverbeet@gmail.com>
2011-07-18 01:36:07 +02:00
Ian Romanick 66f4ac988d linker: Only over-ride built-ins when a prototype has been seen
The GLSL spec says:

    "If a built-in function is redeclared in a shader (i.e., a
    prototype is visible) before a call to it, then the linker will
    only attempt to resolve that call within the set of shaders that
    are linked with it."

This patch enforces this behavior.  When a function call is processed
a flag is set in the ir_call to indicate whether the previously seen
prototype is the built-in or not.  At link time a call will only bind
to an instance of a function that matches the "want built-in" setting
in the ir_call.

This has the odd side effect that the first call to abs() in the shader
below will call the built-in and the second will not:

float foo(float x) { return abs(x); }
float abs(float x) { return -x; }
float bar(float x) { return abs(x); }

This seems insane, but it matches what the spec says.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=31744
2011-07-17 13:02:49 -07:00
Jeremy Huddleston 7eed3d4808 darwin: Include glxhash.c in libGL on darwin
Fixes a build regression introduced by 4df137691e

Signed-off-by: Jeremy Huddleston <jeremyhu@apple.com>
2011-07-16 22:02:55 -07:00
Emil Velikov 55b415ff77 xvmc-nouveau: Resolve build
The following resolves the build issues and missing symbols:
- Add "xvmc-nouveau/target.c" - missing symbol "driver_description"
- Add "drivers/nvc0/libnvc0.a" - missing symbol "nvc0_screen_create"
- Remove "drivers/softpipe/libsoftpipe.a" - unnecessary dependency;
  resolves build (when building without swrast)
- Add "drivers/trace/libtrace.a" in Makefile

Note: With or without these patches, xvmc-nouveau still segfaults.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
2011-07-16 11:21:48 -04:00
Christoph Bumiller 56503fd138 nv50: fix bogus error message about 3d surfaces 2011-07-16 13:00:52 +02:00
Vinson Lee 9228bfb375 gallivm: Rename createAsmInfo to createMCAsmInfo with llvm-3.0.
llvm-3.0svn r135219 renamed createAsmInfo to createMCAsmInfo in
include/llvm/Target/TargetRegistry.h.
2011-07-16 00:17:46 -07:00
Marek Olšák 7854b8cb16 xmlconfig: remove an unused-but-set variable
I hate gcc 4.6 already.
2011-07-15 21:48:29 +02:00
Marek Olšák 036fb07908 r600g: print to stderr that a CS has been rejected by the kernel
Just fixing the warning that r is unused.
2011-07-15 21:48:29 +02:00
Marek Olšák dade65505b prog_optimize: fix a warning that a variable may be uninitialized 2011-07-15 21:48:28 +02:00
Marek Olšák ed5e95ada6 r300/compiler: remove an unused-but-set variable and simplify the code 2011-07-15 21:48:28 +02:00
Marek Olšák 2ce6c3ea6e r300/compiler: fix a warning that a variable may be uninitialized 2011-07-15 21:48:28 +02:00
Marek Olšák 2f02c2fe56 st/mesa: remove unused-but-set variables in st_program.c 2011-07-15 21:48:28 +02:00
Marek Olšák 3032d064fb swrast: remove an unused-but-set variable 2011-07-15 21:48:28 +02:00
Marek Olšák eca3152de0 mesa: fix unused-but-set-variable warnings in dlist.c 2011-07-15 21:48:28 +02:00
Vadim Girlin ef29bfee03 r600g: fix queries and predication
Use all zpass data for predication instead of the last block only.
Use query buffer as a ring instead of reusing the same area
for each new BeginQuery. All query buffer offsets are in bytes
to simplify offsets math.
2011-07-15 15:42:46 -04:00
Marc Pignat cfec000e75 drisw: Fix 24bpp software rendering, take 2
This patch adds support for 24bpp in the dri/swrast implementation.
See http://bugs.freedesktop.org/show_bug.cgi?id=23525

Signed-off-by: Marc Pignat <marc at pignat.org>
Signed-off-by: Brian Paul <brianp@vmware.com>
2011-07-15 10:09:14 -06:00
Christian König 0d082390d9 g3dvl: no need for flushing inside the compositor any more
Move that also inside the state tracker where needed.
2011-07-15 17:54:06 +02:00
Christian König 2cbf532ae1 g3dvl: correctly distinguish dst area and clip area in the compositor
Otherwise xine won't scale correctly.
2011-07-15 17:36:02 +02:00
Alex Deucher a3d23a4868 r600c/g: add new NI pci ids
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2011-07-15 10:55:02 -04:00
Christian König 1cf06218e4 g3dvl: link r300 and r600 targets with libdrm instead of libdrm_radeon 2011-07-15 10:45:31 +02:00
Marek Olšák a2381665d5 gallium/targets: link vdpau, va, and xvmc with LLVM libs when requested
Signed-off-by: Christian König <deathsimple@vodafone.de>
2011-07-15 10:31:07 +02:00
Christian König 13da00f07c g3dvl: change picture parameter of decode_bitstream to general version
Using pipe_mpeg12_picture_desc was unintentional here.
2011-07-15 10:22:51 +02:00
José Fonseca 9a7f84d6b2 Squashed commit of the following:
commit 1856230d9fa61710cce3e152b8d88b1269611a73
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date:   Tue Jul 12 23:41:27 2011 +0100

    make: Use better var names on packaging.

commit d1ae72d0bd14e820ecfe9f8f27b316f9566ceb0c
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date:   Tue Jul 12 23:38:21 2011 +0100

    make: Apply several of Dan Nicholson's suggestions.

commit f27cf8743ac9cbf4c0ad66aff0cd3f97efde97e4
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date:   Sat Jul 9 14:18:20 2011 +0100

    make: Put back the tar.bz2 creation rule.

    Removed by accident.

commit 34983337f9d7db984e9f0117808274106d262110
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date:   Sat Jul 9 11:59:29 2011 +0100

    make: Determine tarballs contents via git ls-files.

    The wildcards were a mess:
    - lots of files for non Linux platforms missing
    - several files listed and archived twice

    Using git-ls-files ensures things are not lost when making the tarballs.

commit 34a28ccbf459ed5710aafba5e7149e8291cb808c
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date:   Sat Jul 9 11:07:14 2011 +0100

    glut: Remove GLUT source.

    Most distros ship freeglut, and most people don't care one vs the other,
    and it hasn't been really maintained.

    So it is better to have Mesa GLUT be revisioned and built separately
    from Mesa.

commit 5c26a2c3c0c7e95ef853e19d12d75c4f80137e7d
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date:   Sat Jul 9 10:31:02 2011 +0100

    Ignore the tarballs.

commit 26edecac589819f0d0efe2165ab748dbc4e53394
Author: José Fonseca <jose.r.fonseca@gmail.com>
Date:   Sat Jul 9 10:30:24 2011 +0100

    make: Create the Mesa-xxx-devel symlink automatically.

    Also actually remove the intermediate uncompressed tarballs.
2011-07-14 17:35:05 +01:00
Dave Airlie b6df603e65 vbo: minor optimisation in vbo_exec_DrawRangeElements
This moves getting the context into the debug code in this function.

Just spotted it while trawling callgrind traces for other things.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-07-14 15:22:58 +01:00
Brian Paul e5f7e09210 gallium: don't use enum bitfields in p_video_state.h
Silences many warnings about "type of bit-field ‘field_select’ is a
GCC extension".

Since the field sizes were 8 and 16 bits, just use basic types.
2011-07-14 08:14:14 -06:00
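The change in the enum bitfields commit above amounts to storing the enum value in a fixed-width integer field. The sketch below illustrates the idea under stated assumptions: the enum, its values, the struct, and the accessor names are all hypothetical and do not reproduce p_video_state.h.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum; the real values live in the gallium video headers. */
enum field_select_sketch {
    FIELD_SELECT_TOP = 0,
    FIELD_SELECT_BOTTOM = 1
};

/* Before: "enum field_select_sketch field_select : 8;" inside the struct,
 * which GCC reports as an extension when warnings are pedantic.
 * After: a plain 8-bit field that holds the same values. */
struct motion_vector_sketch {
    uint8_t field_select;   /* stores an enum field_select_sketch value */
    uint16_t weight;
};

static void set_field_select(struct motion_vector_sketch *mv,
                             enum field_select_sketch sel)
{
    assert((unsigned) sel <= UINT8_MAX);   /* must fit the 8-bit field */
    mv->field_select = (uint8_t) sel;
}

static enum field_select_sketch
get_field_select(const struct motion_vector_sketch *mv)
{
    return (enum field_select_sketch) mv->field_select;
}
```

The trade-off is that the compiler no longer checks the field's type, so the casts at the access points carry that responsibility.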
Brian Paul a5a9422561 gallium: put video-related enums in separate header
The forward references to video enum types in p_context.h cause
a massive number of compiler warnings (ISO C forbids forward references
to ‘enum’ types).

By putting the new video enums in a separate header that can be included
by p_context.h and p_screen.h we can avoid this.

Acked-by: Christian König <deathsimple@vodafone.de>
2011-07-14 08:14:14 -06:00
Brian Paul 9726947b68 i915g: move declaration before code 2011-07-14 08:14:13 -06:00
Brian Paul db0f2b3637 mesa: use inline function wrapper for _mesa_reference_texobj() 2011-07-14 08:14:13 -06:00
Brian Paul 74142f1bf2 mesa: use inline function wrapper for _mesa_reference_renderbuffer() 2011-07-14 08:14:13 -06:00
Brian Paul 5db7723ada mesa: use inline function wrapper for _mesa_reference_framebuffer() 2011-07-14 08:14:08 -06:00
Brian Paul 6214963c00 main: use inline function wrapper for _mesa_reference_buffer_object() 2011-07-14 08:09:38 -06:00
Dave Airlie 323e4bff79 mesa: split _mesa_reference_program() into hot/cold paths.
Inline the hot path, where the reference remains the same. This shouldn't
penalise the slow path at all but improves the hot path, since we don't have
to jump to the function.

It also moves some assert checks under an #ifndef NDEBUG.

Minor clean-ups added by Brian.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Brian Paul <brianp@vmware.com>
2011-07-14 08:09:38 -06:00
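The hot/cold split described in the _mesa_reference_program() commit above can be sketched as below. The types and names (program_sketch, reference_program, reference_program_slow, slow_path_calls) are hypothetical, and the cold path omits locking and freeing, which the real code would need.

```c
#include <assert.h>

struct program_sketch {
    int ref_count;
};

/* Counts cold-path entries, only so the sketch can be observed. */
static int slow_path_calls;

/* Cold path: kept out of line, does the actual reference counting. */
static void reference_program_slow(struct program_sketch **ptr,
                                   struct program_sketch *prog)
{
    assert(ptr != 0);
    slow_path_calls++;
    if (*ptr)
        (*ptr)->ref_count--;   /* real code would delete at zero */
    if (prog)
        prog->ref_count++;
    *ptr = prog;
}

/* Hot path: when the pointer already refers to prog there is nothing
 * to do, so the inline wrapper returns without a function call. */
static inline void reference_program(struct program_sketch **ptr,
                                     struct program_sketch *prog)
{
    if (*ptr != prog)
        reference_program_slow(ptr, prog);
}
```

Rebinding the same program repeatedly, which is the common case during state validation, then costs only a pointer comparison.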
Christoph Bumiller 7e2827fad9 nv50,nvc0: extensive surface format renaming to get consistency
Now the component ordering is consistent and matches gallium again.
2011-07-14 12:51:06 +02:00
Christoph Bumiller b2dcf880e8 nv50,nvc0: add support for multi-sample resources 2011-07-14 12:51:06 +02:00
Christoph Bumiller c011f94b7b nv50,nvc0: add correct storage type for Z32_FLOAT 2011-07-14 12:51:06 +02:00
Christoph Bumiller cad17554c4 nv50,nvc0: unify nvc0_miptree and nv50_miptree structs
Share some functions and restructure miptree creation a little.
Prepare for multi-sample resources.
2011-07-14 12:51:06 +02:00
Christoph Bumiller ebeec1d43a nv50,nvc0: don't advertise unaligned texture format support
Because we don't support them.
For instance, R32G32B32 is not R32G32B32X32 as was assumed.

Add support for R8G8B8X8_UNORM instead of R8G8B8_UNORM surfaces.
2011-07-14 12:51:06 +02:00
Vinson Lee 3cf22a0c6e g3dvl: Remove non-constant expression array initializers.
The array initializer must be a constant expression in MSVC.
2011-07-13 21:57:50 -07:00
Marek Olšák 67aba799bc gallium/targets: do not link every driver with libllvmpipe.a
Only some targets need that, the others don't.
2011-07-14 03:03:26 +02:00
Marek Olšák 5fe54df58f Rename swrastg_dri to swrast_dri
I prefer it this way and it has been suggested earlier by others too.
Opinions?
2011-07-14 03:03:26 +02:00
Brian Paul b82db9a3c0 softpipe: fix various warnings about int/float/double conversions, etc 2011-07-13 18:54:31 -06:00
Vinson Lee f292d07b47 g3dvl: Remove designated initializers.
MSVC does not support designated initializers.
2011-07-13 17:00:26 -07:00
Vinson Lee 49967950a5 g3dvl: s/inline/INLINE/
The inline keyword is not available in MSVC C.
2011-07-13 15:59:08 -07:00