Commit Graph

53833 Commits

Author SHA1 Message Date
Marek Olšák a84a8da4f8 configure.ac: print LLVM flags
to see what we're mixing with ours

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2012-11-29 00:07:27 +01:00
Brian Paul 0904973e39 util: add more memory debugging features
Add a DEBUG_FREED_MEMORY option to help catch use-after-free errors.
Add debug_memory_check() function which can be periodically called to
check that all known blocks are good.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-28 15:03:29 -07:00
José Fonseca 1cead8845b llvmpipe: Implement logic ops for the AoS path.
It was forgotten in the previous patch series, but it is trivial to
implement, based on the SoA path.

This fixes glean logicOp failures.
2012-11-28 20:45:18 +00:00
José Fonseca 547efc76df llvmpipe: Don't use dynamically sized arrays.
Unfortunately for MSVC arrays with a constant variable size are still
considered dynamically sized.
2012-11-28 19:58:47 +00:00
Eric Anholt c8ed9f6262 i965/gen4-5: Fix segfaults with stencil-only depth/stencil setups.
Fixes a ton of piglit regressions since the depthstencil fixes for gen6+.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57309
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-28 11:26:41 -08:00
Eric Anholt b9b033d8e4 i965/fs: Don't generate saturates over existing variable values.
Fixes a crash in http://workshop.chromeexperiments.com/stars/ on i965,
and the new piglit test glsl-fs-clamp-5.
We were trying to emit a saturating move into a uniform, which the code
generator appropriately choked on.  This was broken in the change in
32ae8d3b32.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57166
NOTE: This is a candidate for the 9.0 branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-28 11:26:34 -08:00
Eric Anholt 154ef07aa7 i965/fs: Add some minimal backend-IR dumping.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-28 11:26:33 -08:00
James Benton 960ab06da0 llvmpipe: Update llvmpipe_is_format_unswizzled to reflect latest changes.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton 66fdf626bb llvmpipe: Enable vertex color clamping.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton fa1b481c09 llvmpipe: Unswizzled rendering.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton 1d3789bccb gallivm: Updated lp_build_const_mask_aos to input number of channels.
Also updated lp_build_const_mask_aos_swizzled to reflect this.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton d03d29a044 util: Updated util_format_is_array to be more accurate.
Will allow formats with padding, e.g. RGBX.
Will now allow swizzled formats as long as the alpha is channel 3.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton e66ec7c46b gallivm: Added support for float to half-float conversion in lp_build_conv.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton d7a8390a82 gallivm: Changed lp_build_pad_vector to correctly handle scalar argument.
Removed the lp_type argument as it was unnecessary.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton 71c6fe76c0 gallivm: Add a function to generate lp_type for a format.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:36 +00:00
James Benton cd548836a1 gallivm: Add support for unorm16 in lp_build_mul.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-11-28 19:14:20 +00:00
Matt Turner c3a465ae98 glcpp: Support #elif(expression) with no intervening space.
And add test cases to ensure that this works
	- 110 verifies that glcpp rejects #elif<digits> which glcpp
	  previously accepted.
	- 111 verifies that glcpp accepts #if followed immediately by
	  (, +, -, !, or ~.
	- 112 does the same as 111 but for #elif.

See 17f9beb6 for #if change.
Reviewed-by: Carl Worth <cworth@cworth.org>
2012-11-28 10:27:02 -08:00
Matt Turner aed466192a glcpp: Reject #version and #line not followed by whitespace
Fixes part of es3conform's preprocess16_frag test.
Reviewed-by: Carl Worth <cworth@cworth.org>
2012-11-28 10:26:53 -08:00
Marek Olšák 91ca053714 mesa: fix BlitFramebuffer between linear and sRGB formats
NOTE: This is a candidate for the stable branches.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-28 18:48:22 +01:00
Roland Scheidegger 406b76ca32 gallivm: fix multiple lods with different min/mag filter and wide vectors
broken since 529fe420ba,
I forgot some code, only added the comment...
Fixes bug 57644.
2012-11-28 18:07:27 +01:00
Michel Dänzer 6e33b55ee1 radeonsi: Reinstate assertions against invalid colour/depth formats.
radeonsi now supports Z16 and doesn't fail these assertions anymore.

This partially reverts commit 7bba4879bb, but
leaves the error messages in place to allow diagnosing such problems even with
non-debugging builds.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-11-28 15:48:50 +01:00
Michel Dänzer a8d46d0173 radeonsi: Re-enable Z16 depth buffers.
8 more piglits.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
2012-11-28 13:53:54 +01:00
Marek Olšák 726fe54cbc radeonsi: remove redundant parameter in r600_init_surface
[ Cherry-picked from r600g commit f5ac60152b ]
2012-11-28 13:35:17 +01:00
Michel Dänzer fa83d52961 radeonsi: Use explicit stencil mipmap level offsets.
Extracted from r600g commit 428e37c2da.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:17 +01:00
Marek Olšák 39b56afaa2 radeonsi: correct texture memory size for Z32F_S8X24
[ Cherry-picked from r600g commit ea72351a91 ]
2012-11-28 13:35:17 +01:00
Michel Dänzer 20f651d003 radeonsi: Depth/stencil fixes.
Adapted from r600g commit 018e3f75d6.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:17 +01:00
Michel Dänzer 1a616c1009 radeonsi: Flesh out support for depth/stencil exports from the pixel shader.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:16 +01:00
Michel Dänzer 49003a5cb6 radeonsi: Fix sampler views for depth textures.
Consistently reference the flushed depth texture in the sampler view, not the
original one.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:16 +01:00
Jerome Glisse 3c024624fd radeonsi: Fix z/stencil texture creation.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>

[ Cherry-picked from r600g commit b4f0ab0b22 ]
2012-11-28 13:35:16 +01:00
Vinson Lee ffc318a97a scons: Build ws_xlib on Mac OS X.
Fixes this SCons build error on Mac OS X if X11 is found.

NameError: name 'ws_xlib' is not defined:
  File "SConstruct", line 144:
    duplicate = 0 # http://www.scons.org/doc/0.97/HTML/scons-user/x2261.html
  File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
    return method(*args, **kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
    return _SConscript(self.fs, *files, **subst_kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
    exec _file_ in call_stack[-1].globals
  File "src/SConscript", line 34:
    SConscript('gallium/SConscript')
  File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
    return method(*args, **kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
    return _SConscript(self.fs, *files, **subst_kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
    exec _file_ in call_stack[-1].globals
  File "src/gallium/SConscript", line 135:
    'targets/libgl-xlib/SConscript',
  File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
    return method(*args, **kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
    return _SConscript(self.fs, *files, **subst_kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
    exec _file_ in call_stack[-1].globals
  File "src/gallium/targets/graw-xlib/SConscript", line 9:
    ws_xlib,

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 23:13:57 -08:00
Johannes Obermayr 53636fdf93 configure.ac: Remove -O., -g and -Wall from LLVM_C{PP,XX}FLAGS.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-11-28 00:19:17 +01:00
Brian Paul f75acabb96 vbo: move another line of code after declarations
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-11-27 15:34:56 -07:00
Brian Paul 8765c0d20f vbo: move code after declarations to fix MSVC errors
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-27 14:28:56 -07:00
Brian Paul f94e672b47 vbo: minor whitespace fix 2012-11-27 13:56:52 -07:00
Brian Paul a547e532fc mesa: remove '(void) k' lines
Serves no purpose as the k parameter is used later in the code.
2012-11-27 13:56:52 -07:00
Kenneth Graunke 7a414fea87 mesa/vbo: Check for invalid types in various packed vertex functions.
According to the ARB_vertex_type_2_10_10_10_rev specification:
"The error INVALID_ENUM is generated by VertexP*, NormalP*,
 TexCoordP*, MultiTexCoordP*, ColorP*, or SecondaryColorP if <type>
 is not UNSIGNED_INT_2_10_10_10_REV or INT_2_10_10_10_REV."

Fixes 7 subcases of oglconform's packed-vertex test.

v2: Add "gl" prefix to error messages (pointed out by Brian).
    Also rebase atop the ctx plumbing.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 12:36:33 -08:00
Kenneth Graunke 6a529e2b48 mesa/vbo: Support the ES 3.0 signed normalized scaling rules.
Traditionally, OpenGL has had two separate equations for converting from
signed normalized fixed-point data to floating point data.  One was used
primarily for vertex data, while the other was primarily for texturing
and framebuffer data.

However, ES 3.0 and GL 4.2 change this, declaring there's only one
equation to be used in all cases.  Unfortunately, it's the other one.

v2: Correctly convert 0b10 to -1.0, as pointed out by Chris Forbes.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2012-11-27 12:36:33 -08:00
Kenneth Graunke c8d8d5db72 mesa/vbo: Plumb ctx through to the conv_i(10|2)_to_norm_float functions.
The rules for converting these values actually depend on the current
context API and version.  The next patch will implement those changes.

v2: Mark ctx as const, as suggested by Brian.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2012-11-27 12:36:33 -08:00
Matt Turner 13f9012ad3 mesa: Set transform feedback's default buffer mode to INTERLEAVED_ATTRIBS
Fixes part of es3conform's transform_feedback_init_defaults test.
NOTE: This is a candidate for the stable branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 10:40:50 -08:00
Matt Turner 7c2060f0f0 mesa: Return 0 for XFB_VARYING_MAX_LENGTH if no varyings
v2: Perform this count the same way as elsewhere in this file, per
    Brian Paul's review.

Fixes part of es3conform's transform_feedback_init_defaults test.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 10:40:49 -08:00
Andreas Boll f65741721b gallium/tests/trivial: updates for transfer functions changes
Fixes build error with configure option --enable-gallium-tests
introduced in 369e468889

Compile tested only.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-11-27 19:00:48 +01:00
Andreas Boll cba639f2a1 gallium/tests/trivial: updates for CSO interface changes
Fixes build error with configure option --enable-gallium-tests
introduced in ea6f035ae9

Cc: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-11-27 19:00:48 +01:00
Andreas Boll 1553f5ce83 gallium/tests/trivial: updates for util_draw_vertex_buffer changes
Fixes build error with configure option --enable-gallium-tests
introduced in e73bf3b805

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-11-27 19:00:48 +01:00
James Benton 9bd4856b5c util: Modified u_rect to default to memcpy.
Previously this function would assert if the format didn't fit an expected 4 channel format size.

Now will work with any format type with any amount of channels.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 16:24:42 +00:00
James Benton 65016646e3 util/format: Fix bug in float to non-float conversion in u_format_pack.py.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 16:24:02 +00:00
James Benton 978df710f2 gallivm: Fix bug in lp_build_one which would incorrectly return a vector for length 1.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 16:23:04 +00:00
Kenneth Graunke 9bc9895c4a glsl: Support unsigned integer constants in layout qualifiers.
Fixes es3conform's explicit_attrib_location_integer_constants.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
2012-11-26 21:02:45 -08:00
Kenneth Graunke 9136723214 i965/fs: Move struct brw_compile (p) entirely inside fs_generator.
The brw_compile structure contains the brw_instruction store and the
brw_eu_emit.c state tracking fields.  These are only useful for the
final assembly generation pass; the earlier compilation stages doesn't
need them.

This also means that the code generator for future hardware won't have
access to the brw_compile structure, which is extremely desirable
because it prevents accidental generation of Gen4-7 code.

v2: rzalloc p, as suggested by Eric.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
Kenneth Graunke ea681a0d64 i965/fs: Split final assembly code generation out of fs_visitor.
Compiling shaders requires several main steps:

   1. Generating FS IR from either GLSL IR or Mesa IR
   2. Optimizing the IR
   3. Register allocation
   4. Generating assembly code

This patch splits out step 4 into a separate class named "fs_generator."

There are several reasons for doing so:

   1. Future hardware has a different instruction encoding.  Splitting
      this out will allow us to replace fs_generator (which relies
      heavily on the brw_eu_emit.c code and struct brw_instruction) with
      a new code generator that writes the new format.

   2. It reduces the size of the fs_visitor monolith.  (Arguably, a lot
      more should be split out, but that's left for "future work.")

   3. Separate namespaces allow us to make helper functions for
      generating instructions in both classes: ADD() can exist in
      fs_visitor and create IR, while ADD() in fs_generator() can
      create brw_instructions.  (Patches for this upcoming.)

Furthermore, this patch changes the order of operations slightly.
Rather than doing steps 1-4 for SIMD8, then 1-4 for SIMD16, we now:

   - Do steps 1-3 for SIMD8, then repeat 1-3 for SIMD16
   - Generate final assembly code for both modes together

This is because the frontend work can be done independently, but final
assembly generation needs to pack both into a single program store to
feed the GPU.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
Kenneth Graunke dd1fd30047 i965/fs: Abort on unsupported opcodes rather than failing.
Final code generation should never fail.  This is a bug, and there
should be no user-triggerable cases where this could occur.

Also, we're not going to have a fail() method in a moment.

v2: Just abort() rather than assert, to cover the NDEBUG case
    (suggested by Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00