53857 Commits

Author SHA1 Message Date
Michel Dänzer 1a616c1009 radeonsi: Flesh out support for depth/stencil exports from the pixel shader.
Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:16 +01:00
Michel Dänzer 49003a5cb6 radeonsi: Fix sampler views for depth textures.
Consistently reference the flushed depth texture in the sampler view, not the
original one.

Signed-off-by: Michel Dänzer <michel.daenzer@amd.com>
2012-11-28 13:35:16 +01:00
Jerome Glisse 3c024624fd radeonsi: Fix z/stencil texture creation.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>

[ Cherry-picked from r600g commit b4f0ab0b22 ]
2012-11-28 13:35:16 +01:00
Vinson Lee ffc318a97a scons: Build ws_xlib on Mac OS X.
Fixes this SCons build error on Mac OS X if X11 is found.

NameError: name 'ws_xlib' is not defined:
  File "SConstruct", line 144:
    duplicate = 0 # http://www.scons.org/doc/0.97/HTML/scons-user/x2261.html
  File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
    return method(*args, **kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
    return _SConscript(self.fs, *files, **subst_kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
    exec _file_ in call_stack[-1].globals
  File "src/SConscript", line 34:
    SConscript('gallium/SConscript')
  File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
    return method(*args, **kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
    return _SConscript(self.fs, *files, **subst_kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
    exec _file_ in call_stack[-1].globals
  File "src/gallium/SConscript", line 135:
    'targets/libgl-xlib/SConscript',
  File "scons-2.2.0/SCons/Script/SConscript.py", line 614:
    return method(*args, **kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 551:
    return _SConscript(self.fs, *files, **subst_kw)
  File "scons-2.2.0/SCons/Script/SConscript.py", line 260:
    exec _file_ in call_stack[-1].globals
  File "src/gallium/targets/graw-xlib/SConscript", line 9:
    ws_xlib,

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 23:13:57 -08:00
Johannes Obermayr 53636fdf93 configure.ac: Remove -O., -g and -Wall from LLVM_C{PP,XX}FLAGS.
Signed-off-by: Marek Olšák <maraeo@gmail.com>
2012-11-28 00:19:17 +01:00
Brian Paul f75acabb96 vbo: move another line of code after declarations
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-11-27 15:34:56 -07:00
Brian Paul 8765c0d20f vbo: move code after declarations to fix MSVC errors
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-11-27 14:28:56 -07:00
Brian Paul f94e672b47 vbo: minor whitespace fix
2012-11-27 13:56:52 -07:00
Brian Paul a547e532fc mesa: remove '(void) k' lines
Serves no purpose as the k parameter is used later in the code.
2012-11-27 13:56:52 -07:00
Kenneth Graunke 7a414fea87 mesa/vbo: Check for invalid types in various packed vertex functions.
According to the ARB_vertex_type_2_10_10_10_rev specification:
"The error INVALID_ENUM is generated by VertexP*, NormalP*,
 TexCoordP*, MultiTexCoordP*, ColorP*, or SecondaryColorP if <type>
 is not UNSIGNED_INT_2_10_10_10_REV or INT_2_10_10_10_REV."

Fixes 7 subcases of oglconform's packed-vertex test.

v2: Add "gl" prefix to error messages (pointed out by Brian).
    Also rebase atop the ctx plumbing.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 12:36:33 -08:00
Kenneth Graunke 6a529e2b48 mesa/vbo: Support the ES 3.0 signed normalized scaling rules.
Traditionally, OpenGL has had two separate equations for converting from
signed normalized fixed-point data to floating point data.  One was used
primarily for vertex data, while the other was primarily for texturing
and framebuffer data.

However, ES 3.0 and GL 4.2 change this, declaring there's only one
equation to be used in all cases.  Unfortunately, it's the other one.

v2: Correctly convert 0b10 to -1.0, as pointed out by Chris Forbes.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2012-11-27 12:36:33 -08:00
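The two conversion rules that the 6a529e2b48 message refers to can be sketched as below. This is a minimal illustration, assuming the equations as they are written in the GL and ES specifications; the function names are hypothetical and are not the helpers the commit modifies.

```c
/* Hypothetical helper names; not the functions touched by the commit. */

/* Older rule used primarily for vertex data: f = (2c + 1) / (2^b - 1).
 * Note that no input maps exactly to 0.0 under this rule. */
static float snorm_to_float_legacy(int c, int bits)
{
   const float max_unsigned = (float)((1u << bits) - 1);
   return (2.0f * (float)c + 1.0f) / max_unsigned;
}

/* Rule that ES 3.0 and GL 4.2 apply everywhere:
 * f = max(c / (2^(b-1) - 1), -1.0). */
static float snorm_to_float_unified(int c, int bits)
{
   const float max_signed = (float)((1u << (bits - 1)) - 1);
   const float f = (float)c / max_signed;
   return f < -1.0f ? -1.0f : f;
}
```

With the unified rule, the 2-bit value 0b10 (that is, -2) divides to -2.0 and is clamped to -1.0, which is the case the v2 note mentions.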
Kenneth Graunke c8d8d5db72 mesa/vbo: Plumb ctx through to the conv_i(10|2)_to_norm_float functions.
The rules for converting these values actually depend on the current
context API and version.  The next patch will implement those changes.

v2: Mark ctx as const, as suggested by Brian.

Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2012-11-27 12:36:33 -08:00
Matt Turner 13f9012ad3 mesa: Set transform feedback's default buffer mode to INTERLEAVED_ATTRIBS
Fixes part of es3conform's transform_feedback_init_defaults test.
NOTE: This is a candidate for the stable branch.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 10:40:50 -08:00
Matt Turner 7c2060f0f0 mesa: Return 0 for XFB_VARYING_MAX_LENGTH if no varyings
v2: Perform this count the same way as elsewhere in this file, per
    Brian Paul's review.

Fixes part of es3conform's transform_feedback_init_defaults test.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-27 10:40:49 -08:00
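As a rough sketch of the behaviour named in the 7c2060f0f0 subject line: the query reports the longest varying name including its NUL terminator, and 0 when there are no varyings. The helper name and signature below are hypothetical, not Mesa's.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: length of the longest varying name, counting the
 * NUL terminator. Returns 0 when there are no varyings. */
static int max_varying_name_length(const char *const *names, size_t count)
{
   int max_len = 0;
   for (size_t i = 0; i < count; i++) {
      const int len = (int)strlen(names[i]) + 1;
      if (len > max_len)
         max_len = len;
   }
   return max_len;
}
```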
Andreas Boll f65741721b gallium/tests/trivial: updates for transfer functions changes
Fixes build error with configure option --enable-gallium-tests
introduced in 369e468889

Compile tested only.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-11-27 19:00:48 +01:00
Andreas Boll cba639f2a1 gallium/tests/trivial: updates for CSO interface changes
Fixes build error with configure option --enable-gallium-tests
introduced in ea6f035ae9

Cc: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-11-27 19:00:48 +01:00
Andreas Boll 1553f5ce83 gallium/tests/trivial: updates for util_draw_vertex_buffer changes
Fixes build error with configure option --enable-gallium-tests
introduced in e73bf3b805

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2012-11-27 19:00:48 +01:00
James Benton 9bd4856b5c util: Modified u_rect to default to memcpy.
Previously this function would assert if the format didn't fit an expected 4-channel format size.

It now works with any format type with any number of channels.

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 16:24:42 +00:00
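The idea in 9bd4856b5c of falling back to memcpy can be illustrated with a row-by-row copy that treats each pixel as an opaque run of bytes. This is only a sketch under that assumption; `copy_rect_bytes` and its parameters are hypothetical and do not reproduce the u_rect interface.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not Mesa's u_rect API: copy a width x height block
 * of pixels one row at a time, treating each pixel as bytes_per_pixel
 * opaque bytes. Strides are given in bytes. */
static void copy_rect_bytes(uint8_t *dst, size_t dst_stride,
                            const uint8_t *src, size_t src_stride,
                            size_t bytes_per_pixel,
                            size_t width, size_t height)
{
   const size_t row_bytes = width * bytes_per_pixel;
   for (size_t y = 0; y < height; y++) {
      memcpy(dst, src, row_bytes);
      dst += dst_stride;
      src += src_stride;
   }
}
```

Because nothing here inspects the channel layout, the same loop serves a 3-byte pixel as well as a 4-byte or 16-byte one.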
James Benton 65016646e3 util/format: Fix bug in float to non-float conversion in u_format_pack.py.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 16:24:02 +00:00
James Benton 978df710f2 gallivm: Fix bug in lp_build_one which would incorrectly return a vector for length 1.
Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 16:23:04 +00:00
Kenneth Graunke 9bc9895c4a glsl: Support unsigned integer constants in layout qualifiers.
Fixes es3conform's explicit_attrib_location_integer_constants.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-and-tested-by: Matt Turner <mattst88@gmail.com>
2012-11-26 21:02:45 -08:00
Kenneth Graunke 9136723214 i965/fs: Move struct brw_compile (p) entirely inside fs_generator.
The brw_compile structure contains the brw_instruction store and the
brw_eu_emit.c state tracking fields.  These are only useful for the
final assembly generation pass; the earlier compilation stages don't
need them.

This also means that the code generator for future hardware won't have
access to the brw_compile structure, which is extremely desirable
because it prevents accidental generation of Gen4-7 code.

v2: rzalloc p, as suggested by Eric.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
Kenneth Graunke ea681a0d64 i965/fs: Split final assembly code generation out of fs_visitor.
Compiling shaders requires several main steps:

   1. Generating FS IR from either GLSL IR or Mesa IR
   2. Optimizing the IR
   3. Register allocation
   4. Generating assembly code

This patch splits out step 4 into a separate class named "fs_generator."

There are several reasons for doing so:

   1. Future hardware has a different instruction encoding.  Splitting
      this out will allow us to replace fs_generator (which relies
      heavily on the brw_eu_emit.c code and struct brw_instruction) with
      a new code generator that writes the new format.

   2. It reduces the size of the fs_visitor monolith.  (Arguably, a lot
      more should be split out, but that's left for "future work.")

   3. Separate namespaces allow us to make helper functions for
      generating instructions in both classes: ADD() can exist in
      fs_visitor and create IR, while ADD() in fs_generator() can
      create brw_instructions.  (Patches for this upcoming.)

Furthermore, this patch changes the order of operations slightly.
Rather than doing steps 1-4 for SIMD8, then 1-4 for SIMD16, we now:

   - Do steps 1-3 for SIMD8, then repeat 1-3 for SIMD16
   - Generate final assembly code for both modes together

This is because the frontend work can be done independently, but final
assembly generation needs to pack both into a single program store to
feed the GPU.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
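The reordering described in ea681a0d64 can be modelled with a small sketch that only records the order in which steps run. Every name here is hypothetical; the real driver code is C++ and far more involved.

```c
#include <stdio.h>
#include <string.h>

/* All names here are hypothetical; this only models the ordering. */
enum { LOG_SIZE = 256 };

struct compile_log { char text[LOG_SIZE]; };

/* Append "step@width;" to the log. */
static void log_step(struct compile_log *log, const char *step, int width)
{
   char entry[48];
   snprintf(entry, sizeof entry, "%s@%d;", step, width);
   strncat(log->text, entry, LOG_SIZE - strlen(log->text) - 1);
}

/* Steps 1-3: build IR, optimize, allocate registers for one width. */
static void run_frontend(struct compile_log *log, int width)
{
   log_step(log, "ir", width);
   log_step(log, "opt", width);
   log_step(log, "regalloc", width);
}

/* Step 4: emit assembly for both widths into one program store. */
static void generate_assembly(struct compile_log *log)
{
   log_step(log, "asm", 8);
   log_step(log, "asm", 16);
}

static void compile_fragment_shader(struct compile_log *log)
{
   run_frontend(log, 8);    /* steps 1-3 for SIMD8 */
   run_frontend(log, 16);   /* steps 1-3 for SIMD16 */
   generate_assembly(log);  /* step 4 for both modes together */
}
```

Starting from an empty log, `compile_fragment_shader` records all frontend steps for both widths before any assembly step.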
Kenneth Graunke dd1fd30047 i965/fs: Abort on unsupported opcodes rather than failing.
Final code generation should never fail.  This is a bug, and there
should be no user-triggerable cases where this could occur.

Also, we're not going to have a fail() method in a moment.

v2: Just abort() rather than assert, to cover the NDEBUG case
    (suggested by Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
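The v2 note in dd1fd30047 about the NDEBUG case can be illustrated as follows. This is a standalone sketch with a hypothetical opcode enum, not the generator code: when NDEBUG is defined, assert() expands to nothing, whereas abort() always terminates.

```c
#define NDEBUG            /* simulate a release build: assert() is a no-op */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical opcode set for illustration only. */
enum opcode { OP_ADD, OP_MUL, OP_UNKNOWN };

/* Returns 1 because, with NDEBUG defined, the failed assert is compiled
 * out and execution simply continues. */
static int continues_past_failed_assert(void)
{
   assert(0 && "unsupported opcode");
   return 1;
}

/* abort() does not depend on NDEBUG, so an unsupported opcode always
 * stops the program here. */
static const char *opcode_name(enum opcode op)
{
   switch (op) {
   case OP_ADD: return "add";
   case OP_MUL: return "mul";
   default:
      fprintf(stderr, "unsupported opcode %d\n", (int)op);
      abort();
   }
}

/* Restore the normal assert() for any code that follows. */
#undef NDEBUG
#include <assert.h>
```

Calling `opcode_name(OP_UNKNOWN)` terminates the process regardless of build flags, which is the behaviour the commit message asks for.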
Kenneth Graunke cd0acb1abe i965: Make it possible to create a cfg_t without a backend_visitor.
All we really need is a memory context and the instruction list; passing
a backend_visitor is just convenient at times.

This will be necessary two patches from now.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
Kenneth Graunke 4d09fe938e i965/fs: Move uses of brw_compile from do_wm_prog to brw_wm_fs_emit.
The brw_compile structure is closely tied to the Gen4-7 hardware
encoding.  However, do_wm_prog is very generic: it just calls out to
get a compiled program and then uploads it.

This isn't ultimately where we want it, but it's a step in the right
direction: it's now closer to the code generator.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:34 -08:00
Kenneth Graunke 3417b2f2b2 i965/fs: Pass the brw_context pointer into fs_visitor explicitly.
We used to steal it out of the brw_compile struct...but fs_visitor
isn't going to have one of those in the future.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke 1f74002a98 i965/fs: Move brw_wm_compile::fp to fs_visitor.
Also change it from a brw_fragment_program to a gl_fragment_program,
since that seems to be what everything wants anyway.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke 7b0d30eb87 i965/fs: Remove struct brw_shader * parameter to fs_visitor constructor.
We can easily recover it from prog, and this makes it clear that we
aren't passing additional information in.

v2: Use an if-statement rather than the ?: operator (suggested by Eric).

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke a303df86de i965/fs: Move brw_wm_compile::dispatch_width into fs_visitor.
Also, rather than having brw_wm_fs_emit poke at it directly, make it a
parameter to the fs_visitor constructor.

All other changes generated by search and replace (with occasional
whitespace fixup).

v2: Make dispatch_width const (as suggested by Paul); fix doxygen
    mistake (pointed out by Eric); update for rebase.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke 47a6a7b51b i965/fs: Move brw_wm_lookup_iz() to fs_visitor::setup_payload_gen4().
This necessitates compiling brw_wm_iz.c as C++.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke 2429c9d347 i965/fs: Move brw_wm_payload_setup() to fs_visitor::setup_payload_gen6()
Now that we only have the one backend, there's no real point in keeping
this separate.  Moving it should allow some future simplifications.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Kenneth Graunke ce96f6db90 i965/fs: Remove brw_wm_compile::computes_depth field.
Everybody determines this by checking if fp's OutputsWritten field
contains the FRAG_RESULT_DEPTH bit.  Rather than having payload setup
check this and set the computes_depth flag, we can just do the check in
the only place that actually used it: emit_fb_writes().

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Paul Berry <stereotype441@gmail.com>
2012-11-26 19:52:33 -08:00
Roland Scheidegger 529fe420ba gallivm: use the new mip per quad handling in texture fetch path
No longer have to split fetching into quads dynamically if mip levels
are not the same for all quads (aos sampling still always splits due
to performance reasons).
Instead, multiple mip levels are handled further down; minification etc. takes
this into account.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 03:30:55 +01:00
Roland Scheidegger 0b6554ba6f gallivm,llvmpipe: handle TXF (texelFetch) instruction, including offsets
This also adds some code to handle per-quad lods for more than 4-wide fetches,
because otherwise I'd have to integrate the texelFetch function into
the splitting stuff... (but it is not used yet outside texelFetch).
Passes piglit fs-texelFetch-2D; fails fs-texelFetchOffset-2D due to what I
believe is a test error (results are undefined for out-of-bounds fetches; we
return whatever is at offset 0, whereas the test expects [0,0,0,1]).
Texel offsets are only handled by texelFetch for now, though the interface
can handle it for everything.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-11-27 03:26:49 +01:00
Chris Forbes 93c689a2df i965: Enable ARB_vertex_type_2_10_10_10_rev on Gen4+.
v2 (Kayden): Move the enable into an existing intel->gen >= 4 block
(as suggested by Ian).

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:48:29 -08:00
Chris Forbes 4a64efc01b i965: emit w/a for packed attribute formats in VS
Implements BGRA swizzle, sign recovery, and normalization
as required by ARB_vertex_type_2_10_10_10_rev.

V2: Ported to the new VS backend, since that's all that's left;
	fixed normalization.

V3: Moved fixups out of the GLSL-only path, so it works for FF/VP too.

V4 (Kayden): Rework ES3 normalization, don't heap allocate registers;
	tidy comments.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:35:10 -08:00
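The sign recovery and BGRA swizzle mentioned in 4a64efc01b can be modelled on the CPU as below. This is a sketch, assuming the data is fetched as unsigned fields (as with R10G10B10A2_UINT) and fixed up afterwards; the helper names are hypothetical, and the commit itself emits vertex shader instructions rather than C. Normalization is left out here.

```c
#include <stdint.h>

/* Hypothetical helpers; a CPU-side model, not the shader code the commit
 * emits. */

struct vec4i { int32_t x, y, z, w; };

/* Sign recovery: reinterpret the low `bits` bits of an unsigned field as
 * a two's-complement signed value. */
static int32_t sign_extend(uint32_t field, unsigned bits)
{
   const uint32_t sign_bit = 1u << (bits - 1);
   return (int32_t)(field ^ sign_bit) - (int32_t)sign_bit;
}

/* Unpack INT_2_10_10_10_REV: x in bits 0-9, y in bits 10-19,
 * z in bits 20-29, w in bits 30-31. */
static struct vec4i unpack_int_2_10_10_10_rev(uint32_t packed)
{
   struct vec4i v;
   v.x = sign_extend(packed & 0x3ff, 10);
   v.y = sign_extend((packed >> 10) & 0x3ff, 10);
   v.z = sign_extend((packed >> 20) & 0x3ff, 10);
   v.w = sign_extend(packed >> 30, 2);
   return v;
}

/* BGRA swizzle: exchange the first and third components. */
static struct vec4i swizzle_bgra(struct vec4i v)
{
   const int32_t tmp = v.x;
   v.x = v.z;
   v.z = tmp;
   return v;
}
```

The xor-and-subtract form is used here because it is portable C; a shader would more naturally express the same step as a left shift followed by an arithmetic right shift.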
Chris Forbes 352ae51efd i965: set attribute w/a bits for packed formats
Flag the need for various workarounds to be applied by
the vertex shader.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:35:00 -08:00
Chris Forbes c3c680950d i965: Generalize GL_FIXED VS w/a support
Next few patches build on this to add other workarounds
for packed formats.

V2: rename BRW_ATTRIB_WA_COMPONENTS to BRW_ATTRIB_WA_COMPONENT_MASK;
V3 (Kayden): remove separate bit for ES3 signed normalization

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:34:28 -08:00
Chris Forbes 23f4411c41 i965: support 2_10_10_10 formats in get_surface_type.
Always use R10G10B10A2_UINT; Most of the other formats we'd like
don't actually work on the hardware. Will emit w/a for scaling,
sign recovery and BGRA swizzle in the VS.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:34:23 -08:00
Chris Forbes f9a08f7f0f i965: implement get_size for 2_10_10_10 formats
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 16:34:20 -08:00
Chris Forbes 894fe54ec9 i965/vs: add support for emitting SHL, SHR, ASR
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-26 14:02:30 -08:00
Matt Turner 8f3570efc7 mesa: Use correct glGetTransformFeedbackVarying name in error msg
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-11-26 10:08:05 -08:00
Andreas Boll 0f5e2ce854 build: use git ls-files for adding all Makefile.in into the release tarball
Until we have proper 'make dist' this is an improvement of the current
situation, because each time some old Makefiles got converted to automake
we had to update the tarballs target.

NOTE: This is a candidate for the 9.0 branch.

Cc: Eric Anholt <eric@anholt.net>
Acked-by: Matt Turner <mattst88@gmail.com>
2012-11-26 19:03:21 +01:00
Eric Anholt 97747ac88f i965: Fix hangs with FP KIL instructions pre-gen6.
We can't support IF statements in 16-wide on these.  To get back to 16-wide
for these shaders, we need to support predicate on discard instructions in the
backend IR, which is something we've sort of got on the list to do anyway.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55828
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-11-25 20:22:02 -08:00
Eric Anholt 59bfd66a61 i965/gen4: Fix memory leak each time compile_gs_prog() is called.
Commit 774fb90db3 introduced a ralloc context to
each user of struct brw_compile, but for this one a NULL context was used,
causing the later ralloc_free(mem_ctx) to not do anything.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55175
NOTE: This is a candidate for the stable branches.
2012-11-25 18:25:26 -08:00
Eric Anholt 244db0855c i965/gen4: Fix LOD bias texturing since my fixed reg classes change.
We have a special case where non-shadow comparison with LOD requires using a
SIMD16 vec4 in an 8-wide shader, which appears in the register allocator as a
size 8 vgrf.

Fixes assertions in various piglit tests and webgl conformance.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=56521
2012-11-25 18:25:26 -08:00
Marek Olšák cff4c948ed r600g: fix broken streamout if streamout_begin caused a context flush
This fixes graphics corruption in the case where the DISCARD_RANGE flag
is used to map a buffer.

NOTE: This is a candidate for the stable branches.
2012-11-23 00:42:02 +01:00
Marek Olšák d172fa825b r600g: fix ARB_map_buffer_alignment with unaligned offsets and staging buffers
2012-11-22 22:40:06 +01:00
Vinson Lee f884005771 scons: Append x11 library path if linking x11 library.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
2012-11-21 22:34:20 -08:00