Commit Graph

54179 Commits

Author SHA1 Message Date
Matt Turner 6acabe33a3 mesa: print unsigned values with %u
Otherwise messages say silly things like
   glGetActiveUniformBlockiv(block index -1 >= 0)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-13 09:53:11 -08:00
Kenneth Graunke 200bb36778 i965: Fix disassembly of jump targets on Gen7.
Gen7 stores the JIP/UIP bits in different places.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-12 22:19:08 -08:00
Kenneth Graunke c2eb9d3a0a i965: Make try_rewrite_rhs_to_dst compare VGRF size to regs written.
try_rewrite_rhs_to_dst is a quick optimization to avoid generating new
temporaries (and MOVs from those temporaries to the dest) for every
expression tree we visit.  By generating better code in simple cases, we
reduce the burden on later optimization passes like register coalescing.

Previously, we compared inst->regs_written() to lhs->vector_elements
to make sure the instruction generating our value wrote the same number
of components as our destination register.

However, this fails in some cases.  One example is texturing (which
produces a vec4) into gl_FragData[i].  Technically, gl_FragData[i] is
also a vec4.  However, the destination VGRF actually has size 4n (where
n is the size of the array).

split_virtual_grfs() can't split VGRFs that are used by SEND messages
which require contiguous destination registers (like texturing), and
register allocation needs all VGRFs to have sizes between 1 and 4.

Amnesia: The Dark Descent hits this case: a texturing instruction
(4 components) gets rewritten to the gl_FragData output register
(which was 4*3 = 12 components), causing the register allocator to
hit the "we rely on split_virtual_grfs" assertion.

This makes it possible to play Amnesia.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-12 14:44:37 -08:00
Emil Velikov 1223458764 configure.ac: Disable compiler optimizations when --enable-debug is set
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Dan Nicholson <dbn.lists@gmail.com>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2012-12-12 14:48:06 -06:00
Brian Paul e721a76e68 softpipe: remove unused corner0 variable 2012-12-12 08:51:19 -07:00
Brian Paul 8ef27e8fa9 llvmpipe: remove unneeded draw_flush() call
This is redundant since we're calling draw_bind_fragment_shader()
which already does a flush.

v2: the redundant flush in llvmpipe_set_constant_buffer() has
already been removed by commit 3427466e6d

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2012-12-12 08:45:45 -07:00
Marek Olšák d225d076a9 r600g: suballocate memory for fetch shaders from a large buffer
Fetch shaders are usually destroyed at the context destruction by the state
tracker, so we can put them all in a large buffer without wasting memory.

This reduces the number of relocations sent to the kernel a little bit.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:12:31 +01:00
Marek Olšák 8df3855eed r600g: suballocate memory for the STRMOUT_BUFFER_FILLED_SIZE register
Instead of having a 4-byte buffer for each streamout target, we suballocate
each dword from a 4K buffer.

This further reduces the overall number of relocations.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:12:28 +01:00
Marek Olšák cc2d908572 gallium/util: add a simple allocator for suballocating from a large buffer
Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:12:24 +01:00
Marek Olšák 2478fcd87c r600g: use u_upload_mgr for allocating staging transfer buffers
u_upload_mgr suballocates memory from a large buffer and maps the allocated
range (unsychronized), which is perfect for short-lived staging buffers.

This reduces the number of relocations sent to the kernel.

Tested-by: Aaron Watry <awatry@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-12 13:11:52 +01:00
Marek Olšák 448cd5ea60 winsys/radeon: don't use BIND flags, add a flag for the cache bufmgr instead 2012-12-12 13:09:54 +01:00
Marek Olšák 1d0bf69f83 st/dri: add a way to force MSAA on with an environment variable
There are 2 ways. I prefer the former:
  GALLIUM_MSAA=n
  __GL_FSAA_MODE=n

Tested with ETQW, which doesn't support MSAA on Linux. This is
the only way to get MSAA there.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:54 +01:00
Marek Olšák afa902a705 mesa: don't advertise ARB_texture_buffer_object in legacy contexts
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-12 13:09:54 +01:00
Marek Olšák 0ac83a2001 mesa: disallow creation of GL 3.1 compatibility contexts
Death to driver-specific hacks!

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2012-12-12 13:09:54 +01:00
Marek Olšák 25409c6da8 gallium: remove pipe_surface::usage
Not really used by anybody now.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:54 +01:00
Marek Olšák c1f704073b svga: stop using pipe_surface::usage
There are only 2 possible usages: render target and depth stencil.
Both can be derived from the surface format, so the flag is redundant.

And it's going away...

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák 21b1ec69fc gallium/util: move util_try_blit_via_copy_region to u_surface.c
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák 3a555637b2 gallium/cso: don't use the pipe_error return type where it's not needed
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák eae9674f18 gallium: manage render condition in cso_context and fix postprocessing w/ it
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Marek Olšák 9ec6ffd85d st/mesa: remove a weird msaa hack
It doesn't work and it's not clear how it's supposed to work.

Reviewed-by: Brian Paul <brianp@vmware.com>
2012-12-12 13:09:53 +01:00
Dave Airlie 621259b3de softpipe: implement seamless cubemap support. (v1.1)
This adds seamless sampling for cubemap boundaries if requested.

The corner case averaging is messy but seems like it should be spec
compliant.

The face direction stuff is also a bit messy, I've no idea if that could
or should be simpler, or even if all my directions are fully correct!

v1.1: update comments, drop unneeded seamless calls for nearest, fix
if statement layout.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-12 10:35:05 +10:00
Dave Airlie 3392f2fbcf gallium: fix cap warnings for tbo cap.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-12 07:16:02 +10:00
Dave Airlie 5cdcd7251a glsl_to_tgsi: emit multi-level structs and arrays properly.
This follow the code from the i965 driver, and emits the structs
and arrays recursively.

This fixes an assert in the two UBO tests
fs-struct-copy-complicated and
vs-struct-copy-complicated

These tests now pass on softpipe, with no regressions.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-12 06:57:38 +10:00
Brian Paul 2ee0b44252 llvmpipe: don't use user constant buffers
This fixes some use-after-free issues.  I haven't measured any real
performance difference with a handful of Mesa demos.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-11 12:48:07 -07:00
Brian Paul 3427466e6d llvmpipe: support pipe_resource-based constant buffers
Before this we only supported user-based constant buffers.

First, we basically plumb pipe_constant_buffer objects through llvmpipe
rather than pipe_resource objects.

Second, update llvmpipe_set_constant_buffer() and try_update_scene_state()
so they understand both resource- and user-based constant buffers.

The problem with user constant buffers is the potential for use-after-free,
as seen in some WebGL tests.  The next patch will flip the switch for
resource-based const buffers.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-11 12:48:06 -07:00
Brian Paul 4c6053dc51 util: add util_copy_constant_buffer() helper function
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2012-12-11 12:48:06 -07:00
Eric Anholt beafced21c i965/fs: Improve performance of shaders that start out with a discard.
I had tried this in the past, but ran into trouble with applications
that sample from undiscarded pixels in the same subspan.  To fix that
issue, only jump to the end for an entire subspan at a time.

Improves GLbenchmark 2.7 (1024x768) performance by 7.9 +/- 1.5% (n=8).

v2: Drop the br variable in the jump instruction -- if I ever do jumps
    pre-gen6, it'll be a different code block anyway since we don't have
    HALT until gen6.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:13:15 -08:00
Eric Anholt d5016495cc i965/fs: Rewrite discards to use a flag subreg to track discarded pixels.
This makes much more sense on gen6+, and will also prove useful for
early exit of shaders on discard.

v2: fix up a stale comment from before converting gen4-5.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:13:08 -08:00
Eric Anholt b278f65e1c i965/fs: Add an instruction flag for choosing the flag subregister.
We're going to redo discard handling to track discards in the other flag
subregister, saving instructions in the discard and allowing predicated
jumps out to the end of the shader.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:12:58 -08:00
Eric Anholt 2c69a9fb60 i965: Let brw_flag_reg() choose the flag reg and subreg.
We're about to start using the f0.1 subregister.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:12:54 -08:00
Eric Anholt 6a1490bc8f i965: Print the flag reg updated by conditional modifiers.
This makes our output more consistent with other disasm tools, and
will be necessary when we start using f0.1.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:12:49 -08:00
Eric Anholt b7fd4b3f94 i965: Add the new flag_reg_nr instruction field from IVB.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:12:47 -08:00
Eric Anholt f606a42a3c i965: Correct the name and usage of the flag subregister number field.
We've been calling it a register number, it's actually the subregister,
and things will get confusing once we start using it if it isn't fixed.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:12:41 -08:00
Eric Anholt 7d404a4bd8 i965: Remove bogus flag_reg_nr field from bits3.
There's a flag subreg nr field in bits2 next to src0.vertstride, but
there shouldn't be anything in bits3 next to src1.vertstride.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2012-12-11 10:11:44 -08:00
Tobias Droste cb8300f5a9 st/egl/drm: only unref the udev device if needed
Fixes compiler warning:

drm/native_drm.c: In function ‘native_create_display’:
drm/native_drm.c:180:21: warning: ‘device’ may be used uninitialized in this function [-Wmaybe-uninitialized]
drm/native_drm.c:157:24: note: ‘device’ was declared here

Signed-off-by: Tobias Droste <tdroste@gmx.de>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2012-12-11 12:53:58 -05:00
José Fonseca bc4bf3c840 softpipe: Use os_time_get_nano() everywhere. 2012-12-11 16:45:01 +00:00
Johannes Obermayr b361bb3de4 clover: Install CL headers.
Note: This is a candidate for the stable branches.
2012-12-10 19:22:37 -05:00
Tom Stellard ffe1794e0c gallivm: Lower TGSI_OPCODE_MUL to fmul by default
This fixes a number of crashes on r600g due to the fact that
lp_build_mul assumes vector types when optimizing mul to bit shifts.

This bug was uncovered by 0ad1fefd69
2012-12-10 19:22:37 -05:00
Dave Airlie 8000e7b4b6 llvmpipe: fix txq for 1d/2d arrays. (v3)
Noticed would fail, we were doing two things wrong

a) 1d arrays require the layers in height
b) minifying the layers field.

v2: don't change height code, fixup completely inside txq
as suggested by Roland.

v3: just add minify before texture array size

v1: Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-11 09:38:01 +10:00
Dave Airlie 41f4f094c4 llvmpipe: increase texture target width to reflect increase
Now that we've gone over 7.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-11 09:37:55 +10:00
Jordan Justen 0151237457 mesa syncobj: don't store a pointer to the set_entry
The set_entry pointer can become invalid if the set table
is re-hashed.

This likely will fix
https://bugs.freedesktop.org/show_bug.cgi?id=58012
(Regression since 56e95d3c)

Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-10 10:58:45 -08:00
Fabio Pedretti 8b6e782eb9 vega: remove unused variables
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-12-10 09:43:20 -07:00
Fabio Pedretti eefd373876 nvc0: comment unused nvc0_validate_zcull function
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-12-10 09:43:18 -07:00
Fabio Pedretti 9b4926b64b nv50: remove unused OpClassStr array
Signed-off-by: Brian Paul <brianp@vmware.com>
2012-12-10 09:43:17 -07:00
smoki 320d531373 r200: fix broken tcl lighting
command mistakenly used vector instead of scalar emit (the more or less
identical code in radeon is already correct).
Seems like it would be broken ever since kms probably.
Should fix bugs 22576, 26809.
2012-12-10 17:30:26 +01:00
Dave Airlie 17f5dc5730 st_glsl_to_tgsi: fix ubo bools.
This should fix the ubo boolean tests, along with the previous
ubo loading fix.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-10 14:25:49 +10:00
Dave Airlie 7a66c8acd3 st_glsl_to_tgsi: call ubo load pass earlier
This calls it in around the same place as the 965 driver.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-10 14:22:34 +10:00
Dave Airlie af2d9affb1 glsl_to_tgsi: fix texture offset translation
I noticed the texelFetch offset test failed on 2D rect samplers
with GLSL 1.40. This is because I wrote the immediate->offset
translation wrong.

Fixed the translation to actually use the ureg info to set the
offsets up.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-10 12:23:47 +10:00
Dave Airlie 157f5d043a drisw: fix up context and apis for software context
This ports over from the dri2 code to the drisw bits. It means 3.1
core contexts now work for softpipe.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2012-12-09 20:28:56 +10:00
Kenneth Graunke bd87441ac0 i965: Add missing _NEW_BUFFERS dirty bit in Gen7 SBE state.
This is needed to compute render_to_fbo.  It even has the comment.

NOTE: This is a candidate for stable branches.

Reviewed-by: Eric Anholt <eric@anholt.net>
2012-12-08 18:12:21 -08:00