Commit Graph

44292 Commits

Author SHA1 Message Date
Eric Anholt d1f70a8a6c i965/fs: Split the GLSL IR -> FS LIR visitor to brw_fs_visitor.cpp.
We now have:
brw_fs.cpp handles calling out to everything and optimization.
brw_fs_visitor.cpp handles translating to our LIR.
brw_fs_emit.cpp handles emitting from our LIR to native code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-27 08:51:40 -07:00
Eric Anholt 11dd9e9c0f i965/fs: Split the BRW native code emit to brw_fs_emit.cpp
This is all separate from the visitor and the optimization passes
which feed into it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-27 08:51:39 -07:00
Eric Anholt b7b700aeb0 i965: Move a couple of GLSL IR -> BRW helper functions to brw_shader.cpp.
These will be used by the VS backend as well.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-27 08:51:30 -07:00
Eric Anholt 14b86f3c91 i965: Move non-FS-specific shader support to brw_shader.cpp.
These only existed in brw_fs.cpp because it was the only .cpp file in
the area when I wrote them.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-27 08:51:05 -07:00
Eric Anholt 53c89c67f3 i965: Avoid generating MOVs for assignments of expressions.
No statistically significant difference measured in 3dbenchmark
egypt/pro.  It does reduce fragment shader instructions across
shader-db by 0.3%.
2011-05-27 08:19:52 -07:00
Eric Anholt 1791857d7d i965/fs: Move the computation of register block count from unit to compile.
No net code size change, but unit update is down 0.8% code size
pre-gen6.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-27 08:19:27 -07:00
Eric Anholt 615117ce4e i965/fs: Track fixed GRF regs separate from allocated GRF file in scheduling.
There's an assumption here that fixed GRFs will never intersect with
the allocated GRFs.  That's true today, though it might change some
day if we decide to register-allocate the regs containing push
constants once they're dead.

This fixes a regression in 0f7325b890 in
Lightsmark from the texture instructions now containing g0 references
instead of having that be implied.  Performance is improved 15.2% +/-
3.6% (n=3).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34968
2011-05-27 08:08:23 -07:00
Eric Anholt 40540cc517 i965/fs: Add a helper function for add_dep(before, after, before->latency).
This lets us avoid a bunch of before==NULL checks in the callers.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-27 08:08:23 -07:00
Trevor Davenport 828b26b7eb nv50: fix emit_add_a16 to emit correct source reg
emit_add_a16 was using the incorrect source.
This caused adds in the form of:

   add u16 $a0 s32 $a1 u32 0x00000200

to have a source AREG of $a0 instead of $a1.

Fixes World of Warcraft in OpenGL and D3D without GLSL.
2011-05-27 10:25:40 +02:00
Brian Paul 4609e80288 mesa: s/height/depth/ in texsubimage()
Fixes http://bugs.freedesktop.org/show_bug.cgi?id=37648
2011-05-26 19:25:44 -06:00
Brian Paul e401c1f57f mesa: plug _mesa_GetObjectParameterivAPPLE into display list dispatch table 2011-05-26 19:25:44 -06:00
Brian Paul 6126d50e75 mesa: plug in GL_ARB_vertex_array_object display list functions 2011-05-26 19:25:44 -06:00
Brian Paul e00481586c mesa: more geometry shader display list functions 2011-05-26 19:25:44 -06:00
Brian Paul 3b0f431820 mesa: more transform feedback display list functions 2011-05-26 19:25:44 -06:00
Brian Paul 919e260bff mesa: make query object API functions static
Only directly referenced by the _mesa_init_queryobj_dispatch() function.
2011-05-26 19:25:44 -06:00
Brian Paul 848bcd2e8c mesa: simplify query object display list dispatch setup 2011-05-26 19:25:44 -06:00
Eric Anholt f7b3f40b70 i965: Pack the lookup and line_aa bits into the first dword of the key.
They were occupying whole 32-bit words, despite being only 10 or so
bits.  Reduces code size slightly (80/3300 bytes).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-26 10:07:38 -07:00
Eric Anholt 9a729ab4b2 i965: Remove dead shadowtex_mask entry in the WM key.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-26 10:07:38 -07:00
Eric Anholt f147599ef4 i965: Remove linear_color for GL_PERSPECTIVE_CORRECTION_HINT.
From the GL 2.1 spec:

   "Required perspective-correct interpolation for all fragment
    attributes except depth in sections 3.4.1 and 3.5.1, effectively
    making GL PERSPECTIVE CORRECT HINT a no-op."

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-26 10:07:38 -07:00
Eric Anholt c095335fa5 intel: Drop doubly irrelevant code in intelReadBuffers.
First, FBO read/draw == NULL validation happens in mesa core not
intelReadBuffers -> intel_draw_buffers.  Second, that condition is no
longer tested for in our driver since ARB_ES2_compatibility was added.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-05-26 08:54:29 -07:00
Eric Anholt 6d4b974e89 mesa: Flush vertices before updating drawbuffer computed state.
Otherwise, the driver is likely to draw the flushed vertices to the
new drawbuffer instead of the old one, missing the point of the flush.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-05-26 08:54:29 -07:00
Eric Anholt d3451f7f9c mesa: Allow NULL read/draw in complete FBOs in ARB_ES2_compatibility.
From the ARB_ES2_compatibility spec:

    "(8) How should we handle draw buffer completeness?

    RESOLVED: Remove draw/readbuffer completeness checks, and treat
    drawbuffers referring to missing attachments as if they were NONE."

Fixes arb_es2_compatibility-drawbuffers when the short-circuit for
ARB_ES2_compatibility in the previous commit is dropped.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-05-26 08:54:29 -07:00
Eric Anholt f73ff463a2 mesa: Trigger FBO validation on DrawBuffers change in non-ES2 mode.
glDrawBuffers pointing at an unattached buffer is supposed to be
incomplete without ARB_ES2_compatibility.  The testcase to catch the
bug of not implementing that bit of the spec was tricked by this
missing piece of state update.

Reviewed-by: Brian Paul <brianp@vmware.com>
2011-05-26 08:54:29 -07:00
Brian Paul 179a88d52c mesa: minor whitespace fixes 2011-05-25 21:07:50 -06:00
Brian Paul f84be846ca mesa: plug in sync object display list functions
Most just dispatch through to the immediate mode functions, except
for glWaitSync(), per the extension spec.
2011-05-25 21:06:51 -06:00
Brian Paul 95fa22c864 mesa: display list support for glProgramParameteriARB() 2011-05-25 20:44:35 -06:00
Brian Paul 001aa6c979 mesa: plug shader object functions into display list dispatch 2011-05-25 20:39:08 -06:00
Brian Paul 4535c98cdb mesa: plug in GL 3.0 ClearBuffer() display list functions 2011-05-25 20:27:44 -06:00
Brian Paul 8f7c815568 mesa: fill in missing sampler object display list functions 2011-05-25 20:20:22 -06:00
Brian Paul 3e06803c2c st/mesa: simplify some st_context(ctx)->pipe code 2011-05-25 18:16:03 -06:00
Brian Paul bf14ab417c st/mesa: fix incorrect texture level/face/slice accesses
If we use FBOs to access mipmap levels with glRead/Draw/CopyPixels()
we need to be sure to access the correct mipmap level/face/slice.
Before, we were just passing zero in quite a few places.

This fixes the new piglit fbo-mipmap-copypix test.

NOTE: This is a candidate for the 7.10 branch.
2011-05-25 18:07:35 -06:00
Jakob Bornecrantz 1697dac642 i915g: Bump texture sizes
Spotted and tested by Christopher Egert.

Signed-off-by: Jakob Bornecrantz <wallbraker@gmail.com>
2011-05-25 22:06:11 +02:00
Eric Anholt b5846865de i965: Warnings cleanup.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-25 11:16:36 -07:00
Eric Anholt fa42de5ad7 i965: Fix assertion failures in unused brw_reg setup by deleting it.
I was using undefined values to create an unused value.  Go me.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=37366
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2011-05-25 11:16:36 -07:00
Alex Deucher 5ed7a7b720 r600g: remove duplicate opcode in r600_opcodes.h
V_SQ_CF_WORD1_SQ_CF_INST_HALT is 0x1f on both
evergreen and cayman.

Reported-by: Gustaw Smolarczyk <wielkiegie@gmail.com>
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
2011-05-25 11:04:25 -04:00
Chad Versace e7bcfadc22 intel: Change FBO validation criteria to accomodate hiz and seprate stencil
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-05-25 07:41:32 -07:00
Chad Versace ce8fdf666f intel: Fix intel_draw_buffer() to accomodate hiz and separate stencil
The logic of intel_draw_buffers() expected that stencil buffers were
always combined depth/stencil.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-05-25 07:41:32 -07:00
Chad Versace c270f1a628 intel: Add hiz_region to intel_mipmap_tree
When a texture is attached to multiple FBO's, a separate renderbuffer
wrapper is created for each attachment. This necessitates storing the hiz
region for these renderbuffers in the texture itself instead of the
renderbuffer wrapper.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-05-25 07:41:32 -07:00
Chad Versace 6ed829fe50 intel: Refactor the wrapping of textures with renderbuffers
Before this commit, the renderbuffer's region was updated in
intel_renderbuffer_texture(). This commit moves the update into
intel_update_wrapper(), which is a more logical location for updates.

This is in preparation for the next commit, which allocates and
updates the texture's hiz region in intel_update_wrapper(). Having the two
region updates located in the same function makes good form.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-05-25 07:41:32 -07:00
Chad Versace 7c0e6d9bbc intel: Add hiz_region to intel_renderbuffer
A hiz surface must be supplied to the hardware when rendering to a depth
buffer with hiz. There are three potential places to store that surface:
    1. Allocate a larger intel_region for the depthbuffer, and let the
       region's tail be the hiz surface.
    2. Allocate a separate intel_region for hiz, and store it as
       brw_context state.
    3. Allocate a separate intel_region for hiz, and store it in
       intel_renderbuffer.

We choose method 3.

Method 1 has not been chosen due to future complications it might cause
when requesting a DRI drawable's depth buffer attachment from X.

Method 2 has not been chosen because storing the hiz region apart from
the depth region makes lazy hiz/depth resolves difficult to implement.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-05-25 07:41:32 -07:00
Chad Versace a9e6509785 intel: Add is_hiz_depth_format() to intel_contex.vtbl
Given a format, is_hiz_depth_format() indicates if HiZ can be enabled on
a depthbuffer of that format.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-05-25 07:41:32 -07:00
Chad Versace 1a1411e09b intel: Allocate region for separate stencil buffer
... in intel_alloc_renderbuffer_storage().  The stencil buffer has quirky
pitch requirements, so its region allocation is a special case.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-05-25 07:41:32 -07:00
Chad Versace b5c847c7ca intel: Change supported texture formats for separate stencil
When hardware supports separate stencil, enable support for separate
depth/stencil texture formats in the table
intel_context.ctx.TextureFormatsSupported. If the hardware must use
separate stencil, then disable support for combined depth/stencil formats.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-05-25 07:41:31 -07:00
Chad Versace 4e0654ec29 mesa: Add MESA_FORMAT_X8_Z24 to _mesa_choose_tex_format
Prefer MESA_FORMAT_X8_Z24 over MESA_FORMAT_S8_Z24 for textures with
internal format GL_DEPTH_COMPONENT*.

i965 needs MESA_FORMAT_X8_Z24 for HiZ and separate stencil.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-05-25 07:41:31 -07:00
Chad Versace 76f77cb07e intel: Add flags to intel_context for hiz and separate stencil
Add the following flags:
    intel_context.has_separate_stencil
    intel_context.must_use_separate_stencil
    intel_context.has_hiz

The flags are currently set to false, and will be enabled for a given
chipset once the feature is completely implemented.

Since it may be some time before these features are completed, their
values can be overridden with environment variables INTEL_HIZ and
INTEL_SEPARATE_STENCIL. Valid values for these environment variables are
"0" and "1".

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Chad Versace <chad@chad-versace.us>
2011-05-25 07:41:31 -07:00
Adam Jackson a95ec18549 glx: Don't refer to the request buffer outside of {L,Unl}ockDisplay
... because that's not a safe thing to do.  The request buffer is shared
storage among all threads, and after UnlockDisplay the 'req' pointer may
point into someone else's request.

NOTE: This is a candidate for the 7.10 branch.

Signed-off-by: Adam Jackson <ajax@redhat.com>
2011-05-25 06:19:29 -04:00
Alex Deucher c44dad559a egl_dri2: add new cayman pci ids
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
2011-05-25 01:27:34 -04:00
Alex Deucher 017cd5dcc3 r600g: fix eg/cayman scissor workaround
Signed-off-by: Alex Deucher <alexdeucher@gmail.com>
2011-05-24 22:44:16 -04:00
Dave Airlie 868c04205c r600g: add workaround for buggy hw scissor on eg/cayman.
This is ported from the same fix to the DDX.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-25 11:50:17 +10:00
Dave Airlie 7779f6d1df r600g: add initial cayman acceleration support.
Cayman is the RadeonHD 69xx series of GPUs. This adds support for
3D acceleration to the r600g driver.

Major changes:
Some context registers moved around - mainly MSAA and clipping/guardband related.
GPR allocation is all dynamic
no vertex cache - all unified in texture cache.
5-wide to 4-wide shader engines (no scalar or trans slot)
	- some changes to how instructions are placed into slots
	- removal of END_OF_PROGRAM bit in favour of END flow control clause
	- no vertex fetch clause - TC accepts vertex or texture

Signed-off-by: Dave Airlie <airlied@redhat.com>
2011-05-25 11:42:45 +10:00