Commit Graph

34989 Commits

Author SHA1 Message Date
Dave Airlie 6da8129b3c r600g: add missing eg reg definition 2010-10-13 17:45:10 +10:00
Dave Airlie 92e729bba5 r600g: evergreen add stencil export bit 2010-10-13 17:40:32 +10:00
Dave Airlie 88c1b32c62 r600g: use blitter for hw copy region
at the moment depth copies are failing (piglit depth-level-clamp)
so use the fallback for now until get some time to investigate.
2010-10-13 15:55:49 +10:00
Dave Airlie f8778eeb40 r600g: drop all use of unsigned long
this changes size on 32/64 bit so is definitely no what you want to use here.
2010-10-13 15:55:48 +10:00
Dave Airlie e9acf9a3bb r600g: fix transfer stride.
fixes segfaults
2010-10-13 15:55:48 +10:00
Dave Airlie e3b089126c r600g: remove bpt and start using pitch_in_bytes/pixels.
this mirror changes in r300g, bpt is kinda useless when it comes to some
of the non-simple texture formats.
2010-10-13 15:55:48 +10:00
Dave Airlie fa797f12b3 r600g: rename pitch in texture to pitch_in_bytes 2010-10-13 15:55:47 +10:00
Dave Airlie 6a0066a69f r600g: use common texture object create function 2010-10-13 15:55:47 +10:00
Dave Airlie 771dd89881 r600g: split out miptree setup like r300g
just a cleanup step towards tiling
2010-10-13 15:55:47 +10:00
Dave Airlie 9979d60c0e r600g: add copy into tiled texture 2010-10-13 15:55:46 +10:00
Dave Airlie 5604276670 r600g: the vs/ps const arrays weren't actually being used.
completely removed them.
2010-10-13 15:56:12 +10:00
Dave Airlie d59498b780 r600g: reduce size of context structure.
this thing will be in the cache a lot, so having massive big struct
arrays inside it won't be helping anyone.
2010-10-13 15:25:00 +10:00
Vinson Lee 8c107e6ca6 tdfx: Silence unused variable warning on non-debug builds.
Fixes this GCC warning.
tdfx_texman.c: In function 'tdfxTMMoveOutTM_NoLock':
tdfx_texman.c:897: warning: unused variable 'shared'
2010-10-12 22:23:21 -07:00
Dave Airlie c8d4108fbe r600g: store samplers/views across blit when we need to modify them
also fixup framebuffer state copies to avoid bad state.
2010-10-13 15:11:30 +10:00
Dave Airlie a8d1d7253e r600g: fix scissor/cliprect confusion
gallium calls them scissors, but r600 hw like r300 is better off using
cliprects to implement them as we can turn them on/off a lot easier.
2010-10-13 15:11:30 +10:00
Dave Airlie 833b4fc11e r600g: fix depth0 setting 2010-10-13 15:11:30 +10:00
Vinson Lee 71fa3f8fe2 r300: Silence uninitialized variable warning.
Fixes this GCC warning.
r300_state.c: In function 'r300InvalidateState':
r300_state.c:2247: warning: 'hw_format' may be used uninitialized in this function
r300_state.c:2247: note: 'hw_format' was declared here
2010-10-12 22:02:27 -07:00
Brian Paul 39de9251c4 mesa: reformatting, comments, code movement 2010-10-12 19:04:05 -06:00
Brian Paul 048a90c1cb draw/llvmpipe: replace DRAW_MAX_TEXTURE_LEVELS with PIPE_MAX_TEXTURE_LEVELS
There's no apparent reason for the former to exist.  And they didn't
even have the same value.
2010-10-12 19:04:05 -06:00
Brian Paul 50f221a01b gallivm: remove newlines 2010-10-12 19:04:05 -06:00
Roland Scheidegger c1549729ce gallivm: fix different handling of [non]normalized coords in linear soa path
There seems to be no reason for it, so do same math for both
(except the scale mul, of course).
2010-10-13 02:35:05 +02:00
Brian Paul 1ca5f7cc31 mesa: remove assertion w/ undeclared variable texelBytes 2010-10-12 18:32:06 -06:00
Dave Airlie 5f612f5c00 st/mesa: enable stencil shader export extension if supported 2010-10-13 09:30:05 +10:00
Dave Airlie d9671863ea glsl: add support for shader stencil export
This adds proper support for the GL_ARB_shader_stencil_export extension
to the GLSL compiler. Thanks to Ian for pointing out where I need to add things.
2010-10-13 09:30:05 +10:00
Dave Airlie 39d1feb51e r600g: add shader stencil export support. 2010-10-13 09:30:05 +10:00
Dave Airlie 40acb109de r600g: add support for S8, X24S8 and S8X24 sampler formats. 2010-10-13 09:30:04 +10:00
Dave Airlie ef8bb7ada9 st/mesa: use shader stencil export to accelerate shader drawpixels.
If the pipe driver has shader stencil export we can accelerate DrawPixels
using it. It tries to pick an S8 texture and works its way to X24S8 and S8X24
if that isn't supported.
2010-10-13 09:30:04 +10:00
Dave Airlie 06642c6175 st/mesa: add option to choose a texture format that we won't render to.
We need a texture to put the drawpixels stuff into, an S8 texture is less
memory/bandwidth than the 32-bit X24S8, but we might not be able to render
directly to an S8, so this lets us specify we won't be rendering to this
texture.
2010-10-13 09:30:04 +10:00
Dave Airlie d8f6ef4565 softpipe: add support for shader stencil export capability
this allows softpipe to be used to test shader stencil ref exporting.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-13 09:30:04 +10:00
Dave Airlie c79e681a68 mesa: improve texstore for 8/24 formats and add texstore for S8.
this improves mesa texstore for 8/24 so it can create S24X8/X24S8 variants
by keeping the depth bits static.

it also adds a texstore for S8 so we can write out an S8 texture to use
in the sampler for accel draw pixels to save memory bw.

The logic seems sound here, I've worked it out a few times on paper, though
it would be good to have some review.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-13 09:30:04 +10:00
Dave Airlie bec341d00c mesa: add support for FRAG_RESULT_STENCIL.
this is needed to add support for stencil shader export.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-13 09:30:03 +10:00
Dave Airlie d02993c9dc gallium/util: add S8 tile sampling support. 2010-10-13 09:30:03 +10:00
Dave Airlie 67e71429f1 gallium/format: add X32_S8X24_USCALED format.
Has similiar use cases to the S8X24 and X24S8 formats.
2010-10-13 09:30:03 +10:00
Dave Airlie 66a0d1e4b9 gallium/format: add support for X24S8 and S8X24 formats.
these formats are needed for hw that can sample and write stencil values.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-13 09:30:03 +10:00
Dave Airlie 4ecb2c105d gallium/tgsi: add support for stencil writes.
this adds the capability + a stencil semantic id, + tgsi scan support.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-13 09:30:02 +10:00
Eric Anholt 43873b53c4 i965: Don't rebase the index buffer to min 0 if any arrays are in VBOs.
There was a check to only do the rebase if we didn't have everything
in VBOs, but nexuiz apparently hands us a mix of VBOs and arrays,
resulting in blocking on the GPU to do a rebase.

Improves nexuiz 800x600, high-settings performance on my Ironlake 41%
(+/- 1.3%), from 14.0fps to 19.7fps.
2010-10-12 15:17:35 -07:00
Eric Anholt 3316a54205 intel: Allow CopyTexSubImage to InternalFormat 3/4 textures, like RGB/RGBA.
The format selection of the CopyTexSubImage is pretty bogus still, but
this at least avoids software fallbacks in nexuiz, bringing
performance from 7.5fps to 12.8fps on my machine.
2010-10-12 14:08:00 -07:00
Eric Anholt 080e7aface i965: Fix missing "break;" in i2b/f2b, and missing AND of CMP result.
Fixes glsl-fs-i2b.
2010-10-12 13:07:40 -07:00
Ian Romanick 9fea9e5e21 glsl: Fix incorrect assertion
This assertion was added in commit f1c1ee11, but it did not notice
that the array is accessed with 'size-1' instead of 'size'.  As a
result, the assertion was off by one.  This caused failures in at
least glsl-orangebook-ch06-bump.
2010-10-12 12:50:29 -07:00
Ian Romanick b2b9b22c10 mesa: Validate assembly shaders when GLSL shaders are used
If an GLSL shader is used that does not provide all stages and
assembly shaders are provided for the missing stages, validate the
assembly shaders.

Fixes bugzilla #30787 and piglit tests glsl-invalid-asm0[12].

NOTE: this is a candidate for the 7.9 branch.
2010-10-12 10:54:28 -07:00
Keith Whitwell 7533c37457 llvmpipe: make sure intrinsics code is guarded with PIPE_ARCH_SSE 2010-10-12 18:28:12 +01:00
Thomas Hellstrom 893620e52e st/xorg: Fix typo
Pointed out by Jakob Bornecrantz.

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 18:26:05 +02:00
Brian Paul f1c1ee11d3 ir_to_mesa: assorted clean-ups, const qualifiers, new comments 2010-10-12 09:26:54 -06:00
José Fonseca 6fbd4faf97 gallivm: Name anonymous union. 2010-10-12 16:08:09 +01:00
Brian Paul 0ad9d8b538 st/xlib: add some comments 2010-10-12 08:54:54 -06:00
Brian Paul 3633e1f538 glsl2: fix signed/unsigned comparison warning 2010-10-12 08:54:16 -06:00
José Fonseca e3ec0fdd54 llmvpipe: improve mm_mullo_epi32
Apply Jose's suggestions for a small but measurable improvement in
isosurf.
2010-10-12 14:17:21 +01:00
Thomas Hellstrom b6b7ce84e5 st/xorg: Don't try to remove invalid fbs
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 15:09:05 +02:00
Thomas Hellstrom 201c3d3669 xorg/vmwgfx: Don't hide HW cursors when updating them
Gets rid of annoying cursor flicker

Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 15:09:05 +02:00
Thomas Hellstrom bfd065c71e st/xorg: Add a customizer option to get rid of annoying cursor update flicker
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 15:09:05 +02:00
Thomas Hellstrom f0bbf130f9 xorg/vmwgfx: Make vmwarectrl work also on 64-bit servers
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 15:09:04 +02:00
Thomas Hellstrom ec08047a80 st/xorg: Don't try to use option values before processing options
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 15:09:04 +02:00
Keith Whitwell 0ca0382d1b Revert "llvmpipe: try to keep plane c values small"
This reverts commit 9773722c2b.

Looks like there are some floor/rounding issues here that need
to be better understood.
2010-10-12 13:20:39 +01:00
Keith Whitwell 22ec25e2bf gallivm: don't branch on KILLs near end of shader 2010-10-12 13:14:51 +01:00
Keith Whitwell d0eb854f58 r600g: add missing file to sconscript 2010-10-12 13:08:34 +01:00
Keith Whitwell 1a574afabc gallium: move sse intrinsics debug helpers to u_sse.h 2010-10-12 13:02:28 +01:00
José Fonseca 39331be44e llvmpipe: Fix MSVC build.
MSVC doesn't accept more than 3 __m128i arguments.
2010-10-12 12:27:55 +01:00
Keith Whitwell b4277bc584 llvmpipe: fix typo in last commit 2010-10-12 11:52:39 +01:00
Keith Whitwell 9773722c2b llvmpipe: try to keep plane c values small
Avoid accumulating more and more fixed point bits.
2010-10-12 11:50:14 +01:00
Keith Whitwell 9d59e148f8 llvmpipe: add debug helpers for epi32 etc 2010-10-12 11:50:13 +01:00
Keith Whitwell 2cf98d5a6d llvmpipe: try to do more of rast_tri_3_16 with intrinsics
There was actually a large quantity of scalar code in these functions
previously.  This tries to move more into intrinsics.

Introduce an sse2 mm_mullo_epi32 replacement to avoid sse4 dependency
in the new rasterization code.
2010-10-12 11:50:07 +01:00
José Fonseca 4cb3b4ced8 llvmpipe: Do not dispose the execution engine.
The engine is a global owned by gallivm module.
2010-10-12 08:36:51 +01:00
Francisco Jerez c25fcf5aa5 nouveau: Get larger push buffers.
Useful to amortize the command submission/reloc overhead (e.g. etracer
goes from 72 to 109 FPS on nv4b).
2010-10-12 04:13:22 +02:00
Francisco Jerez 70828aa246 dri/nouveau: Initialize tile_flags when allocating a render target. 2010-10-12 04:12:56 +02:00
Dave Airlie 965f69cb0c r600g: fix typo in vertex sampling on r600
fixes https://bugs.freedesktop.org/show_bug.cgi?id=30771

Reported-by: Kevin DeKorte
2010-10-12 09:45:22 +10:00
Eric Anholt bcec03d527 i965: Always use the new FS backend on gen6.
It's now much more correct for gen6 than the old backend, with just 2
regressions I've found (one of which is common with pre-gen6 and will
be fixed by an array splitting IR pass).

This does leave the old Mesa IR backend getting used still when we
don't have GLSL IR, but the plan is to get GLSL IR input to the driver
for the ARB programs and fixed function by the next release.
2010-10-11 15:32:41 -07:00
Eric Anholt 0cadd32b6d i965: Fix gen6 pixel_[xy] setup to avoid mixing int and float src operands.
Pre-gen6, you could mix int and float just fine.  Now, you get goofy
results.

Fixes:
glsl-arb-fragment-coord-conventions
glsl-fs-fragcoord
glsl-fs-if-greater
glsl-fs-if-greater-equal
glsl-fs-if-less
glsl-fs-if-less-equal
2010-10-11 15:26:59 -07:00
Eric Anholt 17306c60ad i965: Don't compute-to-MRF in gen6 VS math.
There was code to do this for pre-gen6 already, this just enables it
for gen6 as well.
2010-10-11 15:26:59 -07:00
Eric Anholt 720ed3c906 i965: Expand uniform args to gen6 math to full registers to get hstride == 1.
This is a hw requirement in math args.  This also is inefficient, as
we're calculating the same result 8 times, but then we've been doing
that on pre-gen6 as well.  If we're doing math on uniforms, though,
we'd probably be better served by having some sort of mechanism for
precalculating those results into another uniform value to use.

Fixes 7 piglit math tests.
2010-10-11 15:26:58 -07:00
Eric Anholt 317dbf4613 i965: Don't compute-to-MRF in gen6 math instructions. 2010-10-11 15:26:58 -07:00
Eric Anholt 7b5bc38c44 i965: Add a couple of checks for gen6 math instruction limits. 2010-10-11 15:26:58 -07:00
Eric Anholt 25cf241540 i965: Don't consider gen6 math instructions to write to MRFs.
This was leftover from the pre-gen6 cleanups.  One tests regresses
where compute-to-MRF now occurs.
2010-10-11 15:26:58 -07:00
Chad Versace 41c2079855 glsl: Changes in generated file glsl_lexer.cpp
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2010-10-11 14:25:53 -07:00
Chad Versace 0c9fef6111 glsl: Add lexer rules for uint and uvecN (N=2..4)
Commit for generated file glsl_lexer.cpp follows this commit.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2010-10-11 14:25:48 -07:00
Chad Versace fc99a3beb9 glsl: Add glsl_type::uvecN_type for N=2,3
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2010-10-11 14:25:44 -07:00
Chad Versace a34817917b intel_extensions: Add ability to set GLSL version via environment
Add ability to set the GLSL version used by the GLcontext by setting the
environment variable INTEL_GLSL_VERSION. For example,
    env INTEL_GLSL_VERSION=130 prog args
If the environment variable is missing, the GLSL versions defaults to 120.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2010-10-11 14:25:30 -07:00
Daniel Vetter 603741a86d r200: revalidate after radeon_update_renderbuffers
By calling radeon_draw_buffers (which sets the necessary flags
in radeon->NewGLState) and revalidating if NewGLState is non-zero
in r200TclPrimitive. This fixes an assert in libdrm (the color-/
depthbuffer was changed but not yet validated) and and stops the
kernel cs checker from complaining about them (when they're too
small).

Thanks to Mario Kleiner for the hint to call radeon_draw_buffer
(instead of my half-broken hack).

v2: Also fix the swtcl r200 path.

Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-10-11 15:16:44 -04:00
Eric Anholt c6dbf253d2 i965: Compute to MRF in the new FS backend.
This didn't produce a statistically significant performance difference
in my demo (n=4) or nexuiz (n=3), but it still seems like a good idea
and is recommended by the HW team.
2010-10-11 12:08:13 -07:00
Eric Anholt 06fd639c51 i965: Give the FB write and texture opcodes the info on base MRF, like math. 2010-10-11 12:07:33 -07:00
Eric Anholt 0cd6cea8a3 i965: Give the math opcodes information on base mrf/mrf len.
This is progress towards enabling a compute-to-MRF pass.
2010-10-11 12:03:34 -07:00
Eric Anholt 37758fb1cb i965: Move FS backend structures to a header.
It's time to start splitting some of this up.
2010-10-11 11:52:02 -07:00
Eric Anholt 251fe27854 i965: Reduce register interference checks for changed FS_OPCODE_DISCARD.
While I don't know of any performance changes from this (once extra
reg available out of 128), it makes the generated asm a lot cleaner
looking.
2010-10-11 11:52:01 -07:00
Eric Anholt 90c4022040 i965: Split FS_OPCODE_DISCARD into two steps.
Having the single opcode write then read the reg meant that single
instruction opcodes had to consider their source regs to interfere
with their dest regs.
2010-10-11 11:52:01 -07:00
José Fonseca 986cb9d5cf llvmpipe: Use lp_tgsi_info. 2010-10-11 13:06:25 +01:00
José Fonseca 7c1b5772a8 gallivm: More detailed analysis of tgsi shaders.
To allow more optimizations, in particular for direct textures.
2010-10-11 13:05:32 +01:00
José Fonseca 11dad21718 tgsi: Export some names for some tgsi enums.
Useful to give human legible names in other cases.
2010-10-11 13:05:31 +01:00
José Fonseca 6c1aa4fd49 gallium: Define C99 restrict keyword where absent. 2010-10-11 13:05:31 +01:00
José Fonseca e1003336f0 gallivm: Eliminate unsigned integer arithmetic from texture coordinates.
SSE support for 32bit and 16bit unsigned arithmetic is not complete, and
can easily result in inefficient code.

In most cases signed/unsigned doesn't make a difference, such as for
integer texture coordinates.

So remove uint_coord_type and uint_coord_bld to avoid inefficient
operations to sneak in the future.
2010-10-11 08:14:09 +01:00
José Fonseca b18fecbd0e llvmpipe: Remove outdated comment about stencil testing. 2010-10-11 08:14:09 +01:00
Dave Airlie 3322416de4 r600g: don't run with scissors.
This could probably be done much nicer, I've spent a day chasing
a coherency problem in the kernel, that turned out to be incorrect
scissor setup.
2010-10-11 16:23:23 +10:00
Dave Airlie ef2702fb20 r600g: add TXL opcode support.
fixes glsl1-2D Texture lookup with explicit lod (Vertex shader)
2010-10-11 12:18:05 +10:00
Dave Airlie ea1d818b58 r600g: enable vertex samplers.
We need to move the texture sampler resources out of the range of the vertex attribs.

We could probably improve this using an allocator but this is the simple answer for now.

makes mesa-demos/src/glsl/vert-tex work.
2010-10-11 11:59:53 +10:00
Dave Airlie 2c47f302af r600g: evergreen has no request size bit in texture word4 2010-10-11 11:59:53 +10:00
Dave Airlie bd89da79a1 r600g: fix input/output Z export mixup for evergreen. 2010-10-11 11:59:53 +10:00
José Fonseca 17dbd41cf2 gallivm: Pass texture coords derivates as scalars.
We end up treating them as scalars in the end, and it saves some
instructions.
2010-10-10 19:51:35 +01:00
José Fonseca 693667bf88 gallivm: Use variables instead of Phis in loops.
With this commit all explicit Phi emission is now gone.
2010-10-10 19:05:05 +01:00
José Fonseca 48003f3567 gallivm: Allow to disable bri-linear filtering with GALLIVM_DEBUG=no_brilinear runtime option 2010-10-10 18:48:02 +01:00
José Fonseca 124adf253c gallivm: Fix a long standing bug with nested if-then-else emission.
We can't patch true-block at end-if time, as there is no guarantee that
the block at the beginning of the true stanza is the same at the end of
the true stanza -- other control flow elements may have been emitted half
way the true stanza.

Although this bug surfaced recently with the commit to skip mip filtering
when lod is an integer the bug was always there, although probably it
was avoided until now: e.g., cubemap selection nests if-then-else on the
else stanza, which does not suffer from the same problem.
2010-10-10 18:48:02 +01:00
delphi 08f890d4c3 draw: some changes to allow for runtime changes to userclip planes 2010-10-10 08:40:11 +01:00
Francisco Jerez e2acc7be26 dri/nv10: Fake fast Z clears for pre-nv17 cards. 2010-10-10 04:14:34 +02:00
Francisco Jerez 35a1893fd1 dri/nouveau: Minor cleanup. 2010-10-10 01:48:01 +02:00
José Fonseca 307df6a858 gallivm: Cleanup the rest of the flow module. 2010-10-09 21:39:14 +01:00
José Fonseca d0ea464159 gallivm: Simplify if/then/else implementation.
No need for for a flow stack anymore.
2010-10-09 21:14:05 +01:00
José Fonseca 1949f8c315 gallivm: Factor out the SI->FP texture size conversion for SoA path too 2010-10-09 20:26:11 +01:00
José Fonseca d45c379027 gallivm: Remove support for Phi generation.
Simply rely on mem2reg pass. It's easier and more reliable.
2010-10-09 20:14:03 +01:00
José Fonseca ea7b49028b gallivm: Use varilables instead of Phis for cubemap selection. 2010-10-09 19:53:21 +01:00
José Fonseca cc40abad51 gallivm: Don't generate Phis for execution mask. 2010-10-09 12:55:31 +01:00
José Fonseca 679dd26623 gallivm: Special bri-linear computation path for unmodified rho. 2010-10-09 12:13:00 +01:00
José Fonseca 81a09c8a97 gallivm: Less code duplication in log computation. 2010-10-09 12:12:59 +01:00
José Fonseca 52427f0ba7 util: Defined M_SQRT2 when not available. 2010-10-09 12:12:59 +01:00
José Fonseca 53d7f5e107 gallivm: Handle code have ret correctly.
Stop disassembling on unconditional backwards jumps.
2010-10-09 12:12:59 +01:00
José Fonseca edba53024f llvmpipe: Fix MSVC build. Enable the new SSE2 code on non SSE3 systems. 2010-10-09 12:12:58 +01:00
Keith Whitwell 2de720dc8f llvmpipe: simplified SSE2 swz/unswz routines
We've been using these in the linear path for a while now.  Based on
Chris's SSSE3 code, but using only sse2 opcodes.  Speed seems to be
identical, but code is simpler & removes dependency on SSE3.

Should be easier to extend to other rgba8 formats.
2010-10-09 12:12:58 +01:00
Keith Whitwell 5b7eb868fd llvmpipe: clean up shader pre/postamble, try to catch more early-z
Specifically, can do early-depth-test even when alpahtest or
kill-pixel are active, providing we defer the actual z write until the
final mask is avaialable.

Improves demos/fire.c especially in the case where you get close to
the trees.
2010-10-09 11:44:45 +01:00
Keith Whitwell aa4cb5e2d8 llvmpipe: try to be sensible about whether to branch after mask updates
Don't branch more than once in quick succession.  Don't branch at the
end of the shader.
2010-10-09 11:44:45 +01:00
Keith Whitwell 2ef6f75ab4 gallivm: simpler uint8->float conversions
LLVM seems to finds it easier to reason about these than our
mantissa-manipulation code.
2010-10-09 11:44:45 +01:00
Keith Whitwell c79f162367 gallivm: prefer blendvb for integer arguments 2010-10-09 11:44:45 +01:00
Keith Whitwell d2cf757f44 gallivm: specialized x8z24 depthtest path
Avoid unnecessary masking of non-existant stencil component.
2010-10-09 11:44:09 +01:00
Keith Whitwell 954965366f llvmpipe: dump fragment shader ir and asm when LP_DEBUG=fs
Better than GALLIVM_DEBUG if you're only interested in fragment shaders.
2010-10-09 11:43:23 +01:00
Keith Whitwell 6da29f3611 llvmpipe: store zero into all alloca'd values
Fixes slowdown in isosurf with earlier versions of llvm.
2010-10-09 11:43:23 +01:00
Keith Whitwell 40d7be5261 llvmpipe: use alloca for fs color outputs
Don't try to emit our own phi's, let llvm mem2reg do it for us.
2010-10-09 11:43:23 +01:00
Keith Whitwell 8009886b00 llvmpipe: defer attribute interpolation until after mask and ztest
Don't calculate 1/w for quads which aren't visible...
2010-10-09 11:42:48 +01:00
José Fonseca d0bfb3c514 llvmpipe: Prevent z > 1.0
The current interpolation schemes causes precision loss.

Changing the operation order helps, but does not completely avoid the
problem.

The only short term solution is to clamp z to 1.0.

This is unfortunate, but probably unavoidable until interpolation is
improved.
2010-10-09 09:35:41 +01:00
José Fonseca 34c11c87e4 gallivm: Do size computations simultanously for all dimensions (AoS).
Operate simultanouesly on <width, height, depth> vector as much as possible,
instead of doing the operations on vectors with broadcasted scalars.

Also do the 24.8 fixed point scalar with integer shift of the texture size,
for unnormalized coordinates.

AoS path only for now -- the same thing can be done for SoA.
2010-10-09 09:34:31 +01:00
Zack Rusin 6316d54056 llvmpipe: fix rasterization of vertical lines on pixel boundaries 2010-10-09 08:19:21 +01:00
Vinson Lee e7843363a5 i965: Initialize member variables.
Fixes these GCC warnings.
brw_wm_fp.c: In function 'search_or_add_const4f':
brw_wm_fp.c:92: warning: 'reg.Index2' is used uninitialized in this function
brw_wm_fp.c:84: note: 'reg.Index2' was declared here
brw_wm_fp.c:92: warning: 'reg.RelAddr2' is used uninitialized in this function
brw_wm_fp.c:84: note: 'reg.RelAddr2' was declared here
2010-10-08 16:40:29 -07:00
Vinson Lee 5abd498c47 i965: Silence unused variable warning on non-debug builds.
Fixes this GCC warning.
brw_vs.c: In function 'do_vs_prog':
brw_vs.c:46: warning: unused variable 'ctx'
2010-10-08 16:30:59 -07:00
Vinson Lee 978ffa1d61 i965: Silence unused variable warning on non-debug builds.
Fixes this GCC warning.
brw_eu_emit.c: In function 'brw_math2':
brw_eu_emit.c:1189: warning: unused variable 'intel'
2010-10-08 16:02:59 -07:00
Vinson Lee 220c0834a4 i915: Silence unused variable warning in non-debug builds.
Fixes this GCC warning.
i915_vtbl.c: In function 'i915_assert_not_dirty':
i915_vtbl.c:670: warning: unused variable 'dirty'
2010-10-08 15:49:02 -07:00
Roland Scheidegger ff72c79924 gallivm: make use of new iround code in lp_bld_conv.
Only requires sse2 now.
2010-10-09 00:36:38 +02:00
Roland Scheidegger 175cdfd491 gallivm: optimize soa linear clamp to edge wrap mode a bit
Clamp against 0 instead of -0.5, which simplifies things.
The former version would have resulted in both int coords being zero
(in case of coord being smaller than 0) and some "unused" weight value,
whereas now the int coords will be 0 and 1, but weight will be 0, hence the
lerp should produce the same value.
Still not happy about differences between normalized and non-normalized...
2010-10-09 00:36:38 +02:00
Roland Scheidegger 2cc6da85d6 gallivm: avoid unnecessary URem in linear wrap repeat case
Haven't looked at what code this exactly generates but URem can't be fast.
Instead of using two URem only use one and replace the second one with
select/add (this is what the corresponding aos code already does).
2010-10-09 00:36:38 +02:00
Roland Scheidegger 318bb080b0 gallivm: more linear tex wrap mode calculation simplification
Rearrange order of operations a bit to make some clamps easier.
All calculations should be equivalent.
Note there seems to be some inconsistency in the clamp to edge case
wrt normalized/non-normalized coords, could potentially simplify this too.
2010-10-09 00:36:38 +02:00
Roland Scheidegger 99ade19e6e gallivm: optimize some tex wrap mode calculations a bit
Sometimes coords are clamped to positive numbers before doing conversion
to int, or clamped to 0 afterwards, in this case can use itrunc
instead of ifloor which is easier. This is only the case for nearest
calculations unfortunately, except linear MIRROR_CLAMP_TO_EDGE which
for the same reason can use a unsigned float build context so the
ifloor_fract helper can reduce this to itrunc in the ifloor helper itself.
2010-10-09 00:36:38 +02:00
Roland Scheidegger 1e17e0c4ff gallivm: replace sub/floor/ifloor combo with ifloor_fract 2010-10-09 00:36:37 +02:00
Roland Scheidegger cb3af2b434 gallivm: faster iround implementation for sse2
sse2 supports round to nearest directly (or rather, assuming default nearest
rounding mode in MXCSR). Use intrinsic to use this rather than round (sse41)
or bit manipulation whenever possible.
2010-10-09 00:36:37 +02:00
Roland Scheidegger 0ed8c56bfe gallivm: fix trunc/itrunc comment
trunc of -1.5 is -1.0 not 1.0...
2010-10-09 00:36:37 +02:00
Vinson Lee 0f4984a0fb i915: Silence unused variable warning in non-debug builds.
Fixes this GCC warning.
i830_vtbl.c: In function 'i830_assert_not_dirty':
i830_vtbl.c:704: warning: unused variable 'i830'
2010-10-08 15:35:35 -07:00
Ian Romanick 0ea8b99332 glsl: Remove const decoration from inlined function parameters
The constness of the function parameter gets inlined with the rest of
the function.  However, there is also an assignment to the parameter.
If this occurs inside a loop the loop analysis code will get confused
by the assignment to a read-only variable.

Fixes bugzilla #30552.

NOTE: this is a candidate for the 7.9 branch.
2010-10-08 14:29:11 -07:00
Ian Romanick dc459f8756 intel: Enable GL_ARB_explicit_attrib_location 2010-10-08 14:21:23 -07:00
Ian Romanick dbc6c9672d main: Enable GL_ARB_explicit_attrib_location for swrast 2010-10-08 14:21:23 -07:00
Ian Romanick 68a4fc9d5a glsl: Add linker support for explicit attribute locations 2010-10-08 14:21:23 -07:00
Ian Romanick eee68d3631 glsl: Track explicit location in AST to IR translation 2010-10-08 14:21:23 -07:00
Ian Romanick 2b45ba8bce glsl: Regenerate files changes by previous commit 2010-10-08 14:21:23 -07:00
Ian Romanick 7f68cbdc4d glsl: Add parser support for GL_ARB_explicit_attrib_location layouts
Only layout(location=#) is supported.  Setting the index requires GLSL
1.30 and GL_ARB_blend_func_extended.
2010-10-08 14:21:22 -07:00
Ian Romanick eafebed5bd glcpp: Regenerate files changes by previous commit 2010-10-08 14:21:22 -07:00
Ian Romanick e0c9f67be5 glcpp: Add the define for ARB_explicit_attrib_location when present 2010-10-08 14:21:22 -07:00
Ian Romanick 5ed6610d11 glsl: Regenerate files modified by previous commits 2010-10-08 14:21:22 -07:00
Ian Romanick e24d35a5b5 glsl: Wrap ast_type_qualifier contents in a struct in a union
This will ease adding non-bit fields in the near future.
2010-10-08 14:21:22 -07:00
Ian Romanick 5ff4cfb788 glsl: Clear type_qualifier using memset 2010-10-08 14:21:22 -07:00
Ian Romanick fd2aa7d313 glsl: Slight refactor of error / warning checking for ARB_fcc layout 2010-10-08 14:21:22 -07:00
Ian Romanick dd93035a4d glsl: Refactor 'layout' grammar to match GLSL 1.60 spec grammar 2010-10-08 14:21:22 -07:00
Ian Romanick 4b5489dd6f glsl: Fail linking if assign_attribute_locations fails 2010-10-08 14:21:22 -07:00
Vinson Lee 3b16c591a4 r600g: Silence uninitialized variable warning. 2010-10-08 14:17:14 -07:00
Vinson Lee 36b65a373a r600g: Silence uninitialized variable warning. 2010-10-08 14:14:16 -07:00
Vinson Lee 131485efae r600g: Silence uninitialized variable warning. 2010-10-08 14:08:50 -07:00
Vinson Lee 5e90971475 gallivm: Remove unnecessary header. 2010-10-08 14:03:10 -07:00
Eric Anholt c52a0b5c7d i965: Add register coalescing to the new FS backend.
Improves performance of my GLSL demo 14.3% (+/- 4%, n=4) by
eliminating the moves used in ir_assignment and ir_swizzle handling.
Still 16.5% to go to catch up to the Mesa IR backend, presumably
because instructions are almost perfectly mis-scheduled now.
2010-10-08 13:22:27 -07:00
Eric Anholt 80c0077a6f i965: Enable attribute swizzling (repositioning) in the gen6 SF.
We were trying to remap a fully-filled array down to only handing the
WM the components it uses.  This is called attribute swizzling, and if
you don't enable it you just get 1:1 mappings of inputs to outputs.

This almost fixes glsl-routing, except for the highest gl_TexCoord[]
indices.
2010-10-08 12:00:04 -07:00
Eric Anholt cac04a9397 i965: Fix new FS gen6 interpolation for sparsely-populated arrays.
We'd overwrite the same element twice.
2010-10-08 11:59:19 -07:00
Eric Anholt 624ce6f61b i965: Fix gen6 WM push constants updates.
We would compute a new buffer, but never point the hardware at the new
buffer.  This partially fixes glsl-routing, as now it get the updated
uniform for which attribute to draw.
2010-10-08 11:59:19 -07:00
José Fonseca 3fde8167a5 gallivm: Help for combined extraction and broadcasting.
Doesn't change generated code quality, but saves some typing.
2010-10-08 19:48:16 +01:00
José Fonseca 438390418d llvmpipe: First minify the texture size, then broadcast. 2010-10-08 19:11:52 +01:00
José Fonseca f5b5fb32d3 gallivm: Move into the as much of the second level code as possible.
Also, pass more stuff trhough the sample build context, instead of
arguments.
2010-10-08 19:11:52 +01:00
Eric Anholt 5b24d69fcd i965: Handle swizzles in the addition of YUV texture constants.
If someone happened to land a set in a different swizzle order, we
would have assertion failed.
2010-10-08 10:24:30 -07:00
Eric Anholt 0534e958c9 i965: Drop the check for YUV constants in the param list.
_mesa_add_unnamed_constant() already does that.
2010-10-08 10:24:29 -07:00
Eric Anholt fa8aba9da4 i965: Drop the check for duplicate _mesa_add_state_reference.
_mesa_add_state_reference does that check for us anyway.
2010-10-08 10:24:29 -07:00
Eric Anholt e310c22bb7 mesa: Simplify a bit of _mesa_add_state_reference using memcmp. 2010-10-08 10:24:29 -07:00
José Fonseca 6b0c79e058 gallivm: Warn when doing inefficient integer comparisons. 2010-10-08 17:43:15 +01:00
José Fonseca d5ef59d8b0 gallivm: Avoid control flow for two-sided stencil test. 2010-10-08 17:43:15 +01:00
Keith Whitwell ef3407672e llvmpipe: fix off-by-one in tri_16 2010-10-08 17:30:08 +01:00
Keith Whitwell 0ff132e5a6 llvmpipe: add rast_tri_4_16 for small lines and points 2010-10-08 17:30:08 +01:00
Keith Whitwell eeb13e2352 llvmpipe: clean up setup_tri a little 2010-10-08 17:30:08 +01:00
Keith Whitwell e191bf4a85 gallivm: round rather than truncate in new 4x4f->1x16ub conversion path 2010-10-08 17:30:08 +01:00
José Fonseca f91b4266c6 gallivm: Use the wrappers for SSE pack intrinsics.
Fixes assertion failures on LLVM 2.6.
2010-10-08 17:30:08 +01:00
Keith Whitwell 607e3c542c gallivm: special case conversion 4x4f to 1x16ub
Nice reduction in the number of operations required for final color
output in many shaders.
2010-10-08 17:30:08 +01:00
Keith Whitwell 29d6a1483d llvmpipe: avoid overflow in triangle culling
Avoid multiplying fixed-point values.  Calculate triangle area in
floating point use that for culling.

Lift area calculations up a level as we are already doing this in the
triangle_both() case.

Would like to share the calculated area with attribute interpolation,
but the way the code is structured makes this difficult.
2010-10-08 17:30:08 +01:00
Keith Whitwell ad6730fadb llvmpipe: fail gracefully on oom in scene creation 2010-10-08 17:26:29 +01:00
José Fonseca eb605701aa gallivm: Implement brilinear filtering. 2010-10-08 15:50:28 +01:00
José Fonseca c8179ef5e8 gallivm: Fix copy'n'paste typo in previous commit. 2010-10-08 14:09:22 +01:00
José Fonseca df7a2451b1 gallivm: Clamp mipmap level and zero mip weight simultaneously. 2010-10-08 14:06:38 +01:00
José Fonseca 0d84b64a4f gallivm: Use lp_build_ifloor_fract for lod computation.
Forgot this one before.
2010-10-08 14:06:38 +01:00
José Fonseca 4f2e2ca4e3 gallivm: Don't compute the second mipmap level when frac(lod) == 0 2010-10-08 14:06:37 +01:00
José Fonseca 05fe33b71c gallivm: Simplify lp_build_mipmap_level_sizes' interface. 2010-10-08 14:06:37 +01:00
José Fonseca 4eb222a3e6 gallivm: Do not do mipfiltering when magnifying.
If lod < 0, then invariably follows that ilevel0 == ilevel1 == 0.
2010-10-08 14:06:37 +01:00
Vinson Lee 1f01f5cfcf r600g: Remove unnecessary header. 2010-10-08 04:56:49 -07:00
Dave Airlie 8d6a38d7b3 r600g: drop width/height per level storage.
these aren't used anywhere, so just waste memory.
2010-10-08 19:55:05 +10:00
Eric Anholt bbb840049e i965: Normalize cubemap coordinates like is done in the Mesa IR path.
Fixes glsl-fs-texturecube-2-*
2010-10-07 16:41:13 -07:00
Eric Anholt 4d202da7a4 i965: Disable emitting if () statements on gen6 until we really fix them. 2010-10-07 16:41:13 -07:00
Dave Airlie 1ae5cc2e67 r600g: add some RG texture format support. 2010-10-08 09:37:02 +10:00
Kristian Høgsberg 1d595c7cd4 gles2: Add GL_EXT_texture_format_BGRA8888 support 2010-10-07 17:08:50 -04:00
José Fonseca 321ec1a224 gallivm: Vectorize the rho computation. 2010-10-07 22:08:42 +01:00
Dave Airlie 51f9cc4759 r600g: fix Z export enable bits.
we should be checking output array not input to decide.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-07 15:32:05 +10:00
Dave Airlie 97eea87bde r600g: use format from the sampler view not from the texture.
we want to use the format from the sampler view which isn't always the
same as the texture format when creating sampler views.
2010-10-07 15:17:28 +10:00
Andre Maasikas 84457701b0 r600g: fix evergreen interpolation setup
interp data is stored in gpr0 so first interp overwrote it
and subsequent ones got wrong values

reserve register 0 so it's not used for attribs.
alternative is to interpolate attrib0 last (reverse, as r600c does)
2010-10-07 07:51:32 +03:00
Chia-I Wu b2c0ef8b51 st/vega: Fix version check in context creation.
This fixes a regression since 4531356817.
2010-10-07 12:15:31 +08:00
Chia-I Wu da495ee870 targets/egl: Fix linking with libdrm. 2010-10-07 12:06:59 +08:00
Eric Anholt d3163912c1 i965: Fix gen6 pointsize handling to match pre-gen6.
Fixes point-line-no-cull.
Bug #30532
2010-10-06 17:29:29 -07:00
Eric Anholt b380531fd4 i965: Don't assume that WPOS is always provided on gen6 in the new FS.
We sensibly only provide it if the FS asks for it.  We could actually
skip WPOS unless the FS needed WPOS.zw, but that's something for
later.

Fixes: glsl-texture2d and probably many others.
2010-10-06 12:13:08 -07:00
Eric Anholt 1fdc8c007e i965: Add support for gl_FrontFacing on gen6.
Fixes glsl1-gl_FrontFacing var (2) with new FS.
2010-10-06 12:13:08 -07:00
Eric Anholt a760b5b509 i965: Refactor gl_FrontFacing setup out of general variable setup. 2010-10-06 12:13:08 -07:00
Eric Anholt 75270f705f i965: Gen6's sampler messages are the same as Ironlake.
This should fix texturing in the new FS backend.
2010-10-06 12:13:08 -07:00
Eric Anholt fe6efc25ed i965: Don't do 1/w multiplication in new FS for gen6
Not needed now that we're doing barycentric.
2010-10-06 12:13:08 -07:00
Eric Anholt 5d99b01501 i965: Add some clarification of the WECtrl field. 2010-10-06 12:13:08 -07:00
Eric Anholt 5eeaf3671e i965: Fix botch in the header_present case in the new FS.
I only set it on the color_regions == 0 case, missing the important
case, causing GPU hangs on pre-gen6.
2010-10-06 12:13:08 -07:00
José Fonseca 9fe510ef35 llvmpipe: Cleanup depth-stencil clears.
Only cosmetic changes. No actual practical difference.
2010-10-06 19:08:21 +01:00
José Fonseca 33f88b3492 util: Cleanup util_pack_z_stencil and friends.
- Handle PIPE_FORMAT_Z32_FLOAT packing correctly.

- In the integer version z shouldn't be passed as as double.

- Make it clear that the integer versions should only be used for masks.

- Make integer type sizes explicit (uint32_t for now, although
  uint64_t will be necessary later to encode f32_s8_x24).
2010-10-06 19:08:18 +01:00
José Fonseca 87dd859b34 gallivm: Compute lod as integer whenever possible.
More accurate/faster results for PIPE_TEX_MIPFILTER_NEAREST. Less
FP <-> SI conversion overall.
2010-10-06 18:51:25 +01:00
José Fonseca 1c32583581 gallivm: Only apply min/max_lod when necessary. 2010-10-06 18:50:57 +01:00
Keith Whitwell 5849a6ab64 gallivm: don't apply zero lod_bias 2010-10-06 18:49:32 +01:00
José Fonseca af05f61576 gallivm: Combined ifloor & fract helper.
The only way to ensure we don't do redundant FP <-> SI conversions.
2010-10-06 18:47:01 +01:00
José Fonseca 012d57737b gallivm: Fast implementation of iround(log2(x))
Not tested yet, but should be correct.
2010-10-06 18:46:59 +01:00
José Fonseca 4648846bd6 gallivm: Use a faster (and less accurate) log2 in lod computation. 2010-10-06 18:46:29 +01:00
José Fonseca df3505b193 gallivm: Take the type signedness in consideration in round/ceil/floor. 2010-10-06 18:46:08 +01:00
Eric Anholt feca660939 i965: Fix up IF/ELSE/ENDIF for gen6.
The jump delta is now in the part of the instruction where the
destination fields used to be, and the src args are ignored (or not,
for the new non-predicated IF that we don't use yet).
2010-10-06 10:09:45 -07:00
Eric Anholt f7cb28fad9 i965: Gen6 no longer has the IFF instruction; always use IF. 2010-10-06 10:09:45 -07:00
Eric Anholt 3c97c00e38 i965: Add back gen6 headerless FB writes to the new FS backend.
It's not that hard to detect when we need the header.
2010-10-06 10:09:44 -07:00
Jerome Glisse 3fabd218a0 r600g: fix dirty state handling
Avoid having object ending up in dead list of dirty object.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-06 13:01:31 -04:00
Eric Anholt 634abbf7b2 i965: Also do constant propagation for the second operand of CMP.
We could do the first operand as well by flipping the comparison, but
this covered several CMPs in code I was looking at.
2010-10-06 09:33:26 -07:00
Eric Anholt dcd0261aff i965: Enable the constant propagation code.
A debug disable had slipped in.
2010-10-06 09:33:26 -07:00
Jerome Glisse 1644bb0f40 r600g: avoid segfault due to unintialized list pointer
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-06 09:41:19 -04:00
José Fonseca 06472ad7e8 llvmpipe: Fix sprite coord perspective interpolation of Q.
Q coordinate's coefficients also need to be multiplied by w, otherwise
it will have 1/w, causing problems with TXP.
2010-10-06 11:46:41 +01:00
José Fonseca e74955eba3 llvmpipe: Fix perspective interpolation for point sprites.
Once a fragment is generated with LP_INTERP_PERSPECTIVE set for an input,
it will do a divide by w for that input. Therefore it's not OK to treat LP_INTERP_PERSPECTIVE as
LP_INTERP_LINEAR or vice-versa, even if the attribute is known to not
vary.

A better strategy would be to take the primitive in consideration when
generating the fragment shader key, and therefore avoid the per-fragment
perspective divide.
2010-10-06 11:44:59 +01:00
José Fonseca 446dbb9217 llvmpipe: Dump a few missing shader key flags. 2010-10-06 11:41:08 +01:00
Keith Whitwell 591e1bc34f llvmpipe: make debug_fs_variant respect variant->nr_samplers 2010-10-06 11:40:30 +01:00
José Fonseca 5661e51c01 retrace: Handle clear_render_target and clear_depth_stencil. 2010-10-06 11:37:49 +01:00
Dave Airlie 9528fc2107 r600g: add evergreen stencil support.
this sets the stencil up for evergreen properly.
2010-10-06 09:21:16 +10:00
Jerome Glisse ea5a74fb58 r600g: userspace fence to avoid kernel call for testing bo busy status
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05 17:04:25 -04:00
Brian Paul 3d6eec0a87 st/mesa: replace assertion w/ conditional in framebuffer invalidation
https://bugs.freedesktop.org/show_bug.cgi?id=30632

NOTE: this is a candidate for the 7.9 branch.
2010-10-05 14:33:17 -06:00
Jerome Glisse 2cf3199ee3 r600g: simplify block relocation
Since flush rework there could be only one relocation per
register in a block.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05 15:23:07 -04:00
Bas Nieuwenhuizen ac8a1ebe55 r600g: use dirty list to track dirty blocks
Got a speed up by tracking the dirty blocks in a seperate list instead of looping through all blocks. This version should work with block that get their dirty state disabled again and I added a dirty check during the flush as some blocks were already dirty.
2010-10-05 15:16:06 -04:00
Nicolas Kaiser 71fd35d1ad nv50: fix always true conditional in shader optimization 2010-10-05 18:53:15 +02:00
Jerome Glisse 585e4098aa r600g: improve bo flushing
Flush read cache before writting register. Track flushing inside
of a same cs and avoid reflushing same bo if not necessary. Allmost
properly force flush if bo rendered too and then use as a texture
in same cs (missing pipeline flush dunno if it's needed or not).

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05 10:43:23 -04:00
Jerome Glisse 12d16e5f14 r600g: store reloc information in bo structure
Allow fast lookup of relocation information & id which
was a CPU time consumming operation.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05 10:42:56 -04:00
Dave Airlie bf21b7006c pb: fix numDelayed accounting
we weren't decreasing when removing from the list.
2010-10-05 19:08:41 +10:00
Dave Airlie 12be1568d0 r600g: avoid unneeded bo wait
if we know the bo has gone not busy, no need to add another bo wait

thanks to Andre (taiu) on irc for pointing this out.
2010-10-05 16:00:48 +10:00
Dave Airlie d2c06b5037 r600g: drop use_mem_constant.
since we plan on using dx10 constant buffers everywhere.
2010-10-05 16:00:23 +10:00
Dave Airlie 46997d4fc2 r600g: drop mman allocator
we don't use this since constant buffers are now being used on all gpus.
2010-10-05 15:57:57 +10:00
Dave Airlie 05813ad5f4 r600g: add bo busy backoff.
When we go to do a lot of bos in one draw like constant bufs we need
to avoid bouncing off the busy ioctl, this mitigates by backing off
on busy bos for a short amount of times.
2010-10-05 15:51:38 +10:00
Dave Airlie 49866c8f34 pb: don't keep checking buffers after first busy
If we assume busy buffers are added to the list in order its unlikely
we'd fine one after the first busy one that isn't busy.
2010-10-05 15:50:58 +10:00
Dave Airlie 3c38e4f138 r600g: add bo fenced list.
this just keeps a list of bos submitted together, and uses them to decide
bo busy state for the whole group.
2010-10-05 15:35:52 +10:00
Brian Paul fb5e6f88fc swrast: fix choose_depth_texture_level() to respect mipmap filtering state
NOTE: this is a candidate for the 7.9 branch.
2010-10-04 19:59:46 -06:00
Marek Olšák d0408cf55d r300g: fix microtiling for 16-bits-per-channel formats
These texture formats (like R16G16B16A16_UNORM) were untested until now
because st/mesa doesn't use them. I am testing this with a hacked st/mesa
here.
2010-10-05 02:57:00 +02:00
Eric Anholt ea909be58d i965: Add support for gen6 FB writes to the new FS.
This uses message headers for now, since we'll need it for MRT.  We
can cut out the header later.
2010-10-04 16:08:17 -07:00
Eric Anholt 739aec39bd i965: In disasm, gen6 fb writes don't put msg reg # in destreg_conditionalmod.
It instead sensibly appears in the src0 slot.
2010-10-04 16:08:17 -07:00
Eric Anholt 3bf8774e9c i965: Add initial folding of constants into operand immediate slots.
We could try to detect this in expression handling and do it
proactively there, but it seems like less logic to do it in one
optional pass at the end.
2010-10-04 16:08:17 -07:00
Eric Anholt e27c88d8e6 i965: Add trivial dead code elimination in the new FS backend.
The glsl core should be handling most dead code issues for us, but we
generate some things in codegen that may not get used, like the 1/w
value or pixel deltas.  It seems a lot easier this way than trying to
work out up front whether we're going to use those values or not.
2010-10-04 16:08:17 -07:00
Eric Anholt 9faf64bc32 i965: Be more conservative on live interval calculation.
This also means that our intervals now highlight dead code.
2010-10-04 16:08:17 -07:00
Vinson Lee a0a8e24385 r600g: Fix SCons build. 2010-10-04 15:56:55 -07:00
Jerome Glisse b25c52201b r600g: remove dead label & fix indentation
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-04 17:25:19 -04:00
Jerome Glisse 243d6ea609 r600g: rename radeon_ws_bo to r600_bo
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-04 17:25:19 -04:00
Jerome Glisse 674452faf9 r600g: use r600_bo for relocation argument, simplify code
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-04 17:25:19 -04:00
Jerome Glisse d22a1247d8 r600g: allow r600_bo to be a sub allocation of a big bo
Add bo offset everywhere needed if r600_bo is ever a sub bo
of a bigger bo.

Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-04 17:25:19 -04:00
Jerome Glisse 294c9fce1b r600g: rename radeon_ws_bo to r600_bo
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-04 17:25:19 -04:00
delphi 25bb05fef0 draw: added userclip planes and updated variant_key 2010-10-04 22:08:16 +01:00
Krzysztof Smiechowicz 68c7994ab5 nvfx: Pair os_malloc_aligned() with os_free_aligned().
From AROS.
2010-10-04 11:43:29 -07:00
Dave Airlie 3d45d57044 r600g: TODO domain management
no wonder it was slow, the code is deliberately forcing stuff into GTT,
we used to have domain management but it seems to have disappeared.
2010-10-04 16:41:49 +10:00
Dave Airlie 1c2b3cb1e9 r600g: fix wwarning in bo_map function 2010-10-04 16:26:46 +10:00
Dave Airlie 6dc051557d r600g: the code to check whether a new vertex shader is needed was wrong
this code was memcmp'ing two structs, but refcounting one of them afterwards,
so any subsequent memcmp was never going to work.

again this stops unnecessary uploads of vertex program,
2010-10-04 16:24:59 +10:00
Dave Airlie 92aba9c1f5 r600g: break out of search for reloc bo after finding it.
this function was taking quite a lot of pointless CPU.
2010-10-04 15:58:39 +10:00
Eric Anholt 14bf92ba19 i965: Fix glean/texSwizzle regression in previous commit.
Easy enough patch, who needs a full test run.  Oh, that's right.  Me.
2010-10-03 00:24:09 -07:00
Eric Anholt a7fa00dfc5 i965: Set up swizzling of shadow compare results for GL_DEPTH_TEXTURE_MODE.
The brw_wm_surface_state.c handling of GL_DEPTH_TEXTURE_MODE doesn't
apply to shadow compares, which always return an intensity value.  The
texture swizzles can do the job for us.

Fixes:
glsl1-shadow2D(): 1
glsl1-shadow2D(): 3
2010-10-02 23:48:14 -07:00
Eric Anholt 4fb0c92c69 i965: Add support for EXT_texture_swizzle to the new FS backend. 2010-10-02 23:44:44 -07:00
Marek Olšák 8f7177e0de r300g: add support for L8A8 colorbuffers
Blending with DST_ALPHA is undefined. SRC_ALPHA works, though.
I bet some other formats have similar limitations too.
2010-10-02 23:19:38 +02:00
Marek Olšák e75bce026c r300g: add support for R8G8 colorbuffers
The hw swizzles have been obtained by a brute force approach,
and only C0 and C2 are stored in UV88, the other channels are
ignored.

R16G16 is going to be a lot trickier.
2010-10-02 21:42:22 +02:00
Dave Airlie 71a079fb4e mesa/st: initial attempt at RG support for gallium drivers
passes all piglit RG tests with softpipe.
2010-10-02 17:03:15 +10:00
Kenneth Graunke f317713432 i965: Fix incorrect batchbuffer size in gen6 clip state command.
FORCE_ZERO_RTAINDEX should be in the fourth (and final) dword.
2010-10-01 21:53:28 -07:00
Eric Anholt 64a9fc3fc1 i965: Don't try to emit code if we failed register allocation. 2010-10-01 17:19:04 -07:00
Eric Anholt 6397addd61 i965: Fix off-by-ones in handling the last members of register classes.
Luckily, one of them would result in failing out register allocation
when the other bugs were encountered.  Applies to
glsl-fs-vec4-indexing-temp-dst-in-nested-loop-combined, which still
fails register allocation, but now legitimately.
2010-10-01 17:19:04 -07:00
Eric Anholt afb64311e3 i965: Add a sanity check for register allocation sizes. 2010-10-01 17:19:03 -07:00
Eric Anholt 5ee0941316 i965: When producing a single channel swizzle, don't make a temporary.
This quickly cuts 8% of the instructions in my glsl demo.
2010-10-01 17:19:03 -07:00
Eric Anholt a0799725f5 i965: Restore the forcing of aligned pairs for delta_xy on chips with PLN.
By doing so using the register allocator now, we avoid wasting a
register to make the alignment happen.
2010-10-01 17:19:03 -07:00
Alex Deucher fb0eed84ca r600c: fix segfault in evergreen stencil code
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=30551
2010-10-01 20:14:25 -04:00
Vinson Lee 7af2a22d1f r600g: Remove unnecessary headers. 2010-10-01 17:06:33 -07:00
Vinson Lee 20846a8ce1 r600g: Remove unused variable.
Fixes this GCC warning.
r600_shader.c: In function 'tgsi_split_literal_constant':
r600_shader.c:818: warning: unused variable 'index'
2010-10-01 17:02:01 -07:00
Ian Romanick 1ca6cbec1b rgtc: Detect RGTC formats as color formats and as compressed formats 2010-10-01 16:55:35 -07:00
Ian Romanick 5ebbabc5cc mesa: Trivial correction to comment 2010-10-01 16:55:35 -07:00
Ian Romanick 69c78bf2c2 mesa: Fix misplaced #endif
If FEATURE_texture_s3tc is not defined, FXT1 formats would erroneously
fall through to the MESA_FORMAT_RGBA_FLOAT32 case.
2010-10-01 16:55:35 -07:00
Ian Romanick 7c6147014a ARB_texture_rg: Add GL_COMPRESSED_{RED,RG} cases in _mesa_is_color_format 2010-10-01 16:55:35 -07:00
Ian Romanick e2a054b70c mesa: Add ARB_texture_compression_rgtc as an alias for EXT_texture_compression_rgtc
Change the name in the extension tracking structure to ARB (from EXT).
2010-10-01 16:55:35 -07:00
Vinson Lee e5fd15199d savage: Remove unnecessary header. 2010-10-01 16:57:19 -07:00
Vinson Lee 841503fddf glsl: Remove unnecessary header. 2010-10-01 16:27:58 -07:00
Ian Romanick c77cd9ec10 i965: Enable GL_ARB_texture_rg 2010-10-01 15:49:13 -07:00
Ian Romanick 9ef390dc14 mesa: Enable GL_ARB_texture_rg in software paths 2010-10-01 15:49:13 -07:00
Ian Romanick 421f4d8dc1 ARB_texture_rg: Allow RED and RG textures as FBO color buffer attachments 2010-10-01 15:49:13 -07:00
Ian Romanick 5d1387b2da ARB_texture_rg: Add R8, R16, RG88, and RG1616 internal formats 2010-10-01 15:49:13 -07:00
Ian Romanick 214a33f610 ARB_texture_rg: Handle RED and RG the same as RGB for tex env 2010-10-01 15:49:13 -07:00
Ian Romanick cd5dea6401 ARB_texture_rg: Add GL_RED as a valid GL_DEPTH_TEXTURE_MODE 2010-10-01 15:49:13 -07:00
Ian Romanick cc6f13def5 ARB_texture_rg: Add GL_TEXTURE_{RED,GREEN}_SIZE query support 2010-10-01 15:49:12 -07:00
Ian Romanick 3ebbc176f9 ARB_texture_rg: Correct some errors in RED / RG internal format handling
Fixes several problems:

The half-float, float, and integer internal formats depend on
ARB_texture_rg and other extensions.

RG_INTEGER is not a valid internal format.

Generic compressed formats depend on ARB_texture_rg, not
EXT_texture_compression_rgtc.

Use GL_RED instead of GL_R.
2010-10-01 15:49:12 -07:00
Ian Romanick bb45ab0a96 ARB_texture_rg: Add GLX protocol support 2010-10-01 15:49:12 -07:00
Nicolas Kaiser 96efa8a923 i965g: use Elements macro instead of manual sizeofs
Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Brian Paul <brianp@vmware.com>
2010-10-01 16:41:13 -06:00
Eric Anholt e9bcc83289 i965: Fix up copy'n'pasteo from moving coordinate setup around for gen4. 2010-10-01 14:09:00 -07:00
Eric Anholt bfd9715c3c i965: Add real support for pre-gen5 texture sampling to the new FS.
Fixes 36 testcases, including glsl-fs-shadow2d*-bias which fail on the
Mesa IR backend.
2010-10-01 14:02:48 -07:00
richard 92eb07a281 evergreen : fix z format setting, enable stencil. 2010-10-01 16:10:02 -04:00
Eric Anholt 8f63a44636 i965: Pre-gen6, map VS outputs (not FS inputs) to URB setup in the new FS.
We should fix the SF to actually give us just the data we need, but
this fixes regressions in the new FS until then.

Fixes:
glsl-kwin-blur
glsl-routing
2010-10-01 12:21:51 -07:00
Eric Anholt ff5ce9289b i965: Also increment attribute location when skipping unused slots.
Fixes glsl1-texcoord varying.
2010-10-01 12:19:21 -07:00
Eric Anholt 354c40a624 i965: Fix the gen6 jump size for BREAK/CONT in new FS.
Since gen5, jumps are in increments of 64 bits instead of increments
of 128-bit instructions.
2010-10-01 12:19:21 -07:00
Eric Anholt efc4a6f790 i965: Add gen6 attribute interpolation to new FS backend.
Untested, since my hardware is not booting at the moment.
2010-10-01 12:19:21 -07:00
Jerome Glisse 29b491bd03 r600g: indentation fixes
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-01 10:26:58 -04:00