Commit Graph

72 Commits

Author SHA1 Message Date
Brian Paul 4461442849 llvmpipe: implement scissor testing
The scissor test is implemented as another per-quad operation in
the JIT code.  The four scissor box params are passed via the
lp_jit_context.  In the JIT code we compare the quad's x/y coords
against the clip bounds and create a new in/out mask that's AND'd
with the main quad mask.

Note: we should also do scissor testing in the triangle setup code
to improve efficiency.  That's not done yet.
2010-01-14 19:15:00 -07:00
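The per-quad scissor mask described in this commit can be illustrated in plain C. This is only a sketch of the idea, not the JIT-generated code: `struct scissor_box`, `scissor_quad_mask()` and `apply_scissor()` are hypothetical names, the max bounds are assumed to be exclusive, and the bit layout of the quad mask is an assumption.

```c
/* Hypothetical scissor box; xmax/ymax are assumed to be exclusive. */
struct scissor_box {
   int xmin, ymin, xmax, ymax;
};

/*
 * Build a 4-bit in/out mask for the 2x2 quad whose top-left pixel is (x, y).
 * Bit i covers pixel (x + (i & 1), y + (i >> 1)); this layout is an assumption.
 */
static unsigned
scissor_quad_mask(const struct scissor_box *box, int x, int y)
{
   unsigned mask = 0;
   for (unsigned i = 0; i < 4; i++) {
      int px = x + (int)(i & 1);
      int py = y + (int)(i >> 1);
      if (px >= box->xmin && px < box->xmax &&
          py >= box->ymin && py < box->ymax)
         mask |= 1u << i;
   }
   return mask;
}

/* AND the scissor mask with the main (triangle coverage) quad mask. */
static unsigned
apply_scissor(unsigned quad_mask, const struct scissor_box *box, int x, int y)
{
   return quad_mask & scissor_quad_mask(box, x, y);
}
```

A quad that straddles the scissor edge keeps only the bits of the pixels inside the box; a quad fully outside ends up with an empty mask.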
Brian Paul 4769328fe1 llvmpipe: comments 2010-01-13 15:10:57 -07:00
José Fonseca 7df4c88088 llvmpipe: Opaque shader implies complete colormask too. 2010-01-13 22:07:24 +00:00
José Fonseca a1acbff299 llvmpipe: Reset the bin when shading a whole tile with an opaque shader. 2010-01-13 21:51:47 +00:00
Brian Paul 4061ca02dd llvmpipe: silence unused var warnings 2010-01-12 13:01:32 -07:00
Brian Paul 5cf4630969 llvmpipe: disable the all in/out test code for now
It's still faster not to try to special case the "all pixels are
known to be inside the triangle" case.
2010-01-11 15:30:56 -07:00
Brian Paul 9a10d14a44 llvmpipe: move, update comments 2010-01-11 15:30:17 -07:00
Brian Paul 3b5d849268 llvmpipe: refactor generate_fragment() code
This will make it easier to generate multiple versions of the fragment
code per variant.
2010-01-11 13:16:02 -07:00
Brian Paul 46b5bd6cad llvmpipe: do the all-in test on the scalar c0 instead of vector c0
This still isn't faster, but committing it for posterity.
2010-01-11 12:59:39 -07:00
Keith Whitwell 86f450060d llvmpipe: force constant interpolation of flatshade colors
Nice speedup for gears.
2010-01-11 12:12:59 +00:00
Keith Whitwell c1a0441602 llvmpipe: initial mrt support
Non-mrt apps work, and the code looks correct, but not many mrt test apps
handy atm...
2010-01-10 17:22:09 +00:00
Brian Paul f4321fbd96 llvmpipe: optimize case when all four pixels are inside the triangle
When the incoming c0,c1,c2 values are equal to INT_MIN it means that
all pixels are inside the triangle.  Thus we can skip the detailed
pixel inside/outside triangle tests.  Use the new lp_build_if()/endif()
functions to generate the branching code.

The code is disabled ATM however because it's actually a little slower
than the original code.  A little more tuning may fix that though...
2010-01-08 14:49:34 -07:00
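A scalar sketch of the shortcut described in this commit, written in C rather than with the `lp_build_if()`/endif() builders. The function name `quad_mask()`, the array layout, and the `step > c` comparison convention are assumptions made for illustration; only the INT_MIN sentinel comes from the commit message.

```c
#include <limits.h>

/*
 * c[e] is the (hypothetically negated) edge value for edge e and
 * step[e][i] is the per-pixel step for pixel i of the 2x2 quad.
 * Returns a 4-bit mask with bit i set when pixel i is inside.
 */
static unsigned
quad_mask(const int c[3], int step[3][4])
{
   /* Fast path: INT_MIN on all three edges means "everything is inside". */
   if (c[0] == INT_MIN && c[1] == INT_MIN && c[2] == INT_MIN)
      return 0xF;

   /* Slow path: detailed per-pixel test against all three edges. */
   unsigned mask = 0;
   for (unsigned i = 0; i < 4; i++) {
      if (step[0][i] > c[0] && step[1][i] > c[1] && step[2][i] > c[2])
         mask |= 1u << i;
   }
   return mask;
}
```

As the commit notes, the branch is not automatically a win: the slow path is cheap, so the cost of the extra test can outweigh the work it saves.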
José Fonseca 080c40ab32 Merge remote branch 'origin/master' into lp-binning
Conflicts:
	src/gallium/auxiliary/util/u_surface.c
	src/gallium/drivers/llvmpipe/Makefile
	src/gallium/drivers/llvmpipe/SConscript
	src/gallium/drivers/llvmpipe/lp_bld_arit.c
	src/gallium/drivers/llvmpipe/lp_bld_flow.c
	src/gallium/drivers/llvmpipe/lp_bld_interp.c
	src/gallium/drivers/llvmpipe/lp_clear.c
	src/gallium/drivers/llvmpipe/lp_context.c
	src/gallium/drivers/llvmpipe/lp_context.h
	src/gallium/drivers/llvmpipe/lp_draw_arrays.c
	src/gallium/drivers/llvmpipe/lp_jit.c
	src/gallium/drivers/llvmpipe/lp_jit.h
	src/gallium/drivers/llvmpipe/lp_prim_vbuf.c
	src/gallium/drivers/llvmpipe/lp_setup.c
	src/gallium/drivers/llvmpipe/lp_setup_point.c
	src/gallium/drivers/llvmpipe/lp_state.h
	src/gallium/drivers/llvmpipe/lp_state_blend.c
	src/gallium/drivers/llvmpipe/lp_state_derived.c
	src/gallium/drivers/llvmpipe/lp_state_fs.c
	src/gallium/drivers/llvmpipe/lp_state_sampler.c
	src/gallium/drivers/llvmpipe/lp_state_surface.c
	src/gallium/drivers/llvmpipe/lp_tex_cache.c
	src/gallium/drivers/llvmpipe/lp_tex_cache.h
	src/gallium/drivers/llvmpipe/lp_tex_sample.h
	src/gallium/drivers/llvmpipe/lp_tile_cache.c
2010-01-08 15:42:57 +00:00
José Fonseca 7bd7e2da75 llvmpipe: Axe texture sampling code inherited from softpipe.
Was used only as a reference, since texture sampling is now code generated.
Already axed in the lp-binning branch too.

This fixes the llvmpipe build after recent sampling changes.
2010-01-07 15:35:24 +00:00
Keith Whitwell 5ce0380a0f llvmpipe: merge setup and draw vbuf submodules
The setup tiling engine is now plugged directly into the draw module
as a rendering backend.

Removed a couple of layering violations such that the setup code no
longer reaches out into the surrounding llvmpipe state or context.
2010-01-06 16:44:43 +00:00
Michal Krol 4e014c0a14 pipe_sampler_state::compare_mode is not a boolean enable flag.
It's a 1-bit enum.
2010-01-06 16:11:26 +01:00
Brian Paul 25024d9482 Merge branch 'mesa_7_7_branch'
Conflicts:
	configs/darwin
	src/gallium/auxiliary/util/u_clear.h
	src/gallium/state_trackers/xorg/xorg_exa_tgsi.c
	src/mesa/drivers/dri/i965/brw_draw_upload.c
2009-12-31 09:02:27 -07:00
Vinson Lee 31d1822473 llvmpipe: Silence compiler warnings. 2009-12-28 00:44:30 -08:00
José Fonseca 080703e398 llvmpipe: Treat state changes systematically.
That is:
- check for no op
- update/flush draw module
- update bound state and mark it as dirty

In particular flushing the draw module is important since it may contain
unflushed primitives which would otherwise be drawn with wrong state.
2009-12-26 21:06:46 +00:00
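The three steps listed in this commit could look roughly like the following for a single piece of state. Everything here is a hypothetical stand-in (`struct ctx`, `struct draw_stub`, `draw_flush_stub()`, `DIRTY_BLEND`, `bind_blend_state()`), not the actual llvmpipe or draw module API.

```c
#include <stddef.h>

enum { DIRTY_BLEND = 1u << 0 };   /* hypothetical dirty bit */

/* Stand-in for the draw module; only counts flushes. */
struct draw_stub {
   unsigned flush_count;
};

/* Stand-in for the driver context. */
struct ctx {
   const void *blend;      /* currently bound blend state */
   unsigned dirty;         /* bitmask of dirty state groups */
   struct draw_stub draw;
};

static void
draw_flush_stub(struct draw_stub *draw)
{
   draw->flush_count++;
}

static void
bind_blend_state(struct ctx *ctx, const void *blend)
{
   /* 1. check for no-op */
   if (ctx->blend == blend)
      return;

   /* 2. flush the draw module so queued primitives use the old state */
   draw_flush_stub(&ctx->draw);

   /* 3. update the bound state and mark it as dirty */
   ctx->blend = blend;
   ctx->dirty |= DIRTY_BLEND;
}
```

Rebinding the same state returns early, so it neither flushes nor dirties anything.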
Zack Rusin 89d8577fb3 gallium: add geometry shader support to gallium 2009-12-25 05:52:16 -05:00
Brian Paul 5771f3d483 llvmpipe: remove unused code, added comments, etc 2009-12-17 10:52:50 -07:00
Brian Paul b9d33db0a4 llvmpipe: improve the in/out test a little
Instead of:
  s = c + step
  m = s > 0
Do:
  m = step > c  (with negated c)
2009-12-17 08:17:04 -07:00
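The rewrite above relies on `c + step > 0` being equivalent to `step > -c`, which drops the addition if the negated `c` is stored up front. A small C sketch of the two forms follows; `inside_old()` and `inside_new()` are hypothetical names, and the equivalence assumes that neither `c + step` nor `-c` overflows.

```c
#include <stdbool.h>

/* Original form: add, then compare against zero. */
static bool
inside_old(int c, int step)
{
   int s = c + step;
   return s > 0;
}

/* New form: a single compare against the pre-negated c. */
static bool
inside_new(int neg_c, int step)
{
   return step > neg_c;
}
```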
Brian Paul ab94381930 llvmpipe: do the final pixel in/out triangle test in the fragment shader
The test to determine which of the pixels in a 2x2 quad are inside the
triangle is now done in the fragment shader rather than in the calling C
code.  This is a little faster but there are a few more things to do.

Note that the step[] array elements are in a different order now.  Rather
than being in row-major order for the 4x4 grid, they're in "quad-major"
order.  The setup of the step arrays is a little more complicated now.
So is the coarse/intermediate tile test code, but some lookup tables
help with that.

Next steps:
 - early-cull 2x2 quads which are totally outside the triangle.
 - skip the in/out test for fully contained quads
 - make the in/out comparison code tighter/faster.
2009-12-16 16:10:05 -07:00
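To make the "quad-major" ordering concrete, here is one possible mapping from a pixel position in the 4x4 grid to an index where the four pixels of each 2x2 quad are stored consecutively. The exact ordering used by the commit is not shown in the message, so the quad order, the pixel order within a quad, and the step formula below are all assumptions; `quad_major_index()` and `build_step_quad_major()` are hypothetical names.

```c
/*
 * Map pixel (x, y) in a 4x4 block to a quad-major index.
 * Assumed layout: quads numbered left-to-right, top-to-bottom, and the
 * four pixels inside each quad numbered the same way.
 */
static unsigned
quad_major_index(unsigned x, unsigned y)
{
   unsigned quad = (y / 2) * 2 + (x / 2);    /* which 2x2 quad: 0..3 */
   unsigned pixel = (y % 2) * 2 + (x % 2);   /* pixel within the quad: 0..3 */
   return quad * 4 + pixel;
}

/*
 * Fill a 16-entry step array in quad-major order for one edge, assuming
 * the step of pixel (x, y) is x * dcdx + y * dcdy.
 */
static void
build_step_quad_major(int dcdx, int dcdy, int step[16])
{
   for (unsigned y = 0; y < 4; y++)
      for (unsigned x = 0; x < 4; x++)
         step[quad_major_index(x, y)] = (int)x * dcdx + (int)y * dcdy;
}
```

With this layout, entries 0..3 belong to the top-left quad, 4..7 to the top-right quad, and so on, so a quad's four steps can be loaded as one contiguous vector.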
José Fonseca 2584c5bd25 llvmpipe: add LP_DEBUG env var
Cherry-picked from dec35d04ae.
2009-12-16 15:06:17 +00:00
Brian Paul 51410a254c llvmpipe: fix blend debug strings 2009-12-03 14:13:22 -07:00
Brian Paul 866e6856d3 llvmpipe: execute shaders on 4x4 blocks instead of 8x2
This matches the convention used by the recursive rasterizer.
Also fixed assorted typos, comments, etc.
Now tri-z.c, gears.c, etc look basically right but there are still some
cracks in triangle rasterization.
2009-12-02 15:13:47 -07:00
José Fonseca 4ae3e88dc9 llvmpipe: Use assert instead of abort. Only verify functions on debug builds. 2009-11-24 14:25:21 +00:00
José Fonseca 2282fb7710 llvmpipe: Use the generic conversion routine for depths.
This allows for z32f depth format to work correctly.
2009-11-24 14:25:20 +00:00
José Fonseca 88e08d7c6d llvmpipe: Human friendlier sampler state dump. 2009-10-25 12:27:14 +00:00
José Fonseca 5fcb75758c llvmpipe: Dump the sampler state of the shader key. 2009-10-25 11:49:01 +00:00
José Fonseca 8599969582 llvmpipe: Get jit_context/jit_function across the rasterizer. 2009-10-09 15:53:53 +01:00
Keith Whitwell dec35d04ae llvmpipe: add LP_DEBUG env var 2009-10-09 14:59:35 +01:00
José Fonseca d904ed88c1 llvmpipe: Pass state to setup. 2009-10-09 13:41:33 +01:00
José Fonseca c4d54b62f5 llvmpipe: Eliminate constant mapping/unmapping. 2009-10-09 13:25:15 +01:00
José Fonseca 21489d2275 llvmpipe: Remove quad headers. 2009-10-08 19:56:01 +01:00
José Fonseca 69588d7ed5 llvmpipe: Eliminate constant mapping/unmapping. 2009-10-09 11:29:33 +01:00
Keith Whitwell 4456006ba6 gallium: remove depth.occlusion_count flag
This was redundant as drivers can just keep track of whether they are
inside a begin/end query pair.  We want to add more query types later
and also support nested queries, none of which map well onto a flag like
this.  No driver appeared to be using the flag.
2009-10-01 14:34:23 +01:00
José Fonseca a02ecdf8c2 llvmpipe: First verify LLVM IR, only then run optimizing passes. 2009-09-29 17:28:15 +01:00
José Fonseca b4835ea03d llvmpipe: Make lp_type a regular union.
Union not worth the hassle of violating C99 or adding a name to
the structure.
2009-09-14 11:05:38 +01:00
José Fonseca 6a405b4a21 llvmpipe: Fix alpha test. 2009-09-10 13:35:39 +01:00
José Fonseca 4c3a48ad0c llvmpipe: Mask out color channels not present in the color buffer. 2009-09-10 12:37:44 +01:00
José Fonseca c3c80c5c22 llvmpipe: Skip blending when mask is zero.
This increases quake3 timedemo fps another 10%.
2009-09-10 12:01:42 +01:00
José Fonseca 8e6b925d2a llvmpipe: Proper control flow builders.
New control flow helper functions which keep track of all variables
and generate the correct Phi functions.

This re-enables skipping the fs execution of quads masked out by
the rasterizer, early z testing, and kill opcode.

This yields a performance improvement of around 20%.
2009-09-10 11:44:03 +01:00
José Fonseca cdbbcdf3bd llvmpipe: Include zsbuf's format in the fragment shader key. 2009-09-09 21:48:50 +01:00
José Fonseca e4c76c02f7 llvmpipe: Code generate the texture sampling inside the shader.
Finally a substantial performance improvement: framerates of apps using
texturing tripled, and furthermore, enabling/disabling texturing only
affects around 15% of the framerate, which means the bottleneck is now
somewhere else.

Generated texture sampling code is not complete though -- we always
sample from the base level -- so final figures will be different.
2009-09-07 15:02:08 +01:00
José Fonseca 8be72bb764 llvmpipe: Further abstract the texture sampling generation from TGSI translation. 2009-09-07 15:02:06 +01:00
José Fonseca c40eddd294 llvmpipe: Isolate sampling from TGSI translation. 2009-08-29 09:21:42 +01:00
José Fonseca 8aa62cead7 llvmpipe: Fix shader variant key construction.
Fixes the blank screen on non-64bit mode.
2009-08-29 09:21:42 +01:00
José Fonseca f85c5f8621 llvmpipe: Factor out and optimize the input interpolation.
Special attention is given to the interpolation of side by side quads.
Multiplications are made only for the first quad. Interpolation of
inputs for posterior quads are done exclusively with additions, and
perspective divide if necessary.
2009-08-29 09:21:41 +01:00
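The interpolation scheme described in this commit can be sketched in scalar C as follows. This is an illustration only: `interp_first_quad()` and `interp_next_quad()` are hypothetical names, the plane equation form `a0 + dadx * x + dady * y` is an assumption, and the perspective divide is left out.

```c
/*
 * First quad: evaluate the attribute at each of the four pixels of the
 * 2x2 quad whose top-left pixel is (x, y), using multiplications.
 */
static void
interp_first_quad(float a0, float dadx, float dady,
                  float x, float y, float out[4])
{
   for (unsigned i = 0; i < 4; i++) {
      float px = x + (float)(i & 1);
      float py = y + (float)(i >> 1);
      out[i] = a0 + dadx * px + dady * py;
   }
}

/*
 * Following quad to the right: every pixel moves two pixels in x, so a
 * single precomputed increment (dadx2 = 2 * dadx) is added per pixel.
 */
static void
interp_next_quad(float dadx2, float quad[4])
{
   for (unsigned i = 0; i < 4; i++)
      quad[i] += dadx2;
}
```

Stepping a quad with `interp_next_quad()` gives the same values as re-evaluating the plane equation two pixels to the right, up to floating-point rounding.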
José Fonseca 03180dca7a llvmpipe: Pre-declare fetch_texel. 2009-08-29 09:21:41 +01:00