11252 Commits

Author SHA1 Message Date
Keith Whitwell aa4cb5e2d8 llvmpipe: try to be sensible about whether to branch after mask updates
Don't branch more than once in quick succession.  Don't branch at the
end of the shader.
2010-10-09 11:44:45 +01:00
Keith Whitwell 2ef6f75ab4 gallivm: simpler uint8->float conversions
LLVM seems to find it easier to reason about these than our
mantissa-manipulation code.
2010-10-09 11:44:45 +01:00
Keith Whitwell c79f162367 gallivm: prefer blendvb for integer arguments 2010-10-09 11:44:45 +01:00
Keith Whitwell d2cf757f44 gallivm: specialized x8z24 depthtest path
Avoid unnecessary masking of non-existent stencil component.
2010-10-09 11:44:09 +01:00
Keith Whitwell 954965366f llvmpipe: dump fragment shader ir and asm when LP_DEBUG=fs
Better than GALLIVM_DEBUG if you're only interested in fragment shaders.
2010-10-09 11:43:23 +01:00
Keith Whitwell 6da29f3611 llvmpipe: store zero into all alloca'd values
Fixes slowdown in isosurf with earlier versions of llvm.
2010-10-09 11:43:23 +01:00
Keith Whitwell 40d7be5261 llvmpipe: use alloca for fs color outputs
Don't try to emit our own phi's, let llvm mem2reg do it for us.
2010-10-09 11:43:23 +01:00
Keith Whitwell 8009886b00 llvmpipe: defer attribute interpolation until after mask and ztest
Don't calculate 1/w for quads which aren't visible...
2010-10-09 11:42:48 +01:00
José Fonseca d0bfb3c514 llvmpipe: Prevent z > 1.0
The current interpolation scheme causes precision loss.

Changing the operation order helps, but does not completely avoid the
problem.

The only short term solution is to clamp z to 1.0.

This is unfortunate, but probably unavoidable until interpolation is
improved.
2010-10-09 09:35:41 +01:00
José Fonseca 34c11c87e4 gallivm: Do size computations simultaneously for all dimensions (AoS).
Operate simultaneously on the <width, height, depth> vector as much as possible,
instead of doing the operations on vectors with broadcasted scalars.

Also do the 24.8 fixed point scale with integer shift of the texture size,
for unnormalized coordinates.

AoS path only for now -- the same thing can be done for SoA.
2010-10-09 09:34:31 +01:00
Zack Rusin 6316d54056 llvmpipe: fix rasterization of vertical lines on pixel boundaries 2010-10-09 08:19:21 +01:00
Roland Scheidegger ff72c79924 gallivm: make use of new iround code in lp_bld_conv.
Only requires sse2 now.
2010-10-09 00:36:38 +02:00
Roland Scheidegger 175cdfd491 gallivm: optimize soa linear clamp to edge wrap mode a bit
Clamp against 0 instead of -0.5, which simplifies things.
The former version would have resulted in both int coords being zero
(in case of coord being smaller than 0) and some "unused" weight value,
whereas now the int coords will be 0 and 1, but weight will be 0, hence the
lerp should produce the same value.
Still not happy about differences between normalized and non-normalized...
2010-10-09 00:36:38 +02:00
Roland Scheidegger 2cc6da85d6 gallivm: avoid unnecessary URem in linear wrap repeat case
Haven't looked at what code this exactly generates but URem can't be fast.
Instead of using two URem only use one and replace the second one with
select/add (this is what the corresponding aos code already does).
2010-10-09 00:36:38 +02:00
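
The single-remainder idea in the commit above can be sketched in plain C. This is only an illustration under stated assumptions, not the gallivm code: the struct and function names are hypothetical, and the sketch assumes `size > 0` and `icoord < UINT_MAX` so that `icoord + 1` does not wrap.

```c
#include <assert.h>

/* Hypothetical illustration, not Mesa code: the two texel indices a linear
 * filter reads along one axis under repeat wrapping. */
struct texel_pair {
    unsigned c0;
    unsigned c1;
};

/* Straightforward form: one unsigned remainder per coordinate. */
static struct texel_pair wrap_repeat_two_urem(unsigned icoord, unsigned size)
{
    struct texel_pair p;
    assert(size > 0);
    p.c0 = icoord % size;
    p.c1 = (icoord + 1) % size;
    return p;
}

/* One remainder only: c0 is already in [0, size), so c0 + 1 can exceed the
 * valid range only by landing exactly on size. A compare plus select wraps
 * that single case back to 0. */
static struct texel_pair wrap_repeat_one_urem(unsigned icoord, unsigned size)
{
    struct texel_pair p;
    unsigned next;
    assert(size > 0);
    p.c0 = icoord % size;
    next = p.c0 + 1;
    p.c1 = (next == size) ? 0 : next;
    return p;
}
```

Both functions return the same pair for any in-range input; the second trades a division for a comparison and a conditional move.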
Roland Scheidegger 318bb080b0 gallivm: more linear tex wrap mode calculation simplification
Rearrange order of operations a bit to make some clamps easier.
All calculations should be equivalent.
Note there seems to be some inconsistency in the clamp to edge case
wrt normalized/non-normalized coords, could potentially simplify this too.
2010-10-09 00:36:38 +02:00
Roland Scheidegger 99ade19e6e gallivm: optimize some tex wrap mode calculations a bit
Sometimes coords are clamped to positive numbers before doing conversion
to int, or clamped to 0 afterwards, in this case can use itrunc
instead of ifloor which is easier. This is only the case for nearest
calculations unfortunately, except linear MIRROR_CLAMP_TO_EDGE which
for the same reason can use an unsigned float build context so the
ifloor_fract helper can reduce this to itrunc in the ifloor helper itself.
2010-10-09 00:36:38 +02:00
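
Why itrunc can stand in for ifloor once a coordinate is known to be non-negative can be shown with a small scalar sketch. The helper names below are hypothetical and this is not the gallivm implementation, which emits vector LLVM IR; the sketch only assumes the usual C rule that a float-to-int cast truncates toward zero.

```c
#include <assert.h>

/* Hypothetical scalar helpers, not Mesa code. */

/* Truncation: a C float-to-int cast rounds toward zero. */
static int itrunc(float x)
{
    return (int)x;
}

/* General floor: truncate, then step down by one when truncation rounded
 * up, which only happens for negative non-integer inputs. */
static int ifloor_general(float x)
{
    int i = (int)x;
    return (x < (float)i) ? i - 1 : i;
}

/* When the caller has already clamped x to be non-negative, floor and
 * truncation agree, so the compare and adjust step can be dropped. */
static int ifloor_nonneg(float x)
{
    assert(x >= 0.0f);
    return (int)x;
}
```

For negative inputs the two differ: truncating -1.5 gives -1, while the floor is -2, which is why the cheaper path is limited to clamped coordinates.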
Roland Scheidegger 1e17e0c4ff gallivm: replace sub/floor/ifloor combo with ifloor_fract 2010-10-09 00:36:37 +02:00
Roland Scheidegger cb3af2b434 gallivm: faster iround implementation for sse2
sse2 supports round to nearest directly (or rather, assuming default nearest
rounding mode in MXCSR). Use intrinsic to use this rather than round (sse41)
or bit manipulation whenever possible.
2010-10-09 00:36:37 +02:00
Roland Scheidegger 0ed8c56bfe gallivm: fix trunc/itrunc comment
trunc of -1.5 is -1.0 not 1.0...
2010-10-09 00:36:37 +02:00
Vinson Lee 3b16c591a4 r600g: Silence uninitialized variable warning. 2010-10-08 14:17:14 -07:00
Vinson Lee 36b65a373a r600g: Silence uninitialized variable warning. 2010-10-08 14:14:16 -07:00
Vinson Lee 131485efae r600g: Silence uninitialized variable warning. 2010-10-08 14:08:50 -07:00
Vinson Lee 5e90971475 gallivm: Remove unnecessary header. 2010-10-08 14:03:10 -07:00
José Fonseca 3fde8167a5 gallivm: Help for combined extraction and broadcasting.
Doesn't change generated code quality, but saves some typing.
2010-10-08 19:48:16 +01:00
José Fonseca 438390418d llvmpipe: First minify the texture size, then broadcast. 2010-10-08 19:11:52 +01:00
José Fonseca f5b5fb32d3 gallivm: Move into the as much of the second level code as possible.
Also, pass more stuff through the sample build context, instead of
arguments.
2010-10-08 19:11:52 +01:00
José Fonseca 6b0c79e058 gallivm: Warn when doing inefficient integer comparisons. 2010-10-08 17:43:15 +01:00
José Fonseca d5ef59d8b0 gallivm: Avoid control flow for two-sided stencil test. 2010-10-08 17:43:15 +01:00
Keith Whitwell ef3407672e llvmpipe: fix off-by-one in tri_16 2010-10-08 17:30:08 +01:00
Keith Whitwell 0ff132e5a6 llvmpipe: add rast_tri_4_16 for small lines and points 2010-10-08 17:30:08 +01:00
Keith Whitwell eeb13e2352 llvmpipe: clean up setup_tri a little 2010-10-08 17:30:08 +01:00
Keith Whitwell e191bf4a85 gallivm: round rather than truncate in new 4x4f->1x16ub conversion path 2010-10-08 17:30:08 +01:00
José Fonseca f91b4266c6 gallivm: Use the wrappers for SSE pack intrinsics.
Fixes assertion failures on LLVM 2.6.
2010-10-08 17:30:08 +01:00
Keith Whitwell 607e3c542c gallivm: special case conversion 4x4f to 1x16ub
Nice reduction in the number of operations required for final color
output in many shaders.
2010-10-08 17:30:08 +01:00
Keith Whitwell 29d6a1483d llvmpipe: avoid overflow in triangle culling
Avoid multiplying fixed-point values.  Calculate triangle area in
floating point and use that for culling.

Lift area calculations up a level as we are already doing this in the
triangle_both() case.

Would like to share the calculated area with attribute interpolation,
but the way the code is structured makes this difficult.
2010-10-08 17:30:08 +01:00
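
A minimal sketch of the overflow concern in the commit above, assuming vertex positions are held as 32-bit fixed-point integers. The function names and the winding convention are hypothetical, and this is not the llvmpipe setup code; it only shows the sign of the area being taken from a floating-point product instead of an integer one.

```c
#include <stdint.h>

/* Hypothetical illustration, not Mesa code.
 * Twice the signed area of a triangle whose vertices are given as
 * fixed-point (x, y) pairs. The edge differences are converted to float
 * before multiplying, so the product cannot overflow a 32-bit integer
 * the way (int32 * int32) can for large triangles. */
static float tri_area2_float(const int32_t v0[2],
                             const int32_t v1[2],
                             const int32_t v2[2])
{
    const float ex = (float)(v0[0] - v2[0]);
    const float ey = (float)(v0[1] - v2[1]);
    const float fx = (float)(v1[0] - v2[0]);
    const float fy = (float)(v1[1] - v2[1]);
    return ex * fy - ey * fx;
}

/* Cull decision from the sign of the area. Which sign counts as
 * front-facing is an assumption of this sketch. */
static int tri_is_culled(const int32_t v0[2],
                         const int32_t v1[2],
                         const int32_t v2[2])
{
    return tri_area2_float(v0, v1, v2) <= 0.0f;
}
```

With edge lengths of 100000 fixed-point units the exact product is 1e10, which exceeds the range of a 32-bit signed integer but is comfortably within float range, so the sign used for culling stays correct.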
Keith Whitwell ad6730fadb llvmpipe: fail gracefully on oom in scene creation 2010-10-08 17:26:29 +01:00
José Fonseca eb605701aa gallivm: Implement brilinear filtering. 2010-10-08 15:50:28 +01:00
José Fonseca c8179ef5e8 gallivm: Fix copy'n'paste typo in previous commit. 2010-10-08 14:09:22 +01:00
José Fonseca df7a2451b1 gallivm: Clamp mipmap level and zero mip weight simultaneously. 2010-10-08 14:06:38 +01:00
José Fonseca 0d84b64a4f gallivm: Use lp_build_ifloor_fract for lod computation.
Forgot this one before.
2010-10-08 14:06:38 +01:00
José Fonseca 4f2e2ca4e3 gallivm: Don't compute the second mipmap level when frac(lod) == 0 2010-10-08 14:06:37 +01:00
José Fonseca 05fe33b71c gallivm: Simplify lp_build_mipmap_level_sizes' interface. 2010-10-08 14:06:37 +01:00
José Fonseca 4eb222a3e6 gallivm: Do not do mipfiltering when magnifying.
If lod < 0, then it invariably follows that ilevel0 == ilevel1 == 0.
2010-10-08 14:06:37 +01:00
Vinson Lee 1f01f5cfcf r600g: Remove unnecessary header. 2010-10-08 04:56:49 -07:00
Dave Airlie 8d6a38d7b3 r600g: drop width/height per level storage.
these aren't used anywhere, so just waste memory.
2010-10-08 19:55:05 +10:00
Dave Airlie 1ae5cc2e67 r600g: add some RG texture format support. 2010-10-08 09:37:02 +10:00
José Fonseca 321ec1a224 gallivm: Vectorize the rho computation. 2010-10-07 22:08:42 +01:00
Dave Airlie 51f9cc4759 r600g: fix Z export enable bits.
we should be checking output array not input to decide.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-07 15:32:05 +10:00
Dave Airlie 97eea87bde r600g: use format from the sampler view not from the texture.
we want to use the format from the sampler view which isn't always the
same as the texture format when creating sampler views.
2010-10-07 15:17:28 +10:00
Andre Maasikas 84457701b0 r600g: fix evergreen interpolation setup
interp data is stored in gpr0 so first interp overwrote it
and subsequent ones got wrong values

reserve register 0 so it's not used for attribs.
alternative is to interpolate attrib0 last (reverse, as r600c does)
2010-10-07 07:51:32 +03:00