Commit Graph

34635 Commits

Author SHA1 Message Date
José Fonseca 124adf253c gallivm: Fix a long standing bug with nested if-then-else emission.
We can't patch true-block at end-if time, as there is no guarantee that
the block at the beginning of the true stanza is the same at the end of
the true stanza -- other control flow elements may have been emitted
halfway through the true stanza.

Although this bug surfaced recently with the commit to skip mip filtering
when lod is an integer, the bug was always there; it was probably just
avoided until now: e.g., cubemap selection nests if-then-else on the
else stanza, which does not suffer from the same problem.
2010-10-10 18:48:02 +01:00
Francisco Jerez e2acc7be26 dri/nv10: Fake fast Z clears for pre-nv17 cards. 2010-10-10 04:14:34 +02:00
Francisco Jerez 35a1893fd1 dri/nouveau: Minor cleanup. 2010-10-10 01:48:01 +02:00
José Fonseca 307df6a858 gallivm: Cleanup the rest of the flow module. 2010-10-09 21:39:14 +01:00
José Fonseca d0ea464159 gallivm: Simplify if/then/else implementation.
No need for a flow stack anymore.
2010-10-09 21:14:05 +01:00
José Fonseca 1949f8c315 gallivm: Factor out the SI->FP texture size conversion for SoA path too 2010-10-09 20:26:11 +01:00
José Fonseca d45c379027 gallivm: Remove support for Phi generation.
Simply rely on mem2reg pass. It's easier and more reliable.
2010-10-09 20:14:03 +01:00
José Fonseca ea7b49028b gallivm: Use variables instead of Phis for cubemap selection. 2010-10-09 19:53:21 +01:00
José Fonseca cc40abad51 gallivm: Don't generate Phis for execution mask. 2010-10-09 12:55:31 +01:00
José Fonseca 679dd26623 gallivm: Special bri-linear computation path for unmodified rho. 2010-10-09 12:13:00 +01:00
José Fonseca 81a09c8a97 gallivm: Less code duplication in log computation. 2010-10-09 12:12:59 +01:00
José Fonseca 52427f0ba7 util: Defined M_SQRT2 when not available. 2010-10-09 12:12:59 +01:00
José Fonseca 53d7f5e107 gallivm: Handle code that has ret correctly.
Stop disassembling on unconditional backwards jumps.
2010-10-09 12:12:59 +01:00
José Fonseca edba53024f llvmpipe: Fix MSVC build. Enable the new SSE2 code on non SSE3 systems. 2010-10-09 12:12:58 +01:00
Keith Whitwell 2de720dc8f llvmpipe: simplified SSE2 swz/unswz routines
We've been using these in the linear path for a while now.  Based on
Chris's SSSE3 code, but using only sse2 opcodes.  Speed seems to be
identical, but code is simpler & removes dependency on SSE3.

Should be easier to extend to other rgba8 formats.
2010-10-09 12:12:58 +01:00
Keith Whitwell 5b7eb868fd llvmpipe: clean up shader pre/postamble, try to catch more early-z
Specifically, can do early-depth-test even when alphatest or
kill-pixel are active, provided we defer the actual z write until the
final mask is available.

Improves demos/fire.c especially in the case where you get close to
the trees.
2010-10-09 11:44:45 +01:00
Keith Whitwell aa4cb5e2d8 llvmpipe: try to be sensible about whether to branch after mask updates
Don't branch more than once in quick succession.  Don't branch at the
end of the shader.
2010-10-09 11:44:45 +01:00
Keith Whitwell 2ef6f75ab4 gallivm: simpler uint8->float conversions
LLVM seems to find it easier to reason about these than our
mantissa-manipulation code.
2010-10-09 11:44:45 +01:00
Keith Whitwell c79f162367 gallivm: prefer blendvb for integer arguments 2010-10-09 11:44:45 +01:00
Keith Whitwell d2cf757f44 gallivm: specialized x8z24 depthtest path
Avoid unnecessary masking of non-existent stencil component.
2010-10-09 11:44:09 +01:00
Keith Whitwell 954965366f llvmpipe: dump fragment shader ir and asm when LP_DEBUG=fs
Better than GALLIVM_DEBUG if you're only interested in fragment shaders.
2010-10-09 11:43:23 +01:00
Keith Whitwell 6da29f3611 llvmpipe: store zero into all alloca'd values
Fixes slowdown in isosurf with earlier versions of llvm.
2010-10-09 11:43:23 +01:00
Keith Whitwell 40d7be5261 llvmpipe: use alloca for fs color outputs
Don't try to emit our own phi's, let llvm mem2reg do it for us.
2010-10-09 11:43:23 +01:00
Keith Whitwell 8009886b00 llvmpipe: defer attribute interpolation until after mask and ztest
Don't calculate 1/w for quads which aren't visible...
2010-10-09 11:42:48 +01:00
José Fonseca d0bfb3c514 llvmpipe: Prevent z > 1.0
The current interpolation scheme causes precision loss.

Changing the operation order helps, but does not completely avoid the
problem.

The only short term solution is to clamp z to 1.0.

This is unfortunate, but probably unavoidable until interpolation is
improved.
2010-10-09 09:35:41 +01:00
José Fonseca 34c11c87e4 gallivm: Do size computations simultaneously for all dimensions (AoS).
Operate simultaneously on <width, height, depth> vector as much as possible,
instead of doing the operations on vectors with broadcasted scalars.

Also do the 24.8 fixed point scale with integer shift of the texture size,
for unnormalized coordinates.

AoS path only for now -- the same thing can be done for SoA.
2010-10-09 09:34:31 +01:00
Zack Rusin 6316d54056 llvmpipe: fix rasterization of vertical lines on pixel boundaries 2010-10-09 08:19:21 +01:00
Vinson Lee e7843363a5 i965: Initialize member variables.
Fixes these GCC warnings.
brw_wm_fp.c: In function 'search_or_add_const4f':
brw_wm_fp.c:92: warning: 'reg.Index2' is used uninitialized in this function
brw_wm_fp.c:84: note: 'reg.Index2' was declared here
brw_wm_fp.c:92: warning: 'reg.RelAddr2' is used uninitialized in this function
brw_wm_fp.c:84: note: 'reg.RelAddr2' was declared here
2010-10-08 16:40:29 -07:00
Vinson Lee 5abd498c47 i965: Silence unused variable warning on non-debug builds.
Fixes this GCC warning.
brw_vs.c: In function 'do_vs_prog':
brw_vs.c:46: warning: unused variable 'ctx'
2010-10-08 16:30:59 -07:00
Vinson Lee 978ffa1d61 i965: Silence unused variable warning on non-debug builds.
Fixes this GCC warning.
brw_eu_emit.c: In function 'brw_math2':
brw_eu_emit.c:1189: warning: unused variable 'intel'
2010-10-08 16:02:59 -07:00
Vinson Lee 220c0834a4 i915: Silence unused variable warning in non-debug builds.
Fixes this GCC warning.
i915_vtbl.c: In function 'i915_assert_not_dirty':
i915_vtbl.c:670: warning: unused variable 'dirty'
2010-10-08 15:49:02 -07:00
Roland Scheidegger ff72c79924 gallivm: make use of new iround code in lp_bld_conv.
Only requires sse2 now.
2010-10-09 00:36:38 +02:00
Roland Scheidegger 175cdfd491 gallivm: optimize soa linear clamp to edge wrap mode a bit
Clamp against 0 instead of -0.5, which simplifies things.
The former version would have resulted in both int coords being zero
(in case of coord being smaller than 0) and some "unused" weight value,
whereas now the int coords will be 0 and 1, but weight will be 0, hence the
lerp should produce the same value.
Still not happy about differences between normalized and non-normalized...
2010-10-09 00:36:38 +02:00
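Note on 175cdfd491: the equivalence claimed in the message can be illustrated with a minimal sketch. This is not the gallivm code; `lerp_sketch`, `sample_edge_old` and `sample_edge_new` are hypothetical names, and the texel array stands in for one row of a texture. It only shows that lerping between the same texel with any weight and lerping between texels 0 and 1 with weight 0 give the same result.

```c
#include <assert.h>

/* Plain linear interpolation between two texel values (hypothetical helper). */
static float lerp_sketch(float w, float v0, float v1)
{
    assert(w >= 0.0f && w <= 1.0f);
    return v0 + w * (v1 - v0);
}

/* Former behaviour as described in the message: both integer coords are 0,
 * and the weight is some leftover value that does not matter. */
static float sample_edge_old(const float *texels, float leftover_weight)
{
    return lerp_sketch(leftover_weight, texels[0], texels[0]);
}

/* New behaviour as described in the message: integer coords are 0 and 1,
 * but the weight is 0, so only texel 0 contributes. */
static float sample_edge_new(const float *texels)
{
    return lerp_sketch(0.0f, texels[0], texels[1]);
}
```

Both functions return `texels[0]` for finite texel values, which is why the message expects the lerp to produce the same value.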
Roland Scheidegger 2cc6da85d6 gallivm: avoid unnecessary URem in linear wrap repeat case
Haven't looked at what code this exactly generates but URem can't be fast.
Instead of using two URem only use one and replace the second one with
select/add (this is what the corresponding aos code already does).
2010-10-09 00:36:38 +02:00
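Note on 2cc6da85d6: a minimal scalar sketch of the idea, assuming a non-negative integer coordinate and that `icoord + 1` does not overflow. The function names are hypothetical and the real code emits LLVM IR on vectors rather than scalar C. The second texel index is derived from the first with a compare and select instead of a second unsigned remainder.

```c
#include <assert.h>

/* Reference version: two unsigned remainders (hypothetical helper). */
static void wrap_repeat_two_urem(unsigned icoord, unsigned size,
                                 unsigned *c0, unsigned *c1)
{
    assert(size > 0);
    *c0 = icoord % size;
    *c1 = (icoord + 1u) % size;
}

/* One remainder, then add and select: c0 is already in [0, size), so
 * c0 + 1 is either still in range or exactly equal to size, in which
 * case it wraps to 0. */
static void wrap_repeat_one_urem(unsigned icoord, unsigned size,
                                 unsigned *c0, unsigned *c1)
{
    unsigned next;

    assert(size > 0);
    *c0 = icoord % size;
    next = *c0 + 1u;
    *c1 = (next == size) ? 0u : next;
}
```

For any `size > 0` and any `icoord` below `UINT_MAX`, both helpers produce the same pair of indices.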
Roland Scheidegger 318bb080b0 gallivm: more linear tex wrap mode calculation simplification
Rearrange order of operations a bit to make some clamps easier.
All calculations should be equivalent.
Note there seems to be some inconsistency in the clamp to edge case
wrt normalized/non-normalized coords, could potentially simplify this too.
2010-10-09 00:36:38 +02:00
Roland Scheidegger 99ade19e6e gallivm: optimize some tex wrap mode calculations a bit
Sometimes coords are clamped to positive numbers before doing conversion
to int, or clamped to 0 afterwards, in this case can use itrunc
instead of ifloor which is easier. This is only the case for nearest
calculations unfortunately, except linear MIRROR_CLAMP_TO_EDGE which
for the same reason can use an unsigned float build context so the
ifloor_fract helper can reduce this to itrunc in the ifloor helper itself.
2010-10-09 00:36:38 +02:00
Roland Scheidegger 1e17e0c4ff gallivm: replace sub/floor/ifloor combo with ifloor_fract 2010-10-09 00:36:37 +02:00
Roland Scheidegger cb3af2b434 gallivm: faster iround implementation for sse2
sse2 supports round to nearest directly (or rather, assuming default nearest
rounding mode in MXCSR). Use intrinsic to use this rather than round (sse41)
or bit manipulation whenever possible.
2010-10-09 00:36:37 +02:00
Roland Scheidegger 0ed8c56bfe gallivm: fix trunc/itrunc comment
trunc of -1.5 is -1.0 not 1.0...
2010-10-09 00:36:37 +02:00
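Note on 0ed8c56bfe: the difference between truncation and floor, which also underlies the itrunc-for-ifloor substitution in 99ade19e6e, can be sketched in plain C. The helper names are hypothetical and the inputs are assumed to fit in an `int`; the gallivm helpers operate on LLVM vectors instead.

```c
#include <assert.h>

/* Truncation rounds toward zero; a C float-to-int cast does exactly this. */
static int itrunc_sketch(float x)
{
    assert(x > -2147483648.0f && x < 2147483648.0f);
    return (int)x;
}

/* Floor rounds toward negative infinity: start from the truncated value
 * and step down by one when truncation rounded a negative input up. */
static int ifloor_sketch(float x)
{
    int t;

    assert(x > -2147483648.0f && x < 2147483648.0f);
    t = (int)x;
    return ((float)t > x) ? t - 1 : t;
}
```

For -1.5 the truncated value is -1 while the floor is -2; for non-negative inputs the two agree, which is the case where the cheaper truncation can stand in for floor.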
Vinson Lee 0f4984a0fb i915: Silence unused variable warning in non-debug builds.
Fixes this GCC warning.
i830_vtbl.c: In function 'i830_assert_not_dirty':
i830_vtbl.c:704: warning: unused variable 'i830'
2010-10-08 15:35:35 -07:00
Ian Romanick 0ea8b99332 glsl: Remove const decoration from inlined function parameters
The constness of the function parameter gets inlined with the rest of
the function.  However, there is also an assignment to the parameter.
If this occurs inside a loop the loop analysis code will get confused
by the assignment to a read-only variable.

Fixes bugzilla #30552.

NOTE: this is a candidate for the 7.9 branch.
2010-10-08 14:29:11 -07:00
Ian Romanick dc459f8756 intel: Enable GL_ARB_explicit_attrib_location 2010-10-08 14:21:23 -07:00
Ian Romanick dbc6c9672d main: Enable GL_ARB_explicit_attrib_location for swrast 2010-10-08 14:21:23 -07:00
Ian Romanick 68a4fc9d5a glsl: Add linker support for explicit attribute locations 2010-10-08 14:21:23 -07:00
Ian Romanick eee68d3631 glsl: Track explicit location in AST to IR translation 2010-10-08 14:21:23 -07:00
Ian Romanick 2b45ba8bce glsl: Regenerate files changed by previous commit 2010-10-08 14:21:23 -07:00
Ian Romanick 7f68cbdc4d glsl: Add parser support for GL_ARB_explicit_attrib_location layouts
Only layout(location=#) is supported.  Setting the index requires GLSL
1.30 and GL_ARB_blend_func_extended.
2010-10-08 14:21:22 -07:00
Ian Romanick eafebed5bd glcpp: Regenerate files changed by previous commit 2010-10-08 14:21:22 -07:00
Ian Romanick e0c9f67be5 glcpp: Add the define for ARB_explicit_attrib_location when present 2010-10-08 14:21:22 -07:00
Ian Romanick 5ed6610d11 glsl: Regenerate files modified by previous commits 2010-10-08 14:21:22 -07:00