Dave Airlie
6da8129b3c
r600g: add missing eg reg definition
2010-10-13 17:45:10 +10:00
Dave Airlie
92e729bba5
r600g: evergreen add stencil export bit
2010-10-13 17:40:32 +10:00
Dave Airlie
88c1b32c62
r600g: use blitter for hw copy region
...
at the moment depth copies are failing (piglit depth-level-clamp)
so use the fallback for now until I get some time to investigate.
2010-10-13 15:55:49 +10:00
Dave Airlie
f8778eeb40
r600g: drop all use of unsigned long
...
this changes size on 32/64 bit so is definitely not what you want to use here.
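As an illustration of the size problem (a Python sketch using the standard ctypes module, not code from this commit): the width of C's unsigned long follows the platform ABI, while an explicitly sized type such as uint32_t is the same everywhere.

```python
import ctypes

# Width of C's 'unsigned long' depends on the ABI: typically 4 bytes on
# 32-bit Linux and on Windows, 8 bytes on 64-bit Linux.
ulong_bytes = ctypes.sizeof(ctypes.c_ulong)

# A fixed-width type has the same size on every platform.
u32_bytes = ctypes.sizeof(ctypes.c_uint32)

print("unsigned long:", ulong_bytes, "bytes; uint32_t:", u32_bytes, "bytes")
```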
2010-10-13 15:55:48 +10:00
Dave Airlie
e9acf9a3bb
r600g: fix transfer stride.
...
fixes segfaults
2010-10-13 15:55:48 +10:00
Dave Airlie
e3b089126c
r600g: remove bpt and start using pitch_in_bytes/pixels.
...
this mirrors changes in r300g, bpt is kinda useless when it comes to some
of the non-simple texture formats.
2010-10-13 15:55:48 +10:00
Dave Airlie
fa797f12b3
r600g: rename pitch in texture to pitch_in_bytes
2010-10-13 15:55:47 +10:00
Dave Airlie
6a0066a69f
r600g: use common texture object create function
2010-10-13 15:55:47 +10:00
Dave Airlie
771dd89881
r600g: split out miptree setup like r300g
...
just a cleanup step towards tiling
2010-10-13 15:55:47 +10:00
Dave Airlie
9979d60c0e
r600g: add copy into tiled texture
2010-10-13 15:55:46 +10:00
Dave Airlie
5604276670
r600g: the vs/ps const arrays weren't actually being used.
...
completely removed them.
2010-10-13 15:56:12 +10:00
Dave Airlie
d59498b780
r600g: reduce size of context structure.
...
this thing will be in the cache a lot, so having massive big struct
arrays inside it won't be helping anyone.
2010-10-13 15:25:00 +10:00
Vinson Lee
8c107e6ca6
tdfx: Silence unused variable warning on non-debug builds.
...
Fixes this GCC warning.
tdfx_texman.c: In function 'tdfxTMMoveOutTM_NoLock':
tdfx_texman.c:897: warning: unused variable 'shared'
2010-10-12 22:23:21 -07:00
Dave Airlie
c8d4108fbe
r600g: store samplers/views across blit when we need to modify them
...
also fixup framebuffer state copies to avoid bad state.
2010-10-13 15:11:30 +10:00
Dave Airlie
a8d1d7253e
r600g: fix scissor/cliprect confusion
...
gallium calls them scissors, but r600 hw like r300 is better off using
cliprects to implement them as we can turn them on/off a lot easier.
2010-10-13 15:11:30 +10:00
Dave Airlie
833b4fc11e
r600g: fix depth0 setting
2010-10-13 15:11:30 +10:00
Vinson Lee
71fa3f8fe2
r300: Silence uninitialized variable warning.
...
Fixes this GCC warning.
r300_state.c: In function 'r300InvalidateState':
r300_state.c:2247: warning: 'hw_format' may be used uninitialized in this function
r300_state.c:2247: note: 'hw_format' was declared here
2010-10-12 22:02:27 -07:00
Brian Paul
39de9251c4
mesa: reformatting, comments, code movement
2010-10-12 19:04:05 -06:00
Brian Paul
048a90c1cb
draw/llvmpipe: replace DRAW_MAX_TEXTURE_LEVELS with PIPE_MAX_TEXTURE_LEVELS
...
There's no apparent reason for the former to exist. And they didn't
even have the same value.
2010-10-12 19:04:05 -06:00
Brian Paul
50f221a01b
gallivm: remove newlines
2010-10-12 19:04:05 -06:00
Roland Scheidegger
c1549729ce
gallivm: fix different handling of [non]normalized coords in linear soa path
...
There seems to be no reason for it, so do same math for both
(except the scale mul, of course).
2010-10-13 02:35:05 +02:00
Brian Paul
1ca5f7cc31
mesa: remove assertion w/ undeclared variable texelBytes
2010-10-12 18:32:06 -06:00
Dave Airlie
5f612f5c00
st/mesa: enable stencil shader export extension if supported
2010-10-13 09:30:05 +10:00
Dave Airlie
d9671863ea
glsl: add support for shader stencil export
...
This adds proper support for the GL_ARB_shader_stencil_export extension
to the GLSL compiler. Thanks to Ian for pointing out where I need to add things.
2010-10-13 09:30:05 +10:00
Dave Airlie
39d1feb51e
r600g: add shader stencil export support.
2010-10-13 09:30:05 +10:00
Dave Airlie
40acb109de
r600g: add support for S8, X24S8 and S8X24 sampler formats.
2010-10-13 09:30:04 +10:00
Dave Airlie
ef8bb7ada9
st/mesa: use shader stencil export to accelerate shader drawpixels.
...
If the pipe driver has shader stencil export we can accelerate DrawPixels
using it. It tries to pick an S8 texture and works its way to X24S8 and S8X24
if that isn't supported.
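The fallback order described above can be sketched as follows. This is a Python illustration only: the format names are plain strings and is_supported is a hypothetical callback standing in for whatever query the pipe driver exposes.

```python
# Candidate formats, in the order the commit message gives them.
STENCIL_CANDIDATES = ("S8", "X24S8", "S8X24")


def pick_stencil_format(is_supported):
    """Return the first candidate the driver reports as supported, else None."""
    for fmt in STENCIL_CANDIDATES:
        if is_supported(fmt):
            return fmt
    return None  # nothing usable: the caller keeps the non-accelerated path
```

For example, a driver that lacks S8 but has X24S8 would end up with X24S8.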
2010-10-13 09:30:04 +10:00
Dave Airlie
06642c6175
st/mesa: add option to choose a texture format that we won't render to.
...
We need a texture to put the drawpixels stuff into, an S8 texture is less
memory/bandwidth than the 32-bit X24S8, but we might not be able to render
directly to an S8, so this lets us specify we won't be rendering to this
texture.
2010-10-13 09:30:04 +10:00
Dave Airlie
d8f6ef4565
softpipe: add support for shader stencil export capability
...
this allows softpipe to be used to test shader stencil ref exporting.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-13 09:30:04 +10:00
Dave Airlie
c79e681a68
mesa: improve texstore for 8/24 formats and add texstore for S8.
...
this improves mesa texstore for 8/24 so it can create S24X8/X24S8 variants
by keeping the depth bits static.
it also adds a texstore for S8 so we can write out an S8 texture to use
in the sampler for accel draw pixels to save memory bw.
The logic seems sound here, I've worked it out a few times on paper, though
it would be good to have some review.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-13 09:30:04 +10:00
Dave Airlie
bec341d00c
mesa: add support for FRAG_RESULT_STENCIL.
...
this is needed to add support for stencil shader export.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-13 09:30:03 +10:00
Dave Airlie
d02993c9dc
gallium/util: add S8 tile sampling support.
2010-10-13 09:30:03 +10:00
Dave Airlie
67e71429f1
gallium/format: add X32_S8X24_USCALED format.
...
Has similar use cases to the S8X24 and X24S8 formats.
2010-10-13 09:30:03 +10:00
Dave Airlie
66a0d1e4b9
gallium/format: add support for X24S8 and S8X24 formats.
...
these formats are needed for hw that can sample and write stencil values.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-13 09:30:03 +10:00
Dave Airlie
4ecb2c105d
gallium/tgsi: add support for stencil writes.
...
this adds the capability + a stencil semantic id, + tgsi scan support.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-13 09:30:02 +10:00
Eric Anholt
43873b53c4
i965: Don't rebase the index buffer to min 0 if any arrays are in VBOs.
...
There was a check to only do the rebase if we didn't have everything
in VBOs, but nexuiz apparently hands us a mix of VBOs and arrays,
resulting in blocking on the GPU to do a rebase.
Improves nexuiz 800x600, high-settings performance on my Ironlake 41%
(+/- 1.3%), from 14.0fps to 19.7fps.
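A sketch of the condition change, in Python with hypothetical helper names (the driver itself is C and checks more than this). Each list element says whether one vertex array lives in a VBO.

```python
def should_rebase_old(arrays_in_vbo):
    # Old check: rebase unless every array lives in a VBO.
    return not all(arrays_in_vbo)


def should_rebase_new(arrays_in_vbo):
    # New check: rebase only when no array lives in a VBO.
    return not any(arrays_in_vbo)
```

The two differ only for a mix of VBO and non-VBO arrays, which is the nexuiz case: the old check rebases, the new one does not.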
2010-10-12 15:17:35 -07:00
Eric Anholt
3316a54205
intel: Allow CopyTexSubImage to InternalFormat 3/4 textures, like RGB/RGBA.
...
The format selection of the CopyTexSubImage is pretty bogus still, but
this at least avoids software fallbacks in nexuiz, bringing
performance from 7.5fps to 12.8fps on my machine.
2010-10-12 14:08:00 -07:00
Eric Anholt
080e7aface
i965: Fix missing "break;" in i2b/f2b, and missing AND of CMP result.
...
Fixes glsl-fs-i2b.
2010-10-12 13:07:40 -07:00
Ian Romanick
9fea9e5e21
glsl: Fix incorrect assertion
...
This assertion was added in commit f1c1ee11, but it did not notice
that the array is accessed with 'size-1' instead of 'size'. As a
result, the assertion was off by one. This caused failures in at
least glsl-orangebook-ch06-bump.
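A minimal sketch of this kind of off-by-one, assuming the assertion compared size directly against the array length (the names and the exact original condition are assumptions, not taken from the commit):

```python
def check_old(size, capacity):
    # Too strict: rejects size == capacity even though the element actually
    # touched, index size - 1, is still inside the array.
    return size < capacity


def check_fixed(size, capacity):
    # Validate the index that is really used.
    return 0 <= size - 1 < capacity
```

With capacity 4, size 4 touches index 3, which is valid; check_old rejects it and check_fixed accepts it.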
2010-10-12 12:50:29 -07:00
Ian Romanick
b2b9b22c10
mesa: Validate assembly shaders when GLSL shaders are used
...
If a GLSL shader is used that does not provide all stages and
assembly shaders are provided for the missing stages, validate the
assembly shaders.
Fixes bugzilla #30787 and piglit tests glsl-invalid-asm0[12].
NOTE: this is a candidate for the 7.9 branch.
2010-10-12 10:54:28 -07:00
Keith Whitwell
7533c37457
llvmpipe: make sure intrinsics code is guarded with PIPE_ARCH_SSE
2010-10-12 18:28:12 +01:00
Thomas Hellstrom
893620e52e
st/xorg: Fix typo
...
Pointed out by Jakob Bornecrantz.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 18:26:05 +02:00
Brian Paul
f1c1ee11d3
ir_to_mesa: assorted clean-ups, const qualifiers, new comments
2010-10-12 09:26:54 -06:00
José Fonseca
6fbd4faf97
gallivm: Name anonymous union.
2010-10-12 16:08:09 +01:00
Brian Paul
0ad9d8b538
st/xlib: add some comments
2010-10-12 08:54:54 -06:00
Brian Paul
3633e1f538
glsl2: fix signed/unsigned comparison warning
2010-10-12 08:54:16 -06:00
José Fonseca
e3ec0fdd54
llvmpipe: improve mm_mullo_epi32
...
Apply Jose's suggestions for a small but measurable improvement in
isosurf.
2010-10-12 14:17:21 +01:00
Thomas Hellstrom
b6b7ce84e5
st/xorg: Don't try to remove invalid fbs
...
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 15:09:05 +02:00
Thomas Hellstrom
201c3d3669
xorg/vmwgfx: Don't hide HW cursors when updating them
...
Gets rid of annoying cursor flicker
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 15:09:05 +02:00
Thomas Hellstrom
bfd065c71e
st/xorg: Add a customizer option to get rid of annoying cursor update flicker
...
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 15:09:05 +02:00
Thomas Hellstrom
f0bbf130f9
xorg/vmwgfx: Make vmwarectrl work also on 64-bit servers
...
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 15:09:04 +02:00
Thomas Hellstrom
ec08047a80
st/xorg: Don't try to use option values before processing options
...
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
2010-10-12 15:09:04 +02:00
Keith Whitwell
0ca0382d1b
Revert "llvmpipe: try to keep plane c values small"
...
This reverts commit 9773722c2b.
Looks like there are some floor/rounding issues here that need
to be better understood.
2010-10-12 13:20:39 +01:00
Keith Whitwell
22ec25e2bf
gallivm: don't branch on KILLs near end of shader
2010-10-12 13:14:51 +01:00
Keith Whitwell
d0eb854f58
r600g: add missing file to sconscript
2010-10-12 13:08:34 +01:00
Keith Whitwell
1a574afabc
gallium: move sse intrinsics debug helpers to u_sse.h
2010-10-12 13:02:28 +01:00
José Fonseca
39331be44e
llvmpipe: Fix MSVC build.
...
MSVC doesn't accept more than 3 __m128i arguments.
2010-10-12 12:27:55 +01:00
Keith Whitwell
b4277bc584
llvmpipe: fix typo in last commit
2010-10-12 11:52:39 +01:00
Keith Whitwell
9773722c2b
llvmpipe: try to keep plane c values small
...
Avoid accumulating more and more fixed point bits.
2010-10-12 11:50:14 +01:00
Keith Whitwell
9d59e148f8
llvmpipe: add debug helpers for epi32 etc
2010-10-12 11:50:13 +01:00
Keith Whitwell
2cf98d5a6d
llvmpipe: try to do more of rast_tri_3_16 with intrinsics
...
There was actually a large quantity of scalar code in these functions
previously. This tries to move more into intrinsics.
Introduce an sse2 mm_mullo_epi32 replacement to avoid sse4 dependency
in the new rasterization code.
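One common way to get a 32-bit low multiply out of SSE2 alone is to do two unsigned 32x32->64 multiplies (even lanes, then odd lanes shifted down) and keep the low halves. The Python model below illustrates that general technique on 4-lane lists; the helper names are hypothetical and it is not necessarily the exact sequence used in this commit.

```python
MASK32 = 0xFFFFFFFF


def mul_epu32(a, b):
    """Model of an unsigned multiply of lanes 0 and 2, giving two 64-bit products."""
    return [(a[0] & MASK32) * (b[0] & MASK32),
            (a[2] & MASK32) * (b[2] & MASK32)]


def shift_right_one_lane(v):
    """Model of a 4-byte right shift of the whole register: lane i+1 moves to lane i."""
    return [v[1], v[2], v[3], 0]


def mullo_epi32_sse2(a, b):
    """Low 32 bits of each lane-wise product, built only from the helpers above."""
    even = mul_epu32(a, b)                                # lanes 0 and 2
    odd = mul_epu32(shift_right_one_lane(a),
                    shift_right_one_lane(b))              # lanes 1 and 3
    return [even[0] & MASK32, odd[0] & MASK32,
            even[1] & MASK32, odd[1] & MASK32]
```

The low 32 bits of a product are the same for signed and unsigned inputs, which is why the unsigned multiply is enough here.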
2010-10-12 11:50:07 +01:00
José Fonseca
4cb3b4ced8
llvmpipe: Do not dispose the execution engine.
...
The engine is a global owned by gallivm module.
2010-10-12 08:36:51 +01:00
Francisco Jerez
c25fcf5aa5
nouveau: Get larger push buffers.
...
Useful to amortize the command submission/reloc overhead (e.g. etracer
goes from 72 to 109 FPS on nv4b).
2010-10-12 04:13:22 +02:00
Francisco Jerez
70828aa246
dri/nouveau: Initialize tile_flags when allocating a render target.
2010-10-12 04:12:56 +02:00
Dave Airlie
965f69cb0c
r600g: fix typo in vertex sampling on r600
...
fixes https://bugs.freedesktop.org/show_bug.cgi?id=30771
Reported-by: Kevin DeKorte
2010-10-12 09:45:22 +10:00
Eric Anholt
bcec03d527
i965: Always use the new FS backend on gen6.
...
It's now much more correct for gen6 than the old backend, with just 2
regressions I've found (one of which is common with pre-gen6 and will
be fixed by an array splitting IR pass).
This does leave the old Mesa IR backend getting used still when we
don't have GLSL IR, but the plan is to get GLSL IR input to the driver
for the ARB programs and fixed function by the next release.
2010-10-11 15:32:41 -07:00
Eric Anholt
0cadd32b6d
i965: Fix gen6 pixel_[xy] setup to avoid mixing int and float src operands.
...
Pre-gen6, you could mix int and float just fine. Now, you get goofy
results.
Fixes:
glsl-arb-fragment-coord-conventions
glsl-fs-fragcoord
glsl-fs-if-greater
glsl-fs-if-greater-equal
glsl-fs-if-less
glsl-fs-if-less-equal
2010-10-11 15:26:59 -07:00
Eric Anholt
17306c60ad
i965: Don't compute-to-MRF in gen6 VS math.
...
There was code to do this for pre-gen6 already, this just enables it
for gen6 as well.
2010-10-11 15:26:59 -07:00
Eric Anholt
720ed3c906
i965: Expand uniform args to gen6 math to full registers to get hstride == 1.
...
This is a hw requirement in math args. This also is inefficient, as
we're calculating the same result 8 times, but then we've been doing
that on pre-gen6 as well. If we're doing math on uniforms, though,
we'd probably be better served by having some sort of mechanism for
precalculating those results into another uniform value to use.
Fixes 7 piglit math tests.
2010-10-11 15:26:58 -07:00
Eric Anholt
317dbf4613
i965: Don't compute-to-MRF in gen6 math instructions.
2010-10-11 15:26:58 -07:00
Eric Anholt
7b5bc38c44
i965: Add a couple of checks for gen6 math instruction limits.
2010-10-11 15:26:58 -07:00
Eric Anholt
25cf241540
i965: Don't consider gen6 math instructions to write to MRFs.
...
This was leftover from the pre-gen6 cleanups. One test regresses
where compute-to-MRF now occurs.
2010-10-11 15:26:58 -07:00
Chad Versace
41c2079855
glsl: Changes in generated file glsl_lexer.cpp
...
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
2010-10-11 14:25:53 -07:00
Chad Versace
0c9fef6111
glsl: Add lexer rules for uint and uvecN (N=2..4)
...
Commit for generated file glsl_lexer.cpp follows this commit.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2010-10-11 14:25:48 -07:00
Chad Versace
fc99a3beb9
glsl: Add glsl_type::uvecN_type for N=2,3
...
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2010-10-11 14:25:44 -07:00
Chad Versace
a34817917b
intel_extensions: Add ability to set GLSL version via environment
...
Add ability to set the GLSL version used by the GLcontext by setting the
environment variable INTEL_GLSL_VERSION. For example,
env INTEL_GLSL_VERSION=130 prog args
If the environment variable is missing, the GLSL version defaults to 120.
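The lookup behaves roughly like the Python sketch below. The function name is hypothetical, the driver code is C, and how it treats a malformed value is not stated here (this sketch simply lets int() raise).

```python
import os


def glsl_version_from_env(environ=os.environ, default=120):
    """Return the integer in INTEL_GLSL_VERSION, or the default if it is unset."""
    value = environ.get("INTEL_GLSL_VERSION")
    if value is None:
        return default
    return int(value)
```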
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2010-10-11 14:25:30 -07:00
Daniel Vetter
603741a86d
r200: revalidate after radeon_update_renderbuffers
...
By calling radeon_draw_buffers (which sets the necessary flags
in radeon->NewGLState) and revalidating if NewGLState is non-zero
in r200TclPrimitive. This fixes an assert in libdrm (the color-/
depthbuffer was changed but not yet validated) and stops the
kernel cs checker from complaining about them (when they're too
small).
Thanks to Mario Kleiner for the hint to call radeon_draw_buffer
(instead of my half-broken hack).
v2: Also fix the swtcl r200 path.
Cc: Mario Kleiner <mario.kleiner@tuebingen.mpg.de>
Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch>
2010-10-11 15:16:44 -04:00
Eric Anholt
c6dbf253d2
i965: Compute to MRF in the new FS backend.
...
This didn't produce a statistically significant performance difference
in my demo (n=4) or nexuiz (n=3), but it still seems like a good idea
and is recommended by the HW team.
2010-10-11 12:08:13 -07:00
Eric Anholt
06fd639c51
i965: Give the FB write and texture opcodes the info on base MRF, like math.
2010-10-11 12:07:33 -07:00
Eric Anholt
0cd6cea8a3
i965: Give the math opcodes information on base mrf/mrf len.
...
This is progress towards enabling a compute-to-MRF pass.
2010-10-11 12:03:34 -07:00
Eric Anholt
37758fb1cb
i965: Move FS backend structures to a header.
...
It's time to start splitting some of this up.
2010-10-11 11:52:02 -07:00
Eric Anholt
251fe27854
i965: Reduce register interference checks for changed FS_OPCODE_DISCARD.
...
While I don't know of any performance changes from this (one extra
reg available out of 128), it makes the generated asm a lot cleaner
looking.
2010-10-11 11:52:01 -07:00
Eric Anholt
90c4022040
i965: Split FS_OPCODE_DISCARD into two steps.
...
Having the single opcode write then read the reg meant that single
instruction opcodes had to consider their source regs to interfere
with their dest regs.
2010-10-11 11:52:01 -07:00
José Fonseca
986cb9d5cf
llvmpipe: Use lp_tgsi_info.
2010-10-11 13:06:25 +01:00
José Fonseca
7c1b5772a8
gallivm: More detailed analysis of tgsi shaders.
...
To allow more optimizations, in particular for direct textures.
2010-10-11 13:05:32 +01:00
José Fonseca
11dad21718
tgsi: Export some names for some tgsi enums.
...
Useful to give human legible names in other cases.
2010-10-11 13:05:31 +01:00
José Fonseca
6c1aa4fd49
gallium: Define C99 restrict keyword where absent.
2010-10-11 13:05:31 +01:00
José Fonseca
e1003336f0
gallivm: Eliminate unsigned integer arithmetic from texture coordinates.
...
SSE support for 32bit and 16bit unsigned arithmetic is not complete, and
can easily result in inefficient code.
In most cases signed/unsigned doesn't make a difference, such as for
integer texture coordinates.
So remove uint_coord_type and uint_coord_bld to keep inefficient
operations from sneaking in in the future.
2010-10-11 08:14:09 +01:00
José Fonseca
b18fecbd0e
llvmpipe: Remove outdated comment about stencil testing.
2010-10-11 08:14:09 +01:00
Dave Airlie
3322416de4
r600g: don't run with scissors.
...
This could probably be done much nicer, I've spent a day chasing
a coherency problem in the kernel, that turned out to be incorrect
scissor setup.
2010-10-11 16:23:23 +10:00
Dave Airlie
ef2702fb20
r600g: add TXL opcode support.
...
fixes glsl1-2D Texture lookup with explicit lod (Vertex shader)
2010-10-11 12:18:05 +10:00
Dave Airlie
ea1d818b58
r600g: enable vertex samplers.
...
We need to move the texture sampler resources out of the range of the vertex attribs.
We could probably improve this using an allocator but this is the simple answer for now.
makes mesa-demos/src/glsl/vert-tex work.
2010-10-11 11:59:53 +10:00
Dave Airlie
2c47f302af
r600g: evergreen has no request size bit in texture word4
2010-10-11 11:59:53 +10:00
Dave Airlie
bd89da79a1
r600g: fix input/output Z export mixup for evergreen.
2010-10-11 11:59:53 +10:00
José Fonseca
17dbd41cf2
gallivm: Pass texture coords derivatives as scalars.
...
We end up treating them as scalars in the end, and it saves some
instructions.
2010-10-10 19:51:35 +01:00
José Fonseca
693667bf88
gallivm: Use variables instead of Phis in loops.
...
With this commit all explicit Phi emission is now gone.
2010-10-10 19:05:05 +01:00
José Fonseca
48003f3567
gallivm: Allow to disable bri-linear filtering with GALLIVM_DEBUG=no_brilinear runtime option
2010-10-10 18:48:02 +01:00
José Fonseca
124adf253c
gallivm: Fix a long standing bug with nested if-then-else emission.
...
We can't patch true-block at end-if time, as there is no guarantee that
the block at the beginning of the true stanza is the same at the end of
the true stanza -- other control flow elements may have been emitted halfway
through the true stanza.
Although this bug surfaced recently with the commit to skip mip filtering
when lod is an integer the bug was always there, although probably it
was avoided until now: e.g., cubemap selection nests if-then-else on the
else stanza, which does not suffer from the same problem.
2010-10-10 18:48:02 +01:00
delphi
08f890d4c3
draw: some changes to allow for runtime changes to userclip planes
2010-10-10 08:40:11 +01:00
Francisco Jerez
e2acc7be26
dri/nv10: Fake fast Z clears for pre-nv17 cards.
2010-10-10 04:14:34 +02:00
Francisco Jerez
35a1893fd1
dri/nouveau: Minor cleanup.
2010-10-10 01:48:01 +02:00
José Fonseca
307df6a858
gallivm: Cleanup the rest of the flow module.
2010-10-09 21:39:14 +01:00
José Fonseca
d0ea464159
gallivm: Simplify if/then/else implementation.
...
No need for a flow stack anymore.
2010-10-09 21:14:05 +01:00
José Fonseca
1949f8c315
gallivm: Factor out the SI->FP texture size conversion for SoA path too
2010-10-09 20:26:11 +01:00
José Fonseca
d45c379027
gallivm: Remove support for Phi generation.
...
Simply rely on mem2reg pass. It's easier and more reliable.
2010-10-09 20:14:03 +01:00
José Fonseca
ea7b49028b
gallivm: Use variables instead of Phis for cubemap selection.
2010-10-09 19:53:21 +01:00
José Fonseca
cc40abad51
gallivm: Don't generate Phis for execution mask.
2010-10-09 12:55:31 +01:00
José Fonseca
679dd26623
gallivm: Special bri-linear computation path for unmodified rho.
2010-10-09 12:13:00 +01:00
José Fonseca
81a09c8a97
gallivm: Less code duplication in log computation.
2010-10-09 12:12:59 +01:00
José Fonseca
52427f0ba7
util: Define M_SQRT2 when not available.
2010-10-09 12:12:59 +01:00
José Fonseca
53d7f5e107
gallivm: Handle code that has ret correctly.
...
Stop disassembling on unconditional backwards jumps.
2010-10-09 12:12:59 +01:00
José Fonseca
edba53024f
llvmpipe: Fix MSVC build. Enable the new SSE2 code on non SSE3 systems.
2010-10-09 12:12:58 +01:00
Keith Whitwell
2de720dc8f
llvmpipe: simplified SSE2 swz/unswz routines
...
We've been using these in the linear path for a while now. Based on
Chris's SSSE3 code, but using only sse2 opcodes. Speed seems to be
identical, but code is simpler & removes dependency on SSE3.
Should be easier to extend to other rgba8 formats.
2010-10-09 12:12:58 +01:00
Keith Whitwell
5b7eb868fd
llvmpipe: clean up shader pre/postamble, try to catch more early-z
...
Specifically, can do early-depth-test even when alphatest or
kill-pixel are active, providing we defer the actual z write until the
final mask is available.
Improves demos/fire.c especially in the case where you get close to
the trees.
2010-10-09 11:44:45 +01:00
Keith Whitwell
aa4cb5e2d8
llvmpipe: try to be sensible about whether to branch after mask updates
...
Don't branch more than once in quick succession. Don't branch at the
end of the shader.
2010-10-09 11:44:45 +01:00
Keith Whitwell
2ef6f75ab4
gallivm: simpler uint8->float conversions
...
LLVM seems to find it easier to reason about these than our
mantissa-manipulation code.
2010-10-09 11:44:45 +01:00
Keith Whitwell
c79f162367
gallivm: prefer blendvb for integer arguments
2010-10-09 11:44:45 +01:00
Keith Whitwell
d2cf757f44
gallivm: specialized x8z24 depthtest path
...
Avoid unnecessary masking of non-existent stencil component.
2010-10-09 11:44:09 +01:00
Keith Whitwell
954965366f
llvmpipe: dump fragment shader ir and asm when LP_DEBUG=fs
...
Better than GALLIVM_DEBUG if you're only interested in fragment shaders.
2010-10-09 11:43:23 +01:00
Keith Whitwell
6da29f3611
llvmpipe: store zero into all alloca'd values
...
Fixes slowdown in isosurf with earlier versions of llvm.
2010-10-09 11:43:23 +01:00
Keith Whitwell
40d7be5261
llvmpipe: use alloca for fs color outputs
...
Don't try to emit our own phi's, let llvm mem2reg do it for us.
2010-10-09 11:43:23 +01:00
Keith Whitwell
8009886b00
llvmpipe: defer attribute interpolation until after mask and ztest
...
Don't calculate 1/w for quads which aren't visible...
2010-10-09 11:42:48 +01:00
José Fonseca
d0bfb3c514
llvmpipe: Prevent z > 1.0
...
The current interpolation scheme causes precision loss.
Changing the operation order helps, but does not completely avoid the
problem.
The only short term solution is to clamp z to 1.0.
This is unfortunate, but probably unavoidable until interpolation is
improved.
2010-10-09 09:35:41 +01:00
José Fonseca
34c11c87e4
gallivm: Do size computations simultaneously for all dimensions (AoS).
...
Operate simultaneously on <width, height, depth> vector as much as possible,
instead of doing the operations on vectors with broadcasted scalars.
Also do the 24.8 fixed point scale with integer shift of the texture size,
for unnormalized coordinates.
AoS path only for now -- the same thing can be done for SoA.
2010-10-09 09:34:31 +01:00
Zack Rusin
6316d54056
llvmpipe: fix rasterization of vertical lines on pixel boundaries
2010-10-09 08:19:21 +01:00
Vinson Lee
e7843363a5
i965: Initialize member variables.
...
Fixes these GCC warnings.
brw_wm_fp.c: In function 'search_or_add_const4f':
brw_wm_fp.c:92: warning: 'reg.Index2' is used uninitialized in this function
brw_wm_fp.c:84: note: 'reg.Index2' was declared here
brw_wm_fp.c:92: warning: 'reg.RelAddr2' is used uninitialized in this function
brw_wm_fp.c:84: note: 'reg.RelAddr2' was declared here
2010-10-08 16:40:29 -07:00
Vinson Lee
5abd498c47
i965: Silence unused variable warning on non-debug builds.
...
Fixes this GCC warning.
brw_vs.c: In function 'do_vs_prog':
brw_vs.c:46: warning: unused variable 'ctx'
2010-10-08 16:30:59 -07:00
Vinson Lee
978ffa1d61
i965: Silence unused variable warning on non-debug builds.
...
Fixes this GCC warning.
brw_eu_emit.c: In function 'brw_math2':
brw_eu_emit.c:1189: warning: unused variable 'intel'
2010-10-08 16:02:59 -07:00
Vinson Lee
220c0834a4
i915: Silence unused variable warning in non-debug builds.
...
Fixes this GCC warning.
i915_vtbl.c: In function 'i915_assert_not_dirty':
i915_vtbl.c:670: warning: unused variable 'dirty'
2010-10-08 15:49:02 -07:00
Roland Scheidegger
ff72c79924
gallivm: make use of new iround code in lp_bld_conv.
...
Only requires sse2 now.
2010-10-09 00:36:38 +02:00
Roland Scheidegger
175cdfd491
gallivm: optimize soa linear clamp to edge wrap mode a bit
...
Clamp against 0 instead of -0.5, which simplifies things.
The former version would have resulted in both int coords being zero
(in case of coord being smaller than 0) and some "unused" weight value,
whereas now the int coords will be 0 and 1, but weight will be 0, hence the
lerp should produce the same value.
Still not happy about differences between normalized and non-normalized...
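The equivalence argued above can be checked with a small scalar model. This is a hedged 1D Python sketch with hypothetical names: coord is the texel-space coordinate after the half-texel offset, and the upper edge is assumed to be already clamped to the last texel.

```python
import math


def lerp(a, b, w):
    return a + w * (b - a)


def sample_old(texels, coord):
    # Clamp against -0.5: below zero both indices collapse to texel 0 and the
    # weight is computed but has no effect on the result.
    coord = max(coord, -0.5)
    i = math.floor(coord)
    w = coord - i
    last = len(texels) - 1
    i0 = min(max(i, 0), last)
    i1 = min(max(i + 1, 0), last)
    return lerp(texels[i0], texels[i1], w)


def sample_new(texels, coord):
    # Clamp against 0: below zero the indices are 0 and 1 but the weight is 0.
    coord = max(coord, 0.0)
    i0 = math.floor(coord)
    w = coord - i0
    i1 = min(i0 + 1, len(texels) - 1)
    return lerp(texels[i0], texels[i1], w)
```

Both functions return texel 0 for any coordinate below zero and agree everywhere else, which is the "lerp should produce the same value" claim.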
2010-10-09 00:36:38 +02:00
Roland Scheidegger
2cc6da85d6
gallivm: avoid unnecessary URem in linear wrap repeat case
...
Haven't looked at what code this exactly generates but URem can't be fast.
Instead of using two URem only use one and replace the second one with
select/add (this is what the corresponding aos code already does).
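In scalar terms the change looks roughly like this. It is a Python sketch with hypothetical names, for non-negative integer texel coordinates; the generated code works on vectors.

```python
def wrap_repeat_two_urem(coord0, size):
    # Each of the two neighbouring texel indices is wrapped with its own remainder.
    return coord0 % size, (coord0 + 1) % size


def wrap_repeat_one_urem(coord0, size):
    # One remainder only; the neighbour is wrapped with an add and a select.
    c0 = coord0 % size
    c1 = c0 + 1
    c1 = 0 if c1 == size else c1
    return c0, c1
```

Since c0 is already in [0, size), c0 + 1 can overshoot by at most one, so a compare and select is enough for the second index.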
2010-10-09 00:36:38 +02:00
Roland Scheidegger
318bb080b0
gallivm: more linear tex wrap mode calculation simplification
...
Rearrange order of operations a bit to make some clamps easier.
All calculations should be equivalent.
Note there seems to be some inconsistency in the clamp to edge case
wrt normalized/non-normalized coords, could potentially simplify this too.
2010-10-09 00:36:38 +02:00
Roland Scheidegger
99ade19e6e
gallivm: optimize some tex wrap mode calculations a bit
...
Sometimes coords are clamped to positive numbers before doing conversion
to int, or clamped to 0 afterwards, in this case can use itrunc
instead of ifloor which is easier. This is only the case for nearest
calculations unfortunately, except linear MIRROR_CLAMP_TO_EDGE which
for the same reason can use an unsigned float build context so the
ifloor_fract helper can reduce this to itrunc in the ifloor helper itself.
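The reason the substitution is safe only for non-negative inputs, shown with Python's math module (ifloor and itrunc here are stand-in names for the float-to-int conversions, not the gallivm helpers themselves):

```python
import math


def ifloor(x):
    return math.floor(x)   # rounds toward negative infinity


def itrunc(x):
    return math.trunc(x)   # rounds toward zero
```

The two agree for every x >= 0 and differ for negative non-integers, for example -1.5 floors to -2 but truncates to -1.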
2010-10-09 00:36:38 +02:00
Roland Scheidegger
1e17e0c4ff
gallivm: replace sub/floor/ifloor combo with ifloor_fract
2010-10-09 00:36:37 +02:00
Roland Scheidegger
cb3af2b434
gallivm: faster iround implementation for sse2
...
sse2 supports round to nearest directly (or rather, assuming default nearest
rounding mode in MXCSR). Use intrinsic to use this rather than round (sse41)
or bit manipulation whenever possible.
2010-10-09 00:36:37 +02:00
Roland Scheidegger
0ed8c56bfe
gallivm: fix trunc/itrunc comment
...
trunc of -1.5 is -1.0 not 1.0...
2010-10-09 00:36:37 +02:00
Vinson Lee
0f4984a0fb
i915: Silence unused variable warning in non-debug builds.
...
Fixes this GCC warning.
i830_vtbl.c: In function 'i830_assert_not_dirty':
i830_vtbl.c:704: warning: unused variable 'i830'
2010-10-08 15:35:35 -07:00
Ian Romanick
0ea8b99332
glsl: Remove const decoration from inlined function parameters
...
The constness of the function parameter gets inlined with the rest of
the function. However, there is also an assignment to the parameter.
If this occurs inside a loop the loop analysis code will get confused
by the assignment to a read-only variable.
Fixes bugzilla #30552 .
NOTE: this is a candidate for the 7.9 branch.
2010-10-08 14:29:11 -07:00
Ian Romanick
dc459f8756
intel: Enable GL_ARB_explicit_attrib_location
2010-10-08 14:21:23 -07:00
Ian Romanick
dbc6c9672d
main: Enable GL_ARB_explicit_attrib_location for swrast
2010-10-08 14:21:23 -07:00
Ian Romanick
68a4fc9d5a
glsl: Add linker support for explicit attribute locations
2010-10-08 14:21:23 -07:00
Ian Romanick
eee68d3631
glsl: Track explicit location in AST to IR translation
2010-10-08 14:21:23 -07:00
Ian Romanick
2b45ba8bce
glsl: Regenerate files changed by previous commit
2010-10-08 14:21:23 -07:00
Ian Romanick
7f68cbdc4d
glsl: Add parser support for GL_ARB_explicit_attrib_location layouts
...
Only layout(location=#) is supported. Setting the index requires GLSL
1.30 and GL_ARB_blend_func_extended.
2010-10-08 14:21:22 -07:00
Ian Romanick
eafebed5bd
glcpp: Regenerate files changed by previous commit
2010-10-08 14:21:22 -07:00
Ian Romanick
e0c9f67be5
glcpp: Add the define for ARB_explicit_attrib_location when present
2010-10-08 14:21:22 -07:00
Ian Romanick
5ed6610d11
glsl: Regenerate files modified by previous commits
2010-10-08 14:21:22 -07:00
Ian Romanick
e24d35a5b5
glsl: Wrap ast_type_qualifier contents in a struct in a union
...
This will ease adding non-bit fields in the near future.
2010-10-08 14:21:22 -07:00
Ian Romanick
5ff4cfb788
glsl: Clear type_qualifier using memset
2010-10-08 14:21:22 -07:00
Ian Romanick
fd2aa7d313
glsl: Slight refactor of error / warning checking for ARB_fcc layout
2010-10-08 14:21:22 -07:00
Ian Romanick
dd93035a4d
glsl: Refactor 'layout' grammar to match GLSL 1.60 spec grammar
2010-10-08 14:21:22 -07:00
Ian Romanick
4b5489dd6f
glsl: Fail linking if assign_attribute_locations fails
2010-10-08 14:21:22 -07:00
Vinson Lee
3b16c591a4
r600g: Silence uninitialized variable warning.
2010-10-08 14:17:14 -07:00
Vinson Lee
36b65a373a
r600g: Silence uninitialized variable warning.
2010-10-08 14:14:16 -07:00
Vinson Lee
131485efae
r600g: Silence uninitialized variable warning.
2010-10-08 14:08:50 -07:00
Vinson Lee
5e90971475
gallivm: Remove unnecessary header.
2010-10-08 14:03:10 -07:00
Eric Anholt
c52a0b5c7d
i965: Add register coalescing to the new FS backend.
...
Improves performance of my GLSL demo 14.3% (+/- 4%, n=4) by
eliminating the moves used in ir_assignment and ir_swizzle handling.
Still 16.5% to go to catch up to the Mesa IR backend, presumably
because instructions are almost perfectly mis-scheduled now.
2010-10-08 13:22:27 -07:00
Eric Anholt
80c0077a6f
i965: Enable attribute swizzling (repositioning) in the gen6 SF.
...
We were trying to remap a fully-filled array down to only handing the
WM the components it uses. This is called attribute swizzling, and if
you don't enable it you just get 1:1 mappings of inputs to outputs.
This almost fixes glsl-routing, except for the highest gl_TexCoord[]
indices.
2010-10-08 12:00:04 -07:00
Eric Anholt
cac04a9397
i965: Fix new FS gen6 interpolation for sparsely-populated arrays.
...
We'd overwrite the same element twice.
2010-10-08 11:59:19 -07:00
Eric Anholt
624ce6f61b
i965: Fix gen6 WM push constants updates.
...
We would compute a new buffer, but never point the hardware at the new
buffer. This partially fixes glsl-routing, as now it gets the updated
uniform for which attribute to draw.
2010-10-08 11:59:19 -07:00
José Fonseca
3fde8167a5
gallivm: Help for combined extraction and broadcasting.
...
Doesn't change generated code quality, but saves some typing.
2010-10-08 19:48:16 +01:00
José Fonseca
438390418d
llvmpipe: First minify the texture size, then broadcast.
2010-10-08 19:11:52 +01:00
José Fonseca
f5b5fb32d3
gallivm: Move into the as much of the second level code as possible.
...
Also, pass more stuff through the sample build context, instead of
arguments.
2010-10-08 19:11:52 +01:00
Eric Anholt
5b24d69fcd
i965: Handle swizzles in the addition of YUV texture constants.
...
If someone happened to land a set in a different swizzle order, we
would have hit an assertion failure.
2010-10-08 10:24:30 -07:00
Eric Anholt
0534e958c9
i965: Drop the check for YUV constants in the param list.
...
_mesa_add_unnamed_constant() already does that.
2010-10-08 10:24:29 -07:00
Eric Anholt
fa8aba9da4
i965: Drop the check for duplicate _mesa_add_state_reference.
...
_mesa_add_state_reference does that check for us anyway.
2010-10-08 10:24:29 -07:00
Eric Anholt
e310c22bb7
mesa: Simplify a bit of _mesa_add_state_reference using memcmp.
2010-10-08 10:24:29 -07:00
José Fonseca
6b0c79e058
gallivm: Warn when doing inefficient integer comparisons.
2010-10-08 17:43:15 +01:00
José Fonseca
d5ef59d8b0
gallivm: Avoid control flow for two-sided stencil test.
2010-10-08 17:43:15 +01:00
Keith Whitwell
ef3407672e
llvmpipe: fix off-by-one in tri_16
2010-10-08 17:30:08 +01:00
Keith Whitwell
0ff132e5a6
llvmpipe: add rast_tri_4_16 for small lines and points
2010-10-08 17:30:08 +01:00
Keith Whitwell
eeb13e2352
llvmpipe: clean up setup_tri a little
2010-10-08 17:30:08 +01:00
Keith Whitwell
e191bf4a85
gallivm: round rather than truncate in new 4x4f->1x16ub conversion path
2010-10-08 17:30:08 +01:00
José Fonseca
f91b4266c6
gallivm: Use the wrappers for SSE pack intrinsics.
...
Fixes assertion failures on LLVM 2.6.
2010-10-08 17:30:08 +01:00
Keith Whitwell
607e3c542c
gallivm: special case conversion 4x4f to 1x16ub
...
Nice reduction in the number of operations required for final color
output in many shaders.
2010-10-08 17:30:08 +01:00
Keith Whitwell
29d6a1483d
llvmpipe: avoid overflow in triangle culling
...
Avoid multiplying fixed-point values. Calculate triangle area in
floating point and use that for culling.
Lift area calculations up a level as we are already doing this in the
triangle_both() case.
Would like to share the calculated area with attribute interpolation,
but the way the code is structured makes this difficult.
2010-10-08 17:30:08 +01:00
Keith Whitwell
ad6730fadb
llvmpipe: fail gracefully on oom in scene creation
2010-10-08 17:26:29 +01:00
José Fonseca
eb605701aa
gallivm: Implement brilinear filtering.
2010-10-08 15:50:28 +01:00
José Fonseca
c8179ef5e8
gallivm: Fix copy'n'paste typo in previous commit.
2010-10-08 14:09:22 +01:00
José Fonseca
df7a2451b1
gallivm: Clamp mipmap level and zero mip weight simultaneously.
2010-10-08 14:06:38 +01:00
José Fonseca
0d84b64a4f
gallivm: Use lp_build_ifloor_fract for lod computation.
...
Forgot this one before.
2010-10-08 14:06:38 +01:00
José Fonseca
4f2e2ca4e3
gallivm: Don't compute the second mipmap level when frac(lod) == 0
2010-10-08 14:06:37 +01:00
José Fonseca
05fe33b71c
gallivm: Simplify lp_build_mipmap_level_sizes' interface.
2010-10-08 14:06:37 +01:00
José Fonseca
4eb222a3e6
gallivm: Do not do mipfiltering when magnifying.
...
If lod < 0, then it invariably follows that ilevel0 == ilevel1 == 0.
2010-10-08 14:06:37 +01:00
Vinson Lee
1f01f5cfcf
r600g: Remove unnecessary header.
2010-10-08 04:56:49 -07:00
Dave Airlie
8d6a38d7b3
r600g: drop width/height per level storage.
...
these aren't used anywhere, so just waste memory.
2010-10-08 19:55:05 +10:00
Eric Anholt
bbb840049e
i965: Normalize cubemap coordinates like is done in the Mesa IR path.
...
Fixes glsl-fs-texturecube-2-*
2010-10-07 16:41:13 -07:00
Eric Anholt
4d202da7a4
i965: Disable emitting if () statements on gen6 until we really fix them.
2010-10-07 16:41:13 -07:00
Dave Airlie
1ae5cc2e67
r600g: add some RG texture format support.
2010-10-08 09:37:02 +10:00
Kristian Høgsberg
1d595c7cd4
gles2: Add GL_EXT_texture_format_BGRA8888 support
2010-10-07 17:08:50 -04:00
José Fonseca
321ec1a224
gallivm: Vectorize the rho computation.
2010-10-07 22:08:42 +01:00
Dave Airlie
51f9cc4759
r600g: fix Z export enable bits.
...
we should be checking output array not input to decide.
Signed-off-by: Dave Airlie <airlied@redhat.com>
2010-10-07 15:32:05 +10:00
Dave Airlie
97eea87bde
r600g: use format from the sampler view not from the texture.
...
we want to use the format from the sampler view which isn't always the
same as the texture format when creating sampler views.
2010-10-07 15:17:28 +10:00
Andre Maasikas
84457701b0
r600g: fix evergreen interpolation setup
...
interp data is stored in gpr0 so first interp overwrote it
and subsequent ones got wrong values
reserve register 0 so it's not used for attribs.
alternative is to interpolate attrib0 last (reverse, as r600c does)
2010-10-07 07:51:32 +03:00
Chia-I Wu
b2c0ef8b51
st/vega: Fix version check in context creation.
...
This fixes a regression since 4531356817.
2010-10-07 12:15:31 +08:00
Chia-I Wu
da495ee870
targets/egl: Fix linking with libdrm.
2010-10-07 12:06:59 +08:00
Eric Anholt
d3163912c1
i965: Fix gen6 pointsize handling to match pre-gen6.
...
Fixes point-line-no-cull.
Bug #30532
2010-10-06 17:29:29 -07:00
Eric Anholt
b380531fd4
i965: Don't assume that WPOS is always provided on gen6 in the new FS.
...
We sensibly only provide it if the FS asks for it. We could actually
skip WPOS unless the FS needed WPOS.zw, but that's something for
later.
Fixes: glsl-texture2d and probably many others.
2010-10-06 12:13:08 -07:00
Eric Anholt
1fdc8c007e
i965: Add support for gl_FrontFacing on gen6.
...
Fixes glsl1-gl_FrontFacing var (2) with new FS.
2010-10-06 12:13:08 -07:00
Eric Anholt
a760b5b509
i965: Refactor gl_FrontFacing setup out of general variable setup.
2010-10-06 12:13:08 -07:00
Eric Anholt
75270f705f
i965: Gen6's sampler messages are the same as Ironlake.
...
This should fix texturing in the new FS backend.
2010-10-06 12:13:08 -07:00
Eric Anholt
fe6efc25ed
i965: Don't do 1/w multiplication in new FS for gen6
...
Not needed now that we're doing barycentric.
2010-10-06 12:13:08 -07:00
Eric Anholt
5d99b01501
i965: Add some clarification of the WECtrl field.
2010-10-06 12:13:08 -07:00
Eric Anholt
5eeaf3671e
i965: Fix botch in the header_present case in the new FS.
...
I only set it on the color_regions == 0 case, missing the important
case, causing GPU hangs on pre-gen6.
2010-10-06 12:13:08 -07:00
José Fonseca
9fe510ef35
llvmpipe: Cleanup depth-stencil clears.
...
Only cosmetic changes. No actual practical difference.
2010-10-06 19:08:21 +01:00
José Fonseca
33f88b3492
util: Cleanup util_pack_z_stencil and friends.
...
- Handle PIPE_FORMAT_Z32_FLOAT packing correctly.
- In the integer version z shouldn't be passed as a double.
- Make it clear that the integer versions should only be used for masks.
- Make integer type sizes explicit (uint32_t for now, although
uint64_t will be necessary later to encode f32_s8_x24).
2010-10-06 19:08:18 +01:00
José Fonseca
87dd859b34
gallivm: Compute lod as integer whenever possible.
...
More accurate/faster results for PIPE_TEX_MIPFILTER_NEAREST. Less
FP <-> SI conversion overall.
2010-10-06 18:51:25 +01:00
José Fonseca
1c32583581
gallivm: Only apply min/max_lod when necessary.
2010-10-06 18:50:57 +01:00
Keith Whitwell
5849a6ab64
gallivm: don't apply zero lod_bias
2010-10-06 18:49:32 +01:00
José Fonseca
af05f61576
gallivm: Combined ifloor & fract helper.
...
The only way to ensure we don't do redundant FP <-> SI conversions.
2010-10-06 18:47:01 +01:00
José Fonseca
012d57737b
gallivm: Fast implementation of iround(log2(x))
...
Not tested yet, but should be correct.
2010-10-06 18:46:59 +01:00
José Fonseca
4648846bd6
gallivm: Use a faster (and less accurate) log2 in lod computation.
2010-10-06 18:46:29 +01:00
José Fonseca
df3505b193
gallivm: Take the type signedness in consideration in round/ceil/floor.
2010-10-06 18:46:08 +01:00
Eric Anholt
feca660939
i965: Fix up IF/ELSE/ENDIF for gen6.
...
The jump delta is now in the part of the instruction where the
destination fields used to be, and the src args are ignored (or not,
for the new non-predicated IF that we don't use yet).
2010-10-06 10:09:45 -07:00
Eric Anholt
f7cb28fad9
i965: Gen6 no longer has the IFF instruction; always use IF.
2010-10-06 10:09:45 -07:00
Eric Anholt
3c97c00e38
i965: Add back gen6 headerless FB writes to the new FS backend.
...
It's not that hard to detect when we need the header.
2010-10-06 10:09:44 -07:00
Jerome Glisse
3fabd218a0
r600g: fix dirty state handling
...
Avoid having objects end up in a dead list of dirty objects.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-06 13:01:31 -04:00
Eric Anholt
634abbf7b2
i965: Also do constant propagation for the second operand of CMP.
...
We could do the first operand as well by flipping the comparison, but
this covered several CMPs in code I was looking at.
2010-10-06 09:33:26 -07:00
Eric Anholt
dcd0261aff
i965: Enable the constant propagation code.
...
A debug disable had slipped in.
2010-10-06 09:33:26 -07:00
Jerome Glisse
1644bb0f40
r600g: avoid segfault due to uninitialized list pointer
...
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-06 09:41:19 -04:00
José Fonseca
06472ad7e8
llvmpipe: Fix sprite coord perspective interpolation of Q.
...
Q coordinate's coefficients also need to be multiplied by w, otherwise
it will have 1/w, causing problems with TXP.
2010-10-06 11:46:41 +01:00
José Fonseca
e74955eba3
llvmpipe: Fix perspective interpolation for point sprites.
...
Once a fragment is generated with LP_INTERP_PERSPECTIVE set for an input,
it will do a divide by w for that input. Therefore it's not OK to treat
LP_INTERP_PERSPECTIVE as LP_INTERP_LINEAR or vice-versa, even if the
attribute is known to not vary.
A better strategy would be to take the primitive into consideration when
generating the fragment shader key, and therefore avoid the per-fragment
perspective divide.
2010-10-06 11:44:59 +01:00
José Fonseca
446dbb9217
llvmpipe: Dump a few missing shader key flags.
2010-10-06 11:41:08 +01:00
Keith Whitwell
591e1bc34f
llvmpipe: make debug_fs_variant respect variant->nr_samplers
2010-10-06 11:40:30 +01:00
José Fonseca
5661e51c01
retrace: Handle clear_render_target and clear_depth_stencil.
2010-10-06 11:37:49 +01:00
Dave Airlie
9528fc2107
r600g: add evergreen stencil support.
...
this sets the stencil up for evergreen properly.
2010-10-06 09:21:16 +10:00
Jerome Glisse
ea5a74fb58
r600g: userspace fence to avoid kernel call for testing bo busy status
...
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05 17:04:25 -04:00
Brian Paul
3d6eec0a87
st/mesa: replace assertion w/ conditional in framebuffer invalidation
...
https://bugs.freedesktop.org/show_bug.cgi?id=30632
NOTE: this is a candidate for the 7.9 branch.
2010-10-05 14:33:17 -06:00
Jerome Glisse
2cf3199ee3
r600g: simplify block relocation
...
Since flush rework there could be only one relocation per
register in a block.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05 15:23:07 -04:00
Bas Nieuwenhuizen
ac8a1ebe55
r600g: use dirty list to track dirty blocks
...
Got a speed up by tracking the dirty blocks in a separate list instead of looping through all blocks. This version should work with blocks that get their dirty state disabled again, and I added a dirty check during the flush as some blocks were already dirty.
2010-10-05 15:16:06 -04:00
Nicolas Kaiser
71fd35d1ad
nv50: fix always true conditional in shader optimization
2010-10-05 18:53:15 +02:00
Jerome Glisse
585e4098aa
r600g: improve bo flushing
...
Flush read cache before writing registers. Track flushing inside
the same cs and avoid reflushing the same bo if not necessary. Almost
properly force a flush if a bo is rendered to and then used as a texture
in the same cs (missing pipeline flush, dunno if it's needed or not).
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05 10:43:23 -04:00
Jerome Glisse
12d16e5f14
r600g: store reloc information in bo structure
...
Allow fast lookup of relocation information & id, which
was a CPU time consuming operation.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-05 10:42:56 -04:00
Dave Airlie
bf21b7006c
pb: fix numDelayed accounting
...
we weren't decreasing when removing from the list.
2010-10-05 19:08:41 +10:00
Dave Airlie
12be1568d0
r600g: avoid unneeded bo wait
...
if we know the bo has gone not busy, no need to add another bo wait
thanks to Andre (taiu) on irc for pointing this out.
2010-10-05 16:00:48 +10:00
Dave Airlie
d2c06b5037
r600g: drop use_mem_constant.
...
since we plan on using dx10 constant buffers everywhere.
2010-10-05 16:00:23 +10:00
Dave Airlie
46997d4fc2
r600g: drop mman allocator
...
we don't use this since constant buffers are now being used on all gpus.
2010-10-05 15:57:57 +10:00
Dave Airlie
05813ad5f4
r600g: add bo busy backoff.
...
When we go to do a lot of bos in one draw like constant bufs we need
to avoid bouncing off the busy ioctl, this mitigates by backing off
on busy bos for a short amount of time.
2010-10-05 15:51:38 +10:00
Dave Airlie
49866c8f34
pb: don't keep checking buffers after first busy
...
If we assume busy buffers are added to the list in order, it's unlikely
we'd find one after the first busy one that isn't busy.
2010-10-05 15:50:58 +10:00
Dave Airlie
3c38e4f138
r600g: add bo fenced list.
...
this just keeps a list of bos submitted together, and uses them to decide
bo busy state for the whole group.
2010-10-05 15:35:52 +10:00
Brian Paul
fb5e6f88fc
swrast: fix choose_depth_texture_level() to respect mipmap filtering state
...
NOTE: this is a candidate for the 7.9 branch.
2010-10-04 19:59:46 -06:00
Marek Olšák
d0408cf55d
r300g: fix microtiling for 16-bits-per-channel formats
...
These texture formats (like R16G16B16A16_UNORM) were untested until now
because st/mesa doesn't use them. I am testing this with a hacked st/mesa
here.
2010-10-05 02:57:00 +02:00
Eric Anholt
ea909be58d
i965: Add support for gen6 FB writes to the new FS.
...
This uses message headers for now, since we'll need it for MRT. We
can cut out the header later.
2010-10-04 16:08:17 -07:00
Eric Anholt
739aec39bd
i965: In disasm, gen6 fb writes don't put msg reg # in destreg_conditionalmod.
...
It instead sensibly appears in the src0 slot.
2010-10-04 16:08:17 -07:00
Eric Anholt
3bf8774e9c
i965: Add initial folding of constants into operand immediate slots.
...
We could try to detect this in expression handling and do it
proactively there, but it seems like less logic to do it in one
optional pass at the end.
2010-10-04 16:08:17 -07:00
Eric Anholt
e27c88d8e6
i965: Add trivial dead code elimination in the new FS backend.
...
The glsl core should be handling most dead code issues for us, but we
generate some things in codegen that may not get used, like the 1/w
value or pixel deltas. It seems a lot easier this way than trying to
work out up front whether we're going to use those values or not.
2010-10-04 16:08:17 -07:00
Eric Anholt
9faf64bc32
i965: Be more conservative on live interval calculation.
...
This also means that our intervals now highlight dead code.
2010-10-04 16:08:17 -07:00
Vinson Lee
a0a8e24385
r600g: Fix SCons build.
2010-10-04 15:56:55 -07:00
Jerome Glisse
b25c52201b
r600g: remove dead label & fix indentation
...
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-04 17:25:19 -04:00
Jerome Glisse
243d6ea609
r600g: rename radeon_ws_bo to r600_bo
...
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-04 17:25:19 -04:00
Jerome Glisse
674452faf9
r600g: use r600_bo for relocation argument, simplify code
...
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-04 17:25:19 -04:00
Jerome Glisse
d22a1247d8
r600g: allow r600_bo to be a sub allocation of a big bo
...
Add bo offset everywhere needed if r600_bo is ever a sub bo
of a bigger bo.
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-04 17:25:19 -04:00
Jerome Glisse
294c9fce1b
r600g: rename radeon_ws_bo to r600_bo
...
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-04 17:25:19 -04:00
delphi
25bb05fef0
draw: added userclip planes and updated variant_key
2010-10-04 22:08:16 +01:00
Krzysztof Smiechowicz
68c7994ab5
nvfx: Pair os_malloc_aligned() with os_free_aligned().
...
From AROS.
2010-10-04 11:43:29 -07:00
Dave Airlie
3d45d57044
r600g: TODO domain management
...
no wonder it was slow, the code is deliberately forcing stuff into GTT,
we used to have domain management but it seems to have disappeared.
2010-10-04 16:41:49 +10:00
Dave Airlie
1c2b3cb1e9
r600g: fix warning in bo_map function
2010-10-04 16:26:46 +10:00
Dave Airlie
6dc051557d
r600g: the code to check whether a new vertex shader is needed was wrong
...
this code was memcmp'ing two structs, but refcounting one of them afterwards,
so any subsequent memcmp was never going to work.
again this stops unnecessary uploads of the vertex program.
2010-10-04 16:24:59 +10:00
Dave Airlie
92aba9c1f5
r600g: break out of search for reloc bo after finding it.
...
this function was taking quite a lot of pointless CPU.
2010-10-04 15:58:39 +10:00
Eric Anholt
14bf92ba19
i965: Fix glean/texSwizzle regression in previous commit.
...
Easy enough patch, who needs a full test run. Oh, that's right. Me.
2010-10-03 00:24:09 -07:00
Eric Anholt
a7fa00dfc5
i965: Set up swizzling of shadow compare results for GL_DEPTH_TEXTURE_MODE.
...
The brw_wm_surface_state.c handling of GL_DEPTH_TEXTURE_MODE doesn't
apply to shadow compares, which always return an intensity value. The
texture swizzles can do the job for us.
Fixes:
glsl1-shadow2D(): 1
glsl1-shadow2D(): 3
2010-10-02 23:48:14 -07:00
Eric Anholt
4fb0c92c69
i965: Add support for EXT_texture_swizzle to the new FS backend.
2010-10-02 23:44:44 -07:00
Marek Olšák
8f7177e0de
r300g: add support for L8A8 colorbuffers
...
Blending with DST_ALPHA is undefined. SRC_ALPHA works, though.
I bet some other formats have similar limitations too.
2010-10-02 23:19:38 +02:00
Marek Olšák
e75bce026c
r300g: add support for R8G8 colorbuffers
...
The hw swizzles have been obtained by a brute force approach,
and only C0 and C2 are stored in UV88, the other channels are
ignored.
R16G16 is going to be a lot trickier.
2010-10-02 21:42:22 +02:00
Dave Airlie
71a079fb4e
mesa/st: initial attempt at RG support for gallium drivers
...
passes all piglit RG tests with softpipe.
2010-10-02 17:03:15 +10:00
Kenneth Graunke
f317713432
i965: Fix incorrect batchbuffer size in gen6 clip state command.
...
FORCE_ZERO_RTAINDEX should be in the fourth (and final) dword.
2010-10-01 21:53:28 -07:00
Eric Anholt
64a9fc3fc1
i965: Don't try to emit code if we failed register allocation.
2010-10-01 17:19:04 -07:00
Eric Anholt
6397addd61
i965: Fix off-by-ones in handling the last members of register classes.
...
Luckily, one of them would result in failing out register allocation
when the other bugs were encountered. Applies to
glsl-fs-vec4-indexing-temp-dst-in-nested-loop-combined, which still
fails register allocation, but now legitimately.
2010-10-01 17:19:04 -07:00
Eric Anholt
afb64311e3
i965: Add a sanity check for register allocation sizes.
2010-10-01 17:19:03 -07:00
Eric Anholt
5ee0941316
i965: When producing a single channel swizzle, don't make a temporary.
...
This quickly cuts 8% of the instructions in my glsl demo.
2010-10-01 17:19:03 -07:00
Eric Anholt
a0799725f5
i965: Restore the forcing of aligned pairs for delta_xy on chips with PLN.
...
By doing so using the register allocator now, we avoid wasting a
register to make the alignment happen.
2010-10-01 17:19:03 -07:00
Alex Deucher
fb0eed84ca
r600c: fix segfault in evergreen stencil code
...
Fixes:
https://bugs.freedesktop.org/show_bug.cgi?id=30551
2010-10-01 20:14:25 -04:00
Vinson Lee
7af2a22d1f
r600g: Remove unnecessary headers.
2010-10-01 17:06:33 -07:00
Vinson Lee
20846a8ce1
r600g: Remove unused variable.
...
Fixes this GCC warning.
r600_shader.c: In function 'tgsi_split_literal_constant':
r600_shader.c:818: warning: unused variable 'index'
2010-10-01 17:02:01 -07:00
Ian Romanick
1ca6cbec1b
rgtc: Detect RGTC formats as color formats and as compressed formats
2010-10-01 16:55:35 -07:00
Ian Romanick
5ebbabc5cc
mesa: Trivial correction to comment
2010-10-01 16:55:35 -07:00
Ian Romanick
69c78bf2c2
mesa: Fix misplaced #endif
...
If FEATURE_texture_s3tc is not defined, FXT1 formats would erroneously
fall through to the MESA_FORMAT_RGBA_FLOAT32 case.
2010-10-01 16:55:35 -07:00
Ian Romanick
7c6147014a
ARB_texture_rg: Add GL_COMPRESSED_{RED,RG} cases in _mesa_is_color_format
2010-10-01 16:55:35 -07:00
Ian Romanick
e2a054b70c
mesa: Add ARB_texture_compression_rgtc as an alias for EXT_texture_compression_rgtc
...
Change the name in the extension tracking structure to ARB (from EXT).
2010-10-01 16:55:35 -07:00
Vinson Lee
e5fd15199d
savage: Remove unnecessary header.
2010-10-01 16:57:19 -07:00
Vinson Lee
841503fddf
glsl: Remove unnecessary header.
2010-10-01 16:27:58 -07:00
Ian Romanick
c77cd9ec10
i965: Enable GL_ARB_texture_rg
2010-10-01 15:49:13 -07:00
Ian Romanick
9ef390dc14
mesa: Enable GL_ARB_texture_rg in software paths
2010-10-01 15:49:13 -07:00
Ian Romanick
421f4d8dc1
ARB_texture_rg: Allow RED and RG textures as FBO color buffer attachments
2010-10-01 15:49:13 -07:00
Ian Romanick
5d1387b2da
ARB_texture_rg: Add R8, R16, RG88, and RG1616 internal formats
2010-10-01 15:49:13 -07:00
Ian Romanick
214a33f610
ARB_texture_rg: Handle RED and RG the same as RGB for tex env
2010-10-01 15:49:13 -07:00
Ian Romanick
cd5dea6401
ARB_texture_rg: Add GL_RED as a valid GL_DEPTH_TEXTURE_MODE
2010-10-01 15:49:13 -07:00
Ian Romanick
cc6f13def5
ARB_texture_rg: Add GL_TEXTURE_{RED,GREEN}_SIZE query support
2010-10-01 15:49:12 -07:00
Ian Romanick
3ebbc176f9
ARB_texture_rg: Correct some errors in RED / RG internal format handling
...
Fixes several problems:
The half-float, float, and integer internal formats depend on
ARB_texture_rg and other extensions.
RG_INTEGER is not a valid internal format.
Generic compressed formats depend on ARB_texture_rg, not
EXT_texture_compression_rgtc.
Use GL_RED instead of GL_R.
2010-10-01 15:49:12 -07:00
Ian Romanick
bb45ab0a96
ARB_texture_rg: Add GLX protocol support
2010-10-01 15:49:12 -07:00
Nicolas Kaiser
96efa8a923
i965g: use Elements macro instead of manual sizeofs
...
Signed-off-by: Nicolas Kaiser <nikai@nikai.net>
Signed-off-by: Brian Paul <brianp@vmware.com>
2010-10-01 16:41:13 -06:00
Eric Anholt
e9bcc83289
i965: Fix up copy'n'pasteo from moving coordinate setup around for gen4.
2010-10-01 14:09:00 -07:00
Eric Anholt
bfd9715c3c
i965: Add real support for pre-gen5 texture sampling to the new FS.
...
Fixes 36 testcases, including glsl-fs-shadow2d*-bias which fail on the
Mesa IR backend.
2010-10-01 14:02:48 -07:00
richard
92eb07a281
evergreen : fix z format setting, enable stencil.
2010-10-01 16:10:02 -04:00
Eric Anholt
8f63a44636
i965: Pre-gen6, map VS outputs (not FS inputs) to URB setup in the new FS.
...
We should fix the SF to actually give us just the data we need, but
this fixes regressions in the new FS until then.
Fixes:
glsl-kwin-blur
glsl-routing
2010-10-01 12:21:51 -07:00
Eric Anholt
ff5ce9289b
i965: Also increment attribute location when skipping unused slots.
...
Fixes glsl1-texcoord varying.
2010-10-01 12:19:21 -07:00
Eric Anholt
354c40a624
i965: Fix the gen6 jump size for BREAK/CONT in new FS.
...
Since gen5, jumps are in increments of 64 bits instead of increments
of 128-bit instructions.
2010-10-01 12:19:21 -07:00
Eric Anholt
efc4a6f790
i965: Add gen6 attribute interpolation to new FS backend.
...
Untested, since my hardware is not booting at the moment.
2010-10-01 12:19:21 -07:00
Jerome Glisse
29b491bd03
r600g: indentation fixes
...
Signed-off-by: Jerome Glisse <jglisse@redhat.com>
2010-10-01 10:26:58 -04:00