Commit Graph

156 Commits

Author SHA1 Message Date
Roland Scheidegger 7b89fcec41 llvmpipe: improve rasterization discard logic
This unifies the explicit rasterization discard as well as the implicit
rasterization disabled logic (which we need for another state tracker),
which really should do the exact same thing.
We'll now toss out the prims early on in setup with (implicit or
explicit) discard, rather than do setup and binning with them, which
was entirely pointless.
(We should eventually get rid of implicit discard, which should also
enable us to discard stuff already in draw, hence draw would be
able to skip the pointless clip and fallback stages in this case.)
We still need separate logic for only null ps - this is not the same
as rasterization discard. But simplify the logic there and don't count
primitives simply when there's an empty fs, regardless of depth/stencil
tests, which seems perfectly acceptable by d3d10.
While here, also fix statistics for primitives if face culling is
enabled.
No piglit changes.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2018-05-23 04:23:32 +02:00
Roland Scheidegger c7688d2de5 llvmpipe:fix using 32bit rasterization mistakenly, causing overflows
We use the bounding box (triangle extents) to figure out if 32bit rasterization
could potentially overflow. However, we used the bounding box which already got
rounded up to 0 for negative coords for this, which is incorrect, leading to
overflows and hence bogus rendering in some of our private use.

It might be possible to simplify this somehow (we're now using 3 different
boxes for binning) but I don't quite see how.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-23 19:39:29 +02:00
Roland Scheidegger 672d245ffe llvmpipe: fill in debug vertex info for tri rasterization
This is pretty useful for debugging rasterization issues, so turn it on
based on DEBUG (the actual existence of the fields is also conditionalized
on DEBUG, lines fill it out the same too).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2017-06-23 19:39:29 +02:00
Samuel Pitoiset fbe2ff7740 llvmpipe: remove unused subpixel_snap() and fixed_to_float()
Fixes the following Clang warnings.

lp_setup_tri.c:55:1: warning: unused function 'subpixel_snap' [-Wunused-function]
subpixel_snap(float a)
^
lp_setup_tri.c:61:1: warning: unused function 'fixed_to_float' [-Wunused-function]
fixed_to_float(int a)
^

v2: - do not remove subpixel_snap() (use !PIPE_ARCH_SSE instead)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2017-04-13 10:06:05 +02:00
Roland Scheidegger bb2c5e657b llvmpipe: fix lp_rast_plane alignment on 32bit
Some rasterization code relies (for sse) on the first and third planes
(but not the second for now) being 128bit aligned, and we didn't get that
on 32bit - I mistakenly thought the 64bit number in the struct would get
the thing aligned to 64bit even on 32bit archs.
Stephane Marchesin really figured this out.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>

CC: <mesa-stable@lists.freedesktop.org>
2016-03-15 19:42:15 +01:00
Brian Paul 71dcc067a5 llvmpipe: add a few const qualifiers
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-02-19 08:51:51 -07:00
Roland Scheidegger 848a023c05 llvmpipe: use scissor_planes_needed helper function
So it doesn't get out of sync in multiple places.
2016-02-03 01:25:45 +01:00
Roland Scheidegger 99bd96abbb llvmpipe: drop scissor planes early if the tri is fully inside them
If the tri is fully inside a scissor edge (or rather, we just use the
bounding box of the tri for the comparison), then we can drop these
additional scissor "planes" early. We do not even need to allocate
space for them in the tri.

The math actually appears to be slightly iffy due to bounding boxes
being rounded, but it doesn't matter in the end.

Those scissor rects are costly - the 4 planes from the scissor are
already more expensive to calculate than the 3 planes from the tri itself,
and it also prevents us from using the specialized raster code for small
tris.

This helps openarena performance by about 8% or so. Of course, it helps
there that while openarena often enables scissoring (and even moves the
scissor rect around) I have not seen a single tri actually hit the
scissor rect, ever.

v2: drop individual scissor edges, and do it earlier, not even allocating
space for them.
v3: help the compiler a bit with simpler code, suggested by Brian.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-02 05:58:19 +01:00
Roland Scheidegger 9d2a34e105 llvmpipe: minor cleanup of sse2 for calc_fixed_position
Just slightly simpler assembly.

Reviewed-by: Brian Paul <brianp@vmware.com>
2016-02-02 05:58:19 +01:00
Oded Gabbay 529aa8249a llvmpipe: fix arguments order given to vec_andc
This patch fixes a classic "confuse the enemy" bug.

_mm_andnot_si128 (SSE) and vec_andc (VMX) do the same operation, but the
arguments are opposite.

_mm_andnot_si128 performs "r = (~a) & b" while
vec_andc performs "r = a & (~b)"

To make sure this error won't return in another place, I added a wrapper
function, vec_andnot_si128, in u_pwr8.h, which makes the swap inside.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-01-17 21:07:27 +02:00
Roland Scheidegger 38cdcb000d llvmpipe: (trivial) use cast wrapper for __m128d to __m128 casts
some compiler was unhappy.
2016-01-13 04:48:41 +01:00
Roland Scheidegger 16530fdc82 llvmpipe: scale up bounding box planes to subpixel precision
Otherwise some planes we get in rasterization have subpixel precision, others
not. Doesn't matter so far, but will soon. (OpenGL actually supports viewports
with subpixel accuracy, so could even do bounding box calcs with that).

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-13 03:34:59 +01:00
Roland Scheidegger 0298f5aca7 llvmpipe: add sse code for fixed position calculation
This is quite a few less instructions, albeit still do the 2 64bit muls
with scalar c code (they'd need way more shuffles, plus fixup for the signed
mul so it totally doesn't seem worth it - x86 can do 32x32->64bit signed
scalar muls natively just fine after all (even on 32bit).

(This still doesn't have a very measurable performance impact in reality,
although profiler seems to say time spent in setup indeed has gone down by
10% or so overall. Maybe good for a 3% or so improvement in openarena.)

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2016-01-13 03:34:09 +01:00
Roland Scheidegger 2923c7a0ed llvmpipe: do 64bit plane calculations in the sse path
The sse path was pretty much disabled for practical purposes because the
largest allowed fb size was 128x128. So, adapt it for 64bit plane calculations.
This is actually not that difficult, though a problem is that we can't do
a signed 32x32->64bit mul, only unsigned, so need to fix that up. Overall,
the code still looks reasonable, though it's not like changes there in
setup really make much of a difference in the end...

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-01-08 00:34:14 +01:00
Roland Scheidegger fad283ba9e llvmpipe: don't store eo as 64bit int
eo, just like dcdx and dcdy, cannot overflow 32bit.
Store it as unsigned though just in case (it cannot be negative, but
in theory twice as big as dcdx or dcdy so this gives it one more bit).
This doesn't really change anything, albeit it might help minimally on
32bit archs.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2016-01-08 00:34:14 +01:00
Roland Scheidegger b61b9a377e llvmpipe: use aligned data for the assembly program in setup
Back in the day (before 24678700ed) the values
were not actually in a struct but even then I can't see why we didn't simply
align the values. Especially since it's trivial to do so.
(Not that it actually matters since the code is pretty much unused for now.)

Reviewed-by: Oded Gabbay <oded.gabbay@gmail.com>
2016-01-08 00:34:13 +01:00
Oded Gabbay 3bbe16ea79 llvmpipe: Optimize do_triangle_ccw for POWER8
This patch converts the SSE optimization done in do_triangle_ccw to
VMX/VSX.

I measured the results on POWER8 machine with 32 cores at 3.4GHz and
16GB of RAM.

                      FPS/Score
  Name            Before     After    Delta
------------------------------------------------
glmark2 (score)   136.6      139.8    2.34%
openarena         16.14      16.35    1.30%
xonotic           4.655      4.707    1.11%

v2:

- Convert loads to use aligned loads
- Make sure code is build only on POWER8 LE machine

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2016-01-06 14:54:16 +02:00
Edward O'Callaghan 13eb5f596b gallium/drivers: Sanitize NULL checks into canonical form
Use NULL tests of the form `if (ptr)' or `if (!ptr)'.
They do not depend on the definition of the symbol NULL.
Further, they provide the opportunity for the accidental
assignment, are clear and succinct.

Signed-off-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-12-06 17:10:23 +01:00
Roland Scheidegger ddaf8d7b10 llvmpipe: use provoking vertex for layer/viewport
d3d10 actually requires using provoking (first) vertex. GL is happy with
any vertex (as long as we say it's undefined in the corresponding queries).
Up to now we actually used vertex 0 for viewport index, and vertex 1 for
layer (for tris), which really didn't make sense (probably a typo). Also,$
since we reorder vertices of clockwise triangle, that actually meant we used
a different vertex depending if the traingle was cw or ccw (still ok by gl).
However, it should be consistent with what draw (clip) does, and using
provoking vertex seems like the sensible choice (draw clip will be fixed
next as it is totally broken there).
While here, also use the correct viewport always even when not needed
in setup (we pass it down to jit fragment shader it might be needed there
for getting correct near/far depth values).

No piglit changes.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2015-12-04 03:42:19 +01:00
Ilia Mirkin a2a1a5805f gallium: replace INLINE with inline
Generated by running:
git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g'
git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g'
git checkout src/gallium/state_trackers/clover/Doxyfile

and manual edits to
src/gallium/include/pipe/p_compiler.h
src/gallium/README.portability

to remove mentions of the inline define.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-07-21 17:52:16 -04:00
José Fonseca a0ddc54777 draw,gallivm,llvmpipe: Avoid implicit casts of 32-bit shifts to 64-bits.
Addresses MSVC warnings "result of 32-bit shift implicitly converted to
64 bits (was 64-bit shift intended?)", which can often be symptom of
bugs, but in these cases were all benign.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-11-26 20:25:12 +00:00
José Fonseca 8771285054 s/Tungsten Graphics/VMware/
Tungsten Graphics Inc. was acquired by VMware Inc. in 2008.  Leaving the
old copyright name is creating unnecessary confusion, hence this change.

This was the sed script I used:

    $ cat tg2vmw.sed
    # Run as:
    #
    #   git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed
    #

    # Rename copyrights
    s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g
    /Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./
    s/TUNGSTEN GRAPHICS/VMWARE/g

    # Rename emails
    s/alanh@tungstengraphics.com/alanh@vmware.com/
    s/jens@tungstengraphics.com/jowen@vmware.com/g
    s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/
    s/jrfonseca\?@tungstengraphics.com/jfonseca@vmware.com/g
    s/keithw\?@tungstengraphics.com/keithw@vmware.com/g
    s/michel@tungstengraphics.com/daenzer@vmware.com/g
    s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/
    s/zack@tungstengraphics.com/zackr@vmware.com/

    # Remove dead links
    s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g

    # C string src/gallium/state_trackers/vega/api_misc.c
    s/"Tungsten Graphics, Inc"/"VMware, Inc"/

Reviewed-by: Brian Paul <brianp@vmware.com>
2014-01-17 20:00:32 +00:00
José Fonseca 37de6b0682 llvmpipe: Respect bottom_edge_rule when computing the rasterization bounding boxes.
This was inadvertently forgotten when replacing gl_rasterization_rules
with lower_left_origin and half_pixel_center (commit
2737abb44e).

This makes a difference when lower_left_origin != half_pixel_center, e.g,
D3D10.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Zack Rusin <zackr@vmware.com>
2014-01-08 12:18:17 +00:00
Roland Scheidegger bfcf1ba1c4 llvmpipe: (trivial) get rid of triangle subdivision code
This code was always problematic, and with 64bit rasterization we no longer
need it at all.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-12-14 17:11:03 +01:00
Matthew McClure 0319ea9ff6 llvmpipe: clamp fragment shader depth write to the current viewport depth range.
With this patch, generate_fs_loop will clamp any fragment shader depth writes
to the viewport's min and max depth values. Viewport selection is determined
by the geometry shader output for the viewport array index. If no index is
specified, then the default viewport index is zero. Semantics for this path
can be found in draw_clamp_viewport_idx and lp_clamp_viewport_idx.

lp_jit_viewport was created to store viewport information visible to JIT code,
and is validated when the LP_NEW_VIEWPORT dirty flag is set.

lp_rast_shader_inputs is responsible for passing the viewport_index through
the rasterizer stage to fragment stage (via lp_jit_thread_data).

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-12-09 12:57:02 +00:00
Zack Rusin 0510ec67e2 llvmpipe: support 8bit subpixel precision
8 bit precision is required by d3d10 but unfortunately
requires 64 bit rasterizer. This commit implements
64 bit rasterization with full support for 8bit subpixel
precision. It's a combination of all individual commits
from the llvmpipe-rast-64 branch.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-11-25 13:05:03 -05:00
Zack Rusin edde6c77bd llvmpipe: abstract the code to set number of subpixel bits
As we're moving towards expanding the number of subpixel
bits and the width of the variables used in the computations
we need to make this code a bit more centralized.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-10-09 18:30:31 -04:00
Zack Rusin 60c448faea llvmpipe: count c_primitives before discarding null prims
We need to count the clipper primitives before the rasterizer
discards one it considers to be null.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-25 19:41:02 -04:00
Zack Rusin 71ecc2cf71 Revert "llvmpipe: increase number of subpixel bits to eight"
This reverts commit 755c11dc5e.
We agreed that this is band-aid that's not very useful and
the proper solution is to rewrite the rasterization algo
so that it operates on 64 bit values.

Signed-off-by: Zack Rusin <zackr@vmware.com>
2013-09-24 15:10:02 -04:00
Zack Rusin e5ec5aef2b llvmpipe: align the array used for subdivived vertices
When subdiving a triangle we're using a temporary array to store
the new coordinates for the subdivided triangles. Unfortunately
the array used for that was not aligned properly causing
random crashes in the llvm jit code which was trying to load
vectors from it.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-23 18:10:51 -04:00
Zack Rusin 755c11dc5e llvmpipe: increase number of subpixel bits to eight
Unfortunately d3d10 requires a lot higher precision (e.g.
wgf11clipping tests for it). The smallest number of precision
bits with which it passes is 8. That means that we need to
decrease the maximum length of an edge that we can handle without
subdivision by 4 bits. Abstracted the code a bit to make it easier
to change once to switch to 64bit rasterization.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-09-23 14:53:07 -04:00
Zack Rusin 27cedd8aec llvmpipe: fix pipeline statistics with a null ps
If the fragment shader is null then pixel shader invocations have
to be equal to zero. And if we're running a null ps then clipper
invocations and primitives should be equal to zero but only
if both stancil and depth testing are disabled.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-08-14 18:23:36 -04:00
Roland Scheidegger 59b8689d37 llvmpipe: fix a bug in opaque optimization
If there are queries active the opaque optimization reseting the bin needs to
be disabled.
(Not really tested since the bug was discovered by code inspection not
an actual test failure.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-27 19:06:40 +02:00
Roland Scheidegger 2e4da1f594 llvmpipe: add support for nested / overlapping queries
OpenGL doesn't support this but d3d10 does.
It is a bit of a pain as it is necessary to keep track of queries
still active at the end of a scene, which is also why I cheat a bit
and limit the amount of simultaneously active queries to (arbitrary)
16 (simplifies things because don't have to deal with a real list
that way). I can't think of a reason why you'd really want large
numbers of overlapping/nested queries so it is hopefully fine.
(This only affects queries which need to be binned.)

v2: don't copy remainder of array when deleting an entry simply replace
the deleted entry with the last one (order doesn't matter).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
Roland Scheidegger 0820342880 llvmpipe: rework query logic
Previously lp_rast_begin_query commands were always inserted into each bin,
and re-issued if the scene was restarted, while lp_rast_end_query commands
were executed for each still active query at the end of tile rasterization.
Also, the ps_invocations and vis_counter were set to zero when the respective
command was encountered.
This however cannot work for multiple queries of the same type (note that
occlusion counter and occlusion predicate while different type were also
affected).
So, change the logic to always set the ps_invocations and vis_counter to zero
at the start of tile rasterization, and then use "start" and "end" per-thread
query values when encountering the begin/end query commands instead, which
should work for multiple queries of the same type. This also means queries do
not have to be reissued in a new scene, however they still need to be finished
at end of tile rasterization, so a list of queries still active at the end of
a scene needs to be maintained.
Also while here don't bin the queries which don't do anything in rasterization.
(This change does not actually handle multiple queries of the same type yet,
as the list of active queries is just a simple fixed array and setup can still
only have one query active per type.)

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-26 23:17:53 +02:00
Roland Scheidegger dc5dc4fd94 llvmpipe: handle more queries
Handle PIPE_QUERY_GPU_FINISHED and PIPE_QUERY_TIMESTAMP_DISJOINT, and
also fill out the ps_invocations and c_primitives from the
PIPE_QUERY_PIPELINE_STATISTICS (the others in there should already
be handled). Note that ps_invocations isn't pixel exact, just 16 pixel
exact but I guess it's better than nothing.
Doesn't really seem to work correctly but there's probably bugs elsewhere.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-19 23:47:36 +02:00
Roland Scheidegger d8146f240e llvmpipe: add support for layered rendering
Mostly just make sure the layer parameter gets passed through to the right
places (and get clamped, can do this at setup time), fix up clears to
clear all layers and disable opaque optimization. Luckily don't need to
touch the jitted code.
(Clears invoked via pipe's clear_render_target method will not work however
since the pipe_util_clear function used for it doesn't handle clearing
multiple layers yet.)

v2: per Brian's suggestion, prettify var initialization and add some comments,
add assertion for impossible layer specification for surface.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-06-07 21:15:01 +02:00
Zack Rusin c88ce3480c llvmpipe: clamp scissors to be between 0 and max
We need to clamp to make sure invalid shader doesn't crash our
driver. The spec says to return 0-th index for everything that's
out of bounds.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca<jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2013-05-25 09:49:20 -04:00
Zack Rusin 97b8ae429e llvmpipe: implement support for multiple viewports
Largely related to making sure the rasterizer can correctly
pick out the correct scissor box for the current viewport.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca<jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2013-05-25 09:49:20 -04:00
José Fonseca 2737abb44e gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.
Squashed commit of the following:

commit 04c5fa2cbb8e89d6f2fa5a75af1cca03b1f6b852
Author: José Fonseca <jfonseca@vmware.com>
Date:   Tue Apr 23 17:37:18 2013 +0100

    gallium: s/lower_left_origin/bottom_edge_rule/

commit 4dff4f64fa83b9737def136fffd161d55e4f1722
Author: José Fonseca <jfonseca@vmware.com>
Date:   Tue Apr 23 17:35:04 2013 +0100

    gallium: Move diagram to docs.

commit 442a63012c8c3c3797f45e03f2ca20ad5f399832
Author: James Benton <jbenton@vmware.com>
Date:   Fri May 11 17:50:55 2012 +0100

    gallium: Replace gl_rasterization_rules with lower_left_origin and half_pixel_center.

    This change is necessary to achieve correct results when using OpenGL
    FBOs.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-23 19:42:47 +01:00
Brian Paul 1165ff1af1 llvmpipe: use triangle subdivision to avoid fixed-point overflow issues
If we're drawing to a surface that's 2048 x 2048 pixels or larger there's
danger of fixed-point overflow in the triangle rasterization code.  That
leads to various rendering glitches.

Rather than implement some intricate changes to the rasterization code,
simply subdivide triangles into smaller subtriangles to avoid the issue.
Only do this when the drawing surface is larger than 2048 by 2048.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-01 08:40:35 -06:00
Brian Paul e90c56bc4e llvmpipe: add 'f' suffix to 1.0 in fixed_to_float() 2013-03-28 17:17:26 -06:00
Olivier Galibert 1ec421823b llvmpipe: Don't mess with the provoking vertex when inverting a triangle.
Fixes a bunch of piglit tests related to flat interpolation of floats.

Signed-off-by: Olivier Galibert <galibert@pobox.com>
Signed-off-by: José Fonseca <jose.r.fonseca@gmail.com>
2012-05-18 00:07:18 +01:00
James Benton 24678700ed llvmpipe: Calculate fixed point coordinates for triangle setup earlier.
This allows us to calculate the triangle's area using fixed point,
previously it was cacluated in floating point space. It was possible
that a triangle which had negative area in floating point space had
a positive area in fixed point space.

Fixes fdo 40920.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2012-05-14 16:07:49 +01:00
James Benton 11aa82cc0b llvmpipe: Fix triangle bounding box calculation to be correctly inclusive or exclusive
Tested with custom rasterisation test tool added to piglit suite, reduced errors

Signed-off-by: José Fonseca <jfonseca@vmware.com>
2012-05-11 13:21:23 +01:00
José Fonseca e2072a1046 llvmpipe: Fix the 4 planes (lines) case properly.
The previous change was not effective for lines, because there is no
4 planes 4x4 block rasterization path: it is handled by the 16x16 block
case too, and the 16x16 block was not being budged as it should.

This fixes assertion failures on line rasterization.
2011-10-05 18:07:05 +01:00
José Fonseca c620087432 llvmpipe: Ensure the 16x16 special rasterization path does not touch outside the tile.
llvmpipe has a few special rasterization paths for triangles contained in
16x16 blocks, but it allows the 16x16 block to be aligned only to a 4x4
grid.

Some 16x16 blocks could actually intersect the tile
if the triangle is 16 pixels in one dimension but 4 in the other, causing
a buffer overflow.

The fix consists of budging the 16x16 blocks back inside the tile.
2011-10-05 18:07:05 +01:00
Brian Paul c8f1687ce7 llvmpipe: added some debug assertions, but disabled 2010-11-04 18:21:45 -06:00
Keith Whitwell 98445b4307 llvmpipe: avoid generating tri_16 for tris which extend past tile bounds
Don't trim triangle bounding box to scissor/draw-region until after
the logic for emitting tri_16.  Don't generate tri_16 commands for
triangles with untrimmed bounding boxes outside the current tile.

This is important as the tri-16 itself can extend past tile bounds and
we don't want to add code to it to check against tile bounds (slow) or
restrict it to locations within a tile (pessimistic).
2010-11-02 16:48:10 +00:00
Keith Whitwell 9da17fed2e llvmpipe: remove unused arg from jit_setup_tri function 2010-10-17 19:23:40 -07:00