Commit Graph

56069 Commits

Author SHA1 Message Date
Zack Rusin e6fc635351 draw/gs: Fix memory corruption with multiple primitives
We were flushing with incorrect number of primitives. TGSI exec
can only work with a single primitive at a time. Plus the fetching
with multiple primitives on llvm paths wasn't copying the last
element.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Zack Rusin f313b0c850 gallivm: cleanup the gs interface
Instead of void pointers use a base interface.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-04-03 10:16:25 -07:00
Brian Paul ac114c6824 svga: add new memory-used HUD query
To track the amount of memory used by all pipe_resources (textures
and buffers).

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 11:02:47 -06:00
Brian Paul a69efa9482 util: add new util_resource_size() function in u_resource.[ch]
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 11:02:47 -06:00
Brian Paul a3cccdec90 util: move functions from u_resource.c to u_transfer.c
The functions are prototyped in u_transfer.h and are related to the
other functions in u_transfer.c.

The next patch will re-use the u_resource.c file for new code.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 11:02:47 -06:00
Vincent Lejeune 159d934066 r600g/llvm: Do not override llvm provided stack_size 2013-04-03 18:39:49 +02:00
Vincent Lejeune 097a6ecdfe r600g/llvm: Do not change cf_alu inst when adding alus 2013-04-03 18:22:40 +02:00
Marek Olšák ff01e0db0e radeonsi: add more cases for copying unsupported formats to resource_copy_region
Ported from r600g commit:

8891b2f9c9

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Christian König <christian.koenig@amd.com>

NOTE: This is a candidate for the 9.1 branch.
2013-04-03 10:58:33 -04:00
Brian Paul 3838edaf5d svga: add HUD queries for number of draw calls, number of fallbacks
The fallbacks count is the number of drawing calls that use a "draw"
module fallback, such as polygon stipple.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 09:56:08 -06:00
Brian Paul 49ed1f3cb3 svga: refactor occlusion query code
This is in preparation for adding new query types for the HUD.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-03 09:56:07 -06:00
Brian Paul a9ae7e9c28 gallium/hud: try L8 texture for font if I8 format isn't supported 2013-04-03 09:44:57 -06:00
Brian Paul 0289ebaa0f svga: add case for PIPE_CAP_QUERY_PIPELINE_STATISTICS 2013-04-03 08:19:44 -06:00
Brian Paul 7e28debb6f st/mesa: rewrite comment in st_manager.c 2013-04-03 08:16:36 -06:00
Christoph Bumiller 80eef069f0 nv50,nvc0: remove MS resolve formats hack
Mesa now allows BlitFramebuffer resolve between RGBA and BGRA.
2013-04-03 13:19:15 +02:00
Christoph Bumiller 4de70bf43c nvc0: fix 128 bit compressed storage type selection 2013-04-03 12:54:44 +02:00
Christoph Bumiller 8e1dd58a7e nvc0: place staging textures in GART and map them directly 2013-04-03 12:54:44 +02:00
Christoph Bumiller ba9b0b682f nv50: account for pesky prefetch in size calculation of linear textures 2013-04-03 12:54:44 +02:00
Christoph Bumiller f0a0d59f0f nvc0: honour scaled coordiantes setting for linear textures 2013-04-03 12:54:44 +02:00
Christoph Bumiller d801545964 nvc0: fix for 2d engine R source formats writing RRR1 and not R001 2013-04-03 12:54:43 +02:00
Christoph Bumiller 6417d56c19 nv50,nvc0: disable DEPTH_RANGE_NEAR/FAR clipping during blit
We send position.z == 0, DEPTH_RANGE may be some arbitrary range
not including 0 (for exmaple in piglit's hiz tests).
2013-04-03 12:54:43 +02:00
Christoph Bumiller e45c969fe5 st/mesa: fix bitmap,drawpix,drawtex for PIPE_CAP_TGSI_TEXCOORD
NOTE: Changed the semantic index for the drawtex coordinate to
be the texture unit index instead of always 0.
Not sure if this is correct but since the value seems to depend
on the unit it would make sense to use different varying slots.
2013-04-03 12:54:43 +02:00
Christoph Bumiller 2a8145d36b nouveau: accelerate buffer copies in resource_copy_region 2013-04-03 12:54:43 +02:00
Christoph Bumiller 3ed4bbd769 nvc0: demagic some of the NVE4_COMPUTE_UPLOAD methods
It's actually the same as P2MF.
2013-04-03 12:54:43 +02:00
Christoph Bumiller fb0334adb3 nvc0: read PM counters for each warp scheduler separately 2013-04-03 12:54:43 +02:00
Christoph Bumiller 7bac075f25 nvc0: add some metrics to driver specific queries 2013-04-03 12:54:43 +02:00
Christoph Bumiller 198f514aa6 nvc0: add some driver statistics queries 2013-04-03 12:54:43 +02:00
Christoph Bumiller 7628cc247f nvc0: disable compressed storage type 0xdb for now
Single-sample color compression doesn't seem that useful anyway.
2013-04-03 12:54:43 +02:00
Christoph Bumiller ea12fc3f6c nvc0: use correct hw query for PRIMITIVES_GENERATED
It was the same as SO_STATISTICS[1] before.
2013-04-03 12:54:43 +02:00
Christoph Bumiller 6bca4e7085 nvc0: use fence to check state of queries that don't write sequence
This still isn't optimal, since the fence will signal a bit late,
but better than checking on the bo, which may never be ready if it
is shared (which is likely).
2013-04-03 12:54:43 +02:00
Christoph Bumiller 3d2790cead gallium/hud: add support for PIPE_QUERY_PIPELINE_STATISTICS
Also, renamed "pixels-rendered" to "samples-passed" because the
occlusion counter increments even if colour and depth writes are
disabled, or (on some implementations) for killed fragments that
passed the depth test when PS early_fragment_tests is set.
2013-04-03 12:54:43 +02:00
Christoph Bumiller c620aad71c gallium/docs: fix definition of PIPE_QUERY_SO_STATISTICS
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-03 12:54:43 +02:00
Christoph Bumiller f35e96d973 gallium: add PIPE_CAP_QUERY_PIPELINE_STATISTICS
Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-03 12:54:43 +02:00
Paul Berry 41e4bccc75 i965: Reduce code duplication in handling of depth, stencil, and HiZ.
This patch consolidates duplicate code in the brw_depthbuffer and
gen7_depthbuffer state atoms.  Previously, these state atoms contained
5 chunks of code for emitting the _3DSTATE_DEPTH_BUFFER packet (3 for
Gen4-6 and 2 for Gen7).  Also a lot of logic for determining the
appropriate buffer setup was duplicated between the Gen4-6 and Gen7
functions.

This refactor splits the code into three separate functions:
brw_emit_depthbuffer(), which determines the appropriate buffer setup
in a mostly generation-independent way, brw_emit_depth_stencil_hiz(),
which emits the appropriate state packets for Gen4-6, and
gen7_emit_depth_stencil_hiz(), which emits the appropriate state
packets for Gen7.

Tested using Piglit on Gen5-7 (no regressions).

v2: Re-word some comments.  Fix an assertion that incorrectly
prohibited packed depth/stencil formats on Gen6 (these are allowed
provided that HiZ is disabled).

Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2013-04-02 15:19:13 -07:00
Paul Berry 2ad0ed6349 Revert "glsl: Replace constant-index vector array accesses with swizzles"
This reverts commit dbf94d105a, which
was working around a bug in the handling of array indexing when
constant folding built-in functions.  Now that the constant folding
bug has been fixed, the workaround is no longer needed.
2013-04-02 12:24:16 -07:00
Paul Berry 7d4f1e6467 glsl: Fix array indexing when constant folding built-in functions.
Mesa constant-folds built-in functions by using a miniature GLSL
interpreter (see
ir_function_signature::constant_expression_evaluate_expression_list()).
This interpreter had a bug in its handling of array indexing, which
caused expressions like "m[i][j]" (where m is a matrix) to be handled
incorrectly.  Specifically, it incorrectly treated j as indexing into
the whole matrix (rather than indexing just into the vector m[i]); as
a result the offset computed for m[i] was lost and m[i][j] was treated
as m[j][0].

Fixes piglit tests inverse-mat[234].{vert,frag}.

NOTE: This is a candidate for the 9.1 and 9.0 branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=57436
2013-04-02 12:24:08 -07:00
Roland Scheidegger 450950c57a gallivm: bring back optimized but incorrect float to smallfloat optimizations
Conceptually the same as previously done in float_to_half.
Should cut down number of instructions from 14 to 10 or so, but
will promote some NaNs to Infs, so it's disabled.
It gets a bit tricky though handling all the cases correctly...
Passes basic tests either way (though there are no tests testing special
cases, but some manual tests injecting them seemed promising).

v2: style and comment fixes suggested by Jose

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-02 18:24:31 +02:00
Roland Scheidegger 3febc4a1cd gallivm: consolidate code for float-to-half and float-to-packed conversion.
This replaces the existing float-to-half implementation.
There are definitely a couple of differences - the old implementation
had unspecified(?) rounding behavior, and could at least in theory
construct Inf values out of NaNs. NaNs and Infs should now always be
properly propagated, and rounding behavior is now towards zero
(note this means too large but non-Infinity values get propagated to max
representable value, not Infinity).
The implementation will definitely not match util code, however (which
does nearest rounding, which also means too large values will get
propagated to Infinity).

Also fix a bogus round mask probably leading to rounding bugs...
v2: fix a logic bug in handling infs/nans.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-04-02 18:24:31 +02:00
Vadim Girlin 9be624b3ef r600g: don't reserve more stack space than required v5
Reduced stack size allows to run more threads in some cases,
improving performance for the shaders that use stack (that is, for the
shaders with control flow instructions). E.g. with unigine-based apps.

v4: implement exact computation taking into account wavefront size
v5: add cases for RV620, RS880

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-02 19:34:14 +04:00
Vadim Girlin 7e04227f39 r600g: fix range handling for tgsi input declarations v2
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-04-02 19:34:14 +04:00
Marek Olšák f8502b7e71 gallium/hud: do .xxxx swizzling for the font texture in the fragment shader
This allows using L8 and R8 for the font if I8 isn't supported.

Tested-by: Brian Paul <brianp@vmware.com>
2013-04-02 16:57:57 +02:00
Brian Paul 98b64cc20f hud: flush/unmap the vertex buffer before drawing
The VMware svga driver is picky about making sure the VBO is unmapped
before drawing.

Reviewed-by: Marek Olšák <maraeo@gmail.com>
2013-04-02 08:17:28 -06:00
Brian Paul bdd3770b78 draw: use pipe_transfer_unmap() to match pipe_transfer_map() 2013-04-02 08:17:28 -06:00
Roland Scheidegger 9b329f4c09 gallivm: fix signed small float to float conversion
Introduced by 5f41e08cf3,
just a silly typo.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=62921.
2013-04-02 13:21:07 +02:00
Christian König a0dca4409a radeonsi: add instance divisor support v3
v2: reduce key size, don't copy key around to much.
v3: remove key size reduction

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König cf9b31f78a radeonsi: add start instance support
This works different than on R600, we need to add the start instance manually.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König e4ed58763a radeonsi: add instanceid support
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:43 +02:00
Christian König 83df955ca9 radeon/llvm: move system value fetching to common code
This should be used by both SI and R600.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Tested-by: Michel Dänzer <michel.daenzer@amd.com>
2013-04-02 13:01:42 +02:00
Michel Dänzer c6efb4870b radeonsi: Handle arbitrary 2-byte formats in resource_copy_region
Fixes mplayer -vo vdpau OSD.

NOTE: This is a candidate for the 9.1 branch.

Reported-by: Igor Vagulin <igor.vagulin@gmail.com>

Reviewed-by: Christian König <christian.koenig@amd.com>
Tested-by: Christian König <christian.koenig@amd.com>
2013-04-02 11:42:35 +02:00
Maarten Lankhorst 6d20c646d6 nvc0: Fix fd leak in nvc0_create_decoder
NOTE: This is a candidate for the 9.0 and 9.1 branches.

Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
2013-04-02 10:25:26 +02:00
Aras Pranckevicius b2eee0869f GLSL: fix lower_jumps to report progress properly
A fix for lower_jumps progress reporting, very much like similar in
c1e591eed.

NOTE: This is a candidate for stable branches.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2013-04-01 16:57:17 -07:00