Commit Graph

56516 Commits

Author SHA1 Message Date
Eric Anholt 739b88330c glsl: Flip around "if" statements with empty "then" blocks.
This cleans up some funny-looking code in some unigine shaders I was
looking at.  Also slightly helps on planeshift and a few shaders in an
upcoming Valve release.

total instructions in shared programs: 1653715 -> 1653587 (-0.01%)
instructions in affected programs:     16550 -> 16422 (-0.77%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-05 13:20:42 -07:00
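
A minimal sketch of the transformation that 739b88330c describes, using a toy representation rather than the GLSL compiler's IR. The struct toy_if type and flip_empty_then() are hypothetical names invented for this illustration: an "if" whose "then" block is empty and whose "else" block is not gets its condition negated and its blocks swapped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an "if" statement (hypothetical, not Mesa's ir_if). */
struct toy_if {
   bool negate_cond;   /* true if the condition is wrapped in a logical NOT */
   size_t then_count;  /* number of instructions in the "then" block */
   size_t else_count;  /* number of instructions in the "else" block */
};

/* Rewrite "if (c) {} else { X }" as "if (!c) { X }".
 * Returns true if the statement was changed. */
bool
flip_empty_then(struct toy_if *s)
{
   assert(s != NULL);

   /* Only flip when "then" is empty and "else" has something in it. */
   if (s->then_count != 0 || s->else_count == 0)
      return false;

   s->negate_cond = !s->negate_cond;
   s->then_count = s->else_count;
   s->else_count = 0;
   return true;
}
```

A statement with both blocks empty is left alone here; removing it entirely would be a separate pass.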
Chia-I Wu 008346273c ilo: correctly set return types of sampler messages
Correctly set the types of the temporaries.  We do not want type conversions
when moving the results to the final destinations.
2013-05-05 14:36:39 +08:00
Vincent Lejeune b42fe195a2 r600g/llvm: Undefines unrequired texture coord values
This is a port of "r600g:mask unused source components for SAMPLE"
patch from Vadim Girlin.
2013-05-04 23:38:50 +02:00
Maarten Lankhorst c4150123aa nvc0: fixup video decoding with 2D_ARRAY
Signed-off-by: Maarten Lankhorst <m.b.lankhorst@gmail.com>
2013-05-04 20:56:23 +02:00
Chia-I Wu 8c347d4e57 gallium: fix type of flags in pipe_context::flush()
It should be unsigned, not enum pipe_flush_flags.

Fixed a build error:

  src/gallium/state_trackers/egl/android/native_android.cpp:426:29: error:
  invalid conversion from 'int' to 'pipe_flush_flags' [-fpermissive]

v2: replace all occurrences of enum pipe_flush_flags by unsigned

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>

[olv: document the parameter now that the type is unsigned]
2013-05-04 17:32:10 +08:00
Eric Anholt cbf3462c35 i965: Enable fast clears on non-8x4-aligned sizes.
Improves glb2.7 performance at a misaligned size by 2.3% +/- 0.7% (n=11).
The workaround was to avoid bad primitive/surface sizes, but that's worked
around as of a14dc4f92c.  (One might note
that pre-gen7 we don't know that the right half of an 8x4 at the right
edge is actually our pixels, but we're already clobbering those pixels for
depth resolves anyway and more work would be required to avoid that).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Chad Versace <chad.versace@linux.intel.com>
2013-05-03 20:59:51 -07:00
Brian Paul 76084907fb vbo: add comments, const qualifiers
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul 0baf32508a mesa: whitespace, formatting fixes, etc in api_arrayelt.c
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul 7c9e5afe81 vbo: use new no-op ArrayElement in _mesa_noop_vtxfmt_init()
As we do for the other commands which can appear between glBegin/End.

Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul 7b762305d5 mesa: change ctx->Driver.NeedFlush to GLbitfield and update comment
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul 36c83ccca0 mesa: change ctx->Driver.SaveNeedFlush to boolean, and document it.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul af30987a69 vbo: update comments for vbo_save_NotifyBegin()
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Brian Paul 4ea05bcba6 vbo: implement primitive merging for glBegin/End sequences
A surprising number of apps and benchmarks have poor code like this:

glBegin(GL_LINE_STRIP);
glVertex(v1);
glVertex(v2);
glEnd();
// Possibly some no-op state changes here
glBegin(GL_LINE_STRIP);
glVertex(v3);
glVertex(v4);
glEnd();
// repeat many, many times.

The above sequence can be converted into:

glBegin(GL_LINES);
glVertex(v1);
glVertex(v2);
glVertex(v3);
glVertex(v4);
glEnd();

Similarly for GL_POINTS, GL_TRIANGLES, etc.

Merging was already implemented for GL_QUADS in the display list code.
Now other prim types are handled and it's also done for immediate mode.

In one case:
                                 before   after
-----------------------------------------------
number of st_draw_vbo() calls:     141      45
number of _mesa_prims issued:     7520     632

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
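
The merging idea in 4ea05bcba6 can be sketched outside of Mesa as follows. This is an illustration under assumptions, not the vbo module's code: struct strip and merge_line_strips() are hypothetical names, and the sketch generalizes the example from the commit message by expanding line strips of any length into one GL_LINES-style index list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record of one glBegin(GL_LINE_STRIP)/glEnd() pair: a run
 * of vertices in a shared vertex array.  Not Mesa's struct _mesa_prim. */
struct strip {
   size_t start;  /* index of the first vertex */
   size_t count;  /* number of vertices in the strip */
};

/* Expand several line strips into a single independent-lines index list.
 * A strip of n vertices yields n-1 segments, i.e. 2*(n-1) indices.
 * Returns the number of indices written, or 0 if 'out' is too small. */
size_t
merge_line_strips(const struct strip *strips, size_t num_strips,
                  unsigned *out, size_t out_capacity)
{
   size_t n = 0;

   assert(strips != NULL || num_strips == 0);

   for (size_t s = 0; s < num_strips; s++) {
      for (size_t i = 0; i + 1 < strips[s].count; i++) {
         if (n + 2 > out_capacity)
            return 0;
         out[n++] = (unsigned)(strips[s].start + i);
         out[n++] = (unsigned)(strips[s].start + i + 1);
      }
   }
   return n;
}
```

For the two strips in the commit message (v1,v2 and v3,v4) this produces the index list 0,1,2,3, which is the single GL_LINES draw shown there.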
Brian Paul 3702d25082 vbo: create a few utility functions for merging primitives
To be used by following commit.

Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 19:00:07 -06:00
Zack Rusin a232afdbfb draw/pt: adjust overflow calculations
gallium lies. buffer_size is not actually buffer_size but available
size, which is 'buffer_size - buffer_offset' so by adding buffer
offset we'd incorrectly compute overflow.

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 07:07:33 -04:00
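
A small sketch of the point made in a232afdbfb, assuming a helper named fetch_overflows() that does not exist in the draw module. The 'avail' argument plays the role of the misleadingly named buffer_size: it already has the buffer offset subtracted, so the bounds check must not add the offset again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: would reading 'elem_size' bytes of vertex 'index'
 * run past the available bytes?
 *
 * 'avail' is the size that remains after the buffer offset, so only the
 * per-element offset ('src_offset') and the stride enter the sum.
 * The arithmetic is done in 64 bits so that the 32-bit inputs cannot
 * wrap around. */
bool
fetch_overflows(uint64_t avail, uint32_t stride, uint32_t src_offset,
                uint32_t elem_size, uint32_t index)
{
   assert(elem_size > 0);

   uint64_t first = (uint64_t)index * stride + src_offset;
   return first + elem_size > avail;
}
```

With 64 available bytes, a stride of 16 and 16-byte elements, indices 0 through 3 fit and index 4 is reported as an overflow.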
Zack Rusin 8490d21cbe tgsi/ureg: make the dst register match the src indirection
In ureg src registers could have an indirect register that was
either a temp or an addr register, while dst registers allowed
only addr. That made moving between them a little difficult so
make them behave the same way and allow temp's and addr registers
as indirect files for both (tgsi supports it, just ureg didn't).

Signed-off-by: Zack Rusin <zackr@vmware.com>
Reviewed-by: José Fonseca <jfonseca@vmware.com>
2013-05-03 07:07:33 -04:00
Roland Scheidegger 23025ed15d gallium: tgsi documentation updates and clarification for integer opcodes.
A lot of them were missing. Others were moved from the Compute ISA
to a new Integer ISA section as that seemed more appropriate.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 21:36:28 +02:00
Roland Scheidegger ae507b6260 llvmpipe: get rid of depth swizzling.
Eliminating this we no longer need to copy between linear and swizzled layout.
This is probably not quite ideal since it's a bit more work for now, could do
some optimizations by moving depth testing outside the fragment shader loop
(but tricky for early depth test as we have neither the mask nor the
interpolated z in the right order handy).
The large amount of tile/untile code is no longer needed and will be deleted
in the next commit.
No piglit regressions.
v2: change a forgotten LAYOUT_NONE to LAYOUT_LINEAR.
v3: fix (bogus) uninitialized variable warnings, add comments, fix a bad type

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 21:36:20 +02:00
Lauri Kasanen e495d88453 r600g: Correctly initialize the shader key, v2
Assigning a struct only copies the members - any padding is left as is.

Thus this code:

struct foo_t foo;
foo = bar;

leaves the padding of foo intact, ie uninitialized random garbage.

This patch fixes constant shader recompiles by initializing the struct
to zero. For completeness, memcpy is used to copy the key to the shader
struct.

NOTE: This is a candidate for the stable branches.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-05-03 19:28:57 +02:00
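
The fix described in e495d88453 follows a common C pattern, sketched here with a hypothetical struct shader_key rather than the real r600g key. The key is zeroed with memset before its members are set and copied with memcpy, so that a bytewise comparison sees the same padding bytes every time. The sketch assumes, as holds for GCC and Clang in practice, that storing to a member leaves the padding bytes alone.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical key.  On typical ABIs there are 3 padding bytes after 'a'. */
struct shader_key {
   char a;
   int b;
};

/* Build a key whose every byte, padding included, starts out as zero. */
void
key_init(struct shader_key *dst, char a, int b)
{
   assert(dst != NULL);
   memset(dst, 0, sizeof(*dst));
   dst->a = a;
   dst->b = b;
}

/* Copy all bytes of the key, padding included.  Plain struct assignment
 * is only required to copy the members. */
void
key_copy(struct shader_key *dst, const struct shader_key *src)
{
   memcpy(dst, src, sizeof(*dst));
}

/* Bytewise comparison: the kind of lookup that garbage in the padding
 * turns into a spurious mismatch, and hence a recompile. */
int
key_equal(const struct shader_key *x, const struct shader_key *y)
{
   return memcmp(x, y, sizeof(*x)) == 0;
}
```

Without the memset in key_init(), two keys with identical members could still compare as different, which is the constant-recompile symptom the commit message reports.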
Lauri Kasanen 5ff81cfd86 st/xvmc/tests: Fix build failure, v2
v2: Removed extra libs as requested by Matt Turner.

Signed-off-by: Lauri Kasanen <cand@gmx.com>
Reviewed-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Andreas Boll <andreas.boll.dev@gmail.com>
2013-05-03 19:14:54 +02:00
Andreas Boll e62be5de53 scons: remove nouveau build
One build system for linux/unix only drivers should be enough.
Additionally the nouveau target was disabled anyway.

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 18:44:57 +02:00
Andreas Boll 4ca44f2c5e scons: remove radeon build
One build system for linux/unix only drivers should be enough.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=48694

Acked-by: Jose Fonseca <jfonseca@vmware.com>
2013-05-03 18:44:43 +02:00
Alex Deucher 4539f8e20a r600g: don't emit surface_sync after FLUSH_AND_INV_EVENT
It shouldn't be needed since the FLUSH_AND_INV_EVENT has already
made sure the destination caches are flushed.  Additionally,
we didn't previously emit the surface_sync until this commit:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=e5e4c07e7964a3258ed02b530bcdc24c0650204b
Emitting them together causes hangs in compute on cayman/TN
and hangs in Heaven on evergreen.

Note: this patch is a candidate for the 9.1 branch, but requires:
http://cgit.freedesktop.org/mesa/mesa/commit/?id=156bcca62c9f4e79e78929f72bc085757f36a65a
as well.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Marek Olšák <maraeo@gmail.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2013-05-03 10:55:05 -04:00
Vadim Girlin 41005d7bd2 r600g/sb: zero-initialize bytecode structs
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin f92bd0958e r600g/sb: fix constant propagation in gvn pass
Fixes the bug that prevented propagation of literals in some cases.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin 3c201a22ca r600g/sb: don't run unnecessary passes
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin 48ba5712f5 r600g/sb: silence warnings with gcc 4.8
Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin c49b6d7f27 r600g/sb: fix handling of interference sets in post_scheduler
post_scheduler clears interference set for reallocatable values when
the value becomes live first time, and then updates it to take into
account modified order of operations, but this was not handled properly
if the value appears first time as a source in copy operation.

Fixes issues with webgl demo: http://madebyevan.com/webgl-water/

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:42 +04:00
Vadim Girlin e16ef1f454 r600g/sb: fix allocation of indirectly addressed input arrays
Some inputs may be preloaded into predefined GPRs,
so we can't reallocate arrays with such inputs.

Fixes issues with webgl demo: http://oos.moxiecode.com/js_webgl/snake/

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:41 +04:00
Vadim Girlin a6fe055fa7 r600g/sb: use hex instead of binary constants
This should fix build issues with GCC < 4.3

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:41 +04:00
Vadim Girlin 4ca67dbf0c r600g: use old shader disassembler by default
New disassembler is not completely isolated yet from further processing
in r600g/sb that is not required for printing the dump, so it has higher
probability to fail in case of any unexpected features in the bytecode.

This patch adds an "sbdisasm" flag for R600_DEBUG that selects the new
disassembler in r600g/sb for shader dumps when shader optimization
is not enabled.

If shader optimization is enabled, new disassembler is used by default.

Signed-off-by: Vadim Girlin <vadimgirlin@gmail.com>
2013-05-03 16:53:41 +04:00
Christian König b4b3041132 radeon/uvd: enable interlaced buffers by default
Kills tiling on UVD buffers, but we currently don't really need that.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:21 +02:00
Christian König 85b0880a17 vl/idct: fix for commit 7d2f2a0c89
We still need the option for handling 3D textures as well.

Should fix: https://bugs.freedesktop.org/show_bug.cgi?id=64143

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:21 +02:00
Christian König 379753869d vl/buffers: fix typo in function name
Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:20 +02:00
Christian König 9c353ea293 radeon/uvd: fix some MPEG4 artifacts
Still not perfect, but a step in the right direction.

Signed-off-by: Christian König <christian.koenig@amd.com>
2013-05-03 11:00:20 +02:00
José Fonseca abbbc9b667 draw: Update for u_assembled_primitive -> u_assembled_prim rename.
Mesa build is too complex to rely on successful builds. On refactorings
it is always a good idea to use git grep to prevent missing cases:

  $ git grep u_assembled_primitive
  src/gallium/auxiliary/draw/draw_pt_fetch_shade_pipeline_llvm.c:      u_assembled_primitive(in_prim);
2013-05-03 08:35:17 +01:00
Chia-I Wu 8b2a967e32 st/egl: fix build errors on Android 4.2
The differences from the previous releases that affect st/egl are

 - logging macros are prefixed with an 'A'
 - dequeueBuffer() and enqueueBuffer() require an additional argument for
   fence fd, acquired from libsync

Additionally, include gralloc_drm.h with extern "C".
2013-05-03 13:04:00 +08:00
Chia-I Wu 7346ab3b43 ilo: use u_reduced_prims_for_vertices()
We do not need our own prim_count() anymore.
2013-05-03 11:59:10 +08:00
Chia-I Wu f87dccdc19 util/prim: add u_reduced_prims_for_vertices()
The function returns the number of reduced/tessellated primitives for the
given vertex count.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
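
To illustrate what "number of reduced primitives for a vertex count" means in f87dccdc19, here is a sketch with a hypothetical enum and function name. It is not the u_prim.h implementation and covers only a few primitive types. Each primitive is reduced to points, lines, or triangles, and the function counts how many of those the given vertices produce.

```c
#include <assert.h>

/* Hypothetical subset of primitive types (not the PIPE_PRIM_* enum). */
enum prim {
   PRIM_POINTS,
   PRIM_LINES,
   PRIM_LINE_STRIP,
   PRIM_LINE_LOOP,
   PRIM_TRIANGLES,
   PRIM_TRIANGLE_STRIP,
   PRIM_TRIANGLE_FAN,
   PRIM_QUADS
};

/* Count the points, lines, or triangles that 'n' vertices produce.
 * Incomplete trailing primitives are dropped. */
unsigned
reduced_prims_for_vertices(enum prim p, unsigned n)
{
   switch (p) {
   case PRIM_POINTS:
      return n;
   case PRIM_LINES:
      return n / 2;
   case PRIM_LINE_STRIP:
      return n >= 2 ? n - 1 : 0;
   case PRIM_LINE_LOOP:
      return n >= 2 ? n : 0;      /* the closing segment adds one line */
   case PRIM_TRIANGLES:
      return n / 3;
   case PRIM_TRIANGLE_STRIP:
   case PRIM_TRIANGLE_FAN:
      return n >= 3 ? n - 2 : 0;
   case PRIM_QUADS:
      return (n / 4) * 2;         /* each quad becomes two triangles */
   }
   assert(!"unknown primitive type");
   return 0;
}
```

A driver can use a count like this to size buffers or statistics without walking the vertices, which matches the commit that follows in the log, where ilo drops its own prim_count().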
Chia-I Wu 90d5190594 util/prim: assorted fixes for u_decomposed_prims_for_vertices()
Switch to '>=' for comparisons, and it becomes obvious that the comparison for
PIPE_PRIM_QUAD_STRIP was wrong.

Add minimum vertex count check for PIPE_PRIM_LINE_LOOP.  Return 1 for
PIPE_PRIM_POLYGON with 3 vertices.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu 30671cecc0 util/prim: use vertex count info in u_validate_pipe_prim()
As a side effect, primitives with adjacency are now correctly validated.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu ddf0e3930f util/prim: fix the name of the include guard
It should be U_PRIM_H, not U_BLIT_H.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu 5dd3bd70a1 draw: use u_assembled_prim() instead of u_assembled_primitive()
The latter function is also removed as a result of the change.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:59:10 +08:00
Chia-I Wu 185692e72c util/prim: clean up and add comments
Move together (or add) functions to decompose/reduce/assemble a primitive,
give them consistent names, and document them.  Add u_prim_vertex_count() so
that the vertex count information can be used elsewhere.

u_assembled_primitive() will be removed in a follow-on commit.

[olv: fix a warning when -Wold-style-declaration is enabled]

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:58:57 +08:00
Chia-I Wu 64913002e4 util/prim: fix primitive trimming for triangles with adjacency
Fix for PIPE_PRIM_TRIANGLES_ADJACENCY and PIPE_PRIM_TRIANGLE_STRIP_ADJACENCY.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Acked-by: Zack Rusin <zackr@vmware.com>
2013-05-03 11:39:12 +08:00
Eric Anholt 573d8813fd i965/vs: Add instruction scheduling.
While this is ignorant of dependency control, it's still good for a 0.39%
+/- 0.08% performance improvement on GLBenchmark 2.7 (n=548)

v2: Rewrite as a subclass of the base class for the FS instruction
    scheduler, inheriting the same latency information.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:47 -07:00
Eric Anholt 3b00a6acac i965: Move most of the FS instruction scheduler code to a general class.
About half of this is shareable with the VS code.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:43 -07:00
Eric Anholt ce22dd75b7 i965: Pull a couple of FS scheduling functions out to methods.
These will get virtualized as we add VS scheduling support.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:39 -07:00
Eric Anholt ee0223ba2a i965: Move FS instruction scheduling to a non-FS-specific file.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:35 -07:00
Eric Anholt ab04f3b2d7 i965: Share the register file enum between the two backends.
I need this so I can look at vec4 and fs registers' files from the same
.cpp file without namespaces.  As far as I can tell we never rely on the
particular numerical values of the files, though I thought it sounded like
a good idea when doing the VS (it turns out having 0 be BAD_FILE is nicer).

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2013-05-02 15:54:31 -07:00