Commit Graph

63391 Commits

Author SHA1 Message Date
Matt Turner 4bb9d16fd3 configure.ac: Simplify DUSE_EXTERNAL_DXTN_LIB logic.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-06-11 20:09:22 -07:00
Matt Turner 026d1fe986 configure.ac: Alphabetize AC_CONFIG_FILES.
This isn't supposed to be difficult.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-06-11 20:09:22 -07:00
Matt Turner 180e60df65 configure.ac: Remove single quotes to fix syntax highlighting.
Please stop adding them.

Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-06-11 20:09:22 -07:00
Robert Bragg c6f118484c meta: save and restore swizzle for _GenerateMipmap
This makes sure to use a no-op swizzle while iteratively rendering each
level of a mipmap otherwise we may loose components and effectively
apply the swizzle twice by the time these levels are sampled.

Signed-off-by: Robert Bragg <robert@sixbynine.org>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-11 21:38:01 +01:00
Ian Romanick 63117ac329 i965/vec4: Emit smarter code for b2f of a comparison
Previously we would emit the comparison, emit an AND to mask off extra
bits from the comparison result, then convert the result to float.  Now,
do the comparison, then use a cleverly constructed SEL to pick either
0.0f or 1.0f.

No piglit regressions on Ivybridge.

total instructions in shared programs: 1642311 -> 1639449 (-0.17%)
instructions in affected programs:     136533 -> 133671 (-2.10%)
GAINED:                                0
LOST:                                  0

Programs that are affected appear to save between 1 and 5 instuctions
(just by skimming the output from shader-db report.py.

v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes).  Remove
extraneous fix_3src_operand (suggested by Matt).  The latter change
required swapping the order of the operands and using predicate_inverse.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-06-11 12:00:24 -07:00
Ian Romanick be0452b049 i965/vec4: Silence a couple unused parameter warnings
brw_vec4_visitor.cpp:2717:1: warning: unused parameter 'ir' [-Wunused-parameter]
brw_vec4_visitor.cpp:2723:1: warning: unused parameter 'ir' [-Wunused-parameter]

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-06-11 12:00:20 -07:00
Ian Romanick 014d45f137 glsl: Store gl_uniform_driver_storage::format as the actual type
And delete the incorrect comment.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2014-06-11 11:26:05 -07:00
Dave Airlie 0d89448662 softpipe: fix pt->resource assert placement
oops meant to move this.

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 14:03:11 +10:00
Dave Airlie 9bc12ef241 softpipe: enable AMD_vertex_shader_layer.
This passes tests now on softpipe.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:21:21 +10:00
Dave Airlie 8dede2fa6c softpipe: enable GLSL 3.30 support.
This enables GL3.3 on softpipe.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:21:17 +10:00
Dave Airlie c82d227edd softpipe: bump the softpipe geometry limits
This just aligns the limits with llvmpipe.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:21:08 +10:00
Dave Airlie 7ea04f089b tgsi_exec: use defines for max inputs/outputs
This fixes the limits for GL 3.2, and subsequently fixes
some segfaults in some varying packing tests and max varying tests
after the limits bumped.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:21:04 +10:00
Dave Airlie 740d5bed77 softpipe: add layered rendering support.
This adds support for GL 3.2 layered rendering to softpipe.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:20:30 +10:00
Dave Airlie dc8fc39ada softpipe: add layering to the surface tile cache.
This adds the layer info to the tile cache.

This changes clear_flags to be dynamically allocated as
MAX_LAYERS seems like a too big step.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:20:30 +10:00
Dave Airlie 5a57248541 softpipe: add depth clamping support. (v2)
This passes the piglit depth clamp tests.

this is required for GL 3.2.

v2: move min/max up one level, could go further, thanks
to Roland for suggestion.

v1: Reviewed-by: Brian Paul <brianp@vmware.com>

Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:20:07 +10:00
Dave Airlie a4670de0a0 tgsi/gs: bound max output vertices in shader
This limits the number of emitted vertices to the shaders max output
vertices, and avoids us writing things into memory that isn't big
enough for it.

Reviewed-by: Zack Rusin <zackr@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-06-11 12:19:37 +10:00
Jon Ashburn 10e8d55799 i965: Add GPU BLIT of texture image to PBO in Intel driver
Add Intel driver hook for glGetTexImage to accelerate the case of reading
texture image into a PBO.  This case gets huge performance gains by using
GPU BLIT directly to PBO rather than GPU BLIT to temporary texture followed
by memcpy.

No regressions on Piglit tests  with Intel driver.
Performance gain (1280 x 800 FBO, Ivybridge):
glGetTexImage + glMapBufferRange  with patch 1.45 msec
glGetTexImage + glMapBufferRange without patch 4.68 msec

v3: (by Kenneth Graunke)
 - Fix compile after Eric's change to drop the tiling argument
   to intel_miptree_create_for_bo.
 - Add GL_TEXTURE_3D to blacklisted texture targets to prevent Piglit
   regressions.
 - Squash in several whitespace and coding style fixes.
2014-06-10 18:36:44 -07:00
Kenneth Graunke 237aac39b1 i965: Invalidate live intervals when inserting Gen4 SEND workarounds.
We need to invalidate the live intervals when inserting new
instructions.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2014-06-10 16:38:27 -07:00
Kenneth Graunke ecc78eab11 i965: Don't use the head sentinel as an fs_inst in Gen4 workaround code.
When walking backwards, we want to stop at the head sentinel, which is
where scan_inst->prev->prev == NULL, not scan_inst->prev == NULL.

Fixes random crashes, as well as valgrind errors.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: mesa-stable@lists.freedesktop.org
2014-06-10 16:38:27 -07:00
Kenneth Graunke fc19c4aaf1 meta: Label the meta GLSL clear program.
Giving the meta clear program a meaningful name makes it easier to find
in output such as INTEL_DEBUG=fs or INTEL_DEBUG=shader_time.  We already
did so for integer programs, but neglected to label the primary program.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-06-10 16:38:27 -07:00
Kenneth Graunke 2bcd24c9f0 i965/fs: Combine generate_math[12]_gen6 methods.
These used to call different math emitters (brw_math vs. brw_math2).
Now that they both call gen6_math, they're virtually identical.

When unrolling SIMD16 to multiple SIMD8 operations, we should take care
not to apply sechalf to brw_null_reg for src1.  Otherwise, we'd end up
with BRW_ARF_NULL + 1 as the register number, and I'm not sure if that's
valid.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:27 -07:00
Kenneth Graunke 35e48bd618 i965/fs: Drop the generate_math[12]_gen7 methods.
These functions are basically identical, so we should combine them.
However, they're so trivial, we may as well just fold them into their
only call sites.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke f3ddd71f28 i965/vec4: Combine generate_math[12]_gen6 methods.
These are trivial to combine: we should just avoid checking the second
operand if it's brw_null_reg.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke 5260a26e92 i965/vec4: Drop the generate_math2_gen7() method.
It's now a single line of code, so we may as well fold it into the
caller.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke b003fc265f i965: Rename brw_math to gen4_math.
Usually, I try to use "brw" for functions that apply to all generations,
and "gen4" for dead end/legacy code that is only used on Gen4-5.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke de65ec2fde i965: Split Gen4-5 and Gen6+ MATH instruction emitters.
Our existing functions, brw_math and brw_math2, had unclear roles:

Gen4-5 used brw_math for both unary and binary math functions; it never
used brw_math2.  Since operands are already in message registers, this
is reasonable.

Gen6+ used brw_math for unary math functions, and brw_math2 for binary
math functions, duplicating a lot of code.  The only real difference was
that brw_math used brw_null_reg() for src1.

This patch improves brw_math2's assertions to allow both unary and
binary operations, renames it to gen6_math(), and drops the Gen6+ code
out of brw_math().

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke 7b9cf79790 i965: Make src_reg::equals() take a constant reference, not a pointer.
This is more typical C++ style.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke 000f4a33c0 i965: Don't set the "switch" flag on control flow instructions on Gen6+.
Thread switching on control flow instructions is a documented workaround
for Gen4-5 errata.  As far as I can tell, it hasn't been needed since
Sandybridge.  Thread switching is not free, so in theory this may help
performance slightly.

Flow control instructions with the "switch" flag cannot be compacted, so
removing it will make these instructions compactable.  (Of course, we
still have to implement compaction for flow control instructions...)

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-06-10 16:38:26 -07:00
Kenneth Graunke 3a439534de i965/fs: Allow CSE on math opcodes on Gen6+.
total instructions in shared programs: 2081469 -> 2081248 (-0.01%)
instructions in affected programs:     22606 -> 22385 (-0.98%)
No programs were hurt by this patch.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-06-10 16:38:25 -07:00
Thomas Helland 2c9a1518a1 glsl: Remove unused include in expr.flatt.
Found with IWYU. Compile-tested on my Ivy-bridge system.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:52 -07:00
Thomas Helland 10e00611c2 glsl: Remove unused include in ir.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland 8e1e68119c glsl: Remove unused include from ir_constant_expression.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland 068d30655c glsl: Remove unused include from ir_basic_block.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland b6e68fc9fb glsl: Remove unused include from hir_field_selection.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland 4f5445a45d glsl: Remove unused include from glsl_symbol_table.h
Only function-defs use glsl_type so forward declare instead.
Compile-tested on my Ivy-bridge system.

IWYU also suggests removing #include <new>, and this compiles fine.
I'm not familiar enough with memory management in C/C++ that I feel
comfortable removing this. Insights would be appreciated.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland 38ffbf459b glsl: Remove unused include from glsl_types.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.
Added comment about core.h being used for MAX2.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland 22f5a0b277 glsl: Remove unused include from builtin_variables.cpp
Found with IWYU. Compile-tested on my Ivy-bridge system.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland 6f385d9371 glsl: Remove unused include in ast_to_hir.cpp
Found with IWYU. Comment says it's for struct gl_extensions.
Grepping for gl_extensions shows no uses.
Tested by compiling on my Ivy-bridge system.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland 5b83d5e2f9 glsl: Remove unused includes in link_uniform_block_active_visitor.h
Found with IWYU, compile-tested on my Ivy-bridge system.
This is not used in the header, and is included in the source.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Thomas Helland eac09a4e1d glsl: Remove unused includes in link_uniform_init.
Found with IWYU, confirmed with grepping for "hash" and "symbol".
No negative effects on compilation.

IWYU also reported core.h and linker.h could be removed,
but I'm unsure if those are false positives.

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Thomas Helland <thomashelland90@gmail.com>
2014-06-10 13:05:51 -07:00
Matt Turner 4787c25a60 i965: Replace open-coded linked list with exec_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-10 13:05:51 -07:00
Matt Turner 1951418038 glsl: Add an exec_node_init() function, usable from C.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-10 13:05:51 -07:00
Matt Turner b123c6e96d glsl: Make foreach macros usable from C by adding struct keyword.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-10 13:05:51 -07:00
Matt Turner d4ce0109de glsl: Make exec_list members just wrap the C API.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-10 13:05:51 -07:00
Matt Turner b10ad648a1 glsl: Make exec_node members just wrap the C API.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-10 13:05:51 -07:00
Matt Turner d691f0de72 glsl: Add C API for exec_list.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-10 13:05:50 -07:00
Matt Turner 47a77ba839 glsl: Add C API for exec_node.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-10 13:05:50 -07:00
Matt Turner 5f90f2ee59 glsl: Move definition of exec_list member functions out of the struct.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-10 13:05:50 -07:00
Matt Turner cb5a0e59cf glsl: Move definition of exec_node member functions out of the struct.
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-06-10 13:05:50 -07:00
Bruno Jiménez 112c1b14ed r600g/compute: Use %u as the unsigned format
This fixes an issue when running cl-program-bitcoin-phatk
piglit test where some of the inputs have negative values

Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-06-10 15:29:57 -04:00