Commit Graph

66558 Commits

Author SHA1 Message Date
Matt Turner 70fcd56538 i965/vec4: Optimize packSnorm4x8().
Reduces the number of instructions needed to implement packSnorm4x8()
from 13 -> 7.
2014-11-25 17:29:02 -08:00
Matt Turner 3532be7680 i965/vec4: Optimize packUnorm4x8().
Reduces the number of instructions needed to implement packUnorm4x8()
from 11 -> 6.
2014-11-25 17:29:02 -08:00
Matt Turner e14c7c7faf i965/vec4: Add VEC4_OPCODE_PACK_4_BYTES.
Will be used by emit_pack_{s,u}norm_4x8().
2014-11-25 17:29:02 -08:00
Matt Turner 94a30bbd4f i965/vec4: Optimize unpackSnorm4x8().
Reduces the number of instructions needed to implement unpackSnorm4x8()
from 16 -> 6.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 17:29:02 -08:00
Matt Turner bf686b2785 i965/vec4: Optimize unpackUnorm4x8().
Reduces the number of instructions needed to implement unpackUnorm4x8()
from 11 -> 4.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 17:29:02 -08:00
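
For reference, the four built-ins touched by the commits above have simple scalar semantics in the GLSL specification: pack scales each clamped component by 255 (unorm) or 127 (snorm) and rounds, with component 0 in the low byte; unpack divides back. The C sketch below only illustrates those semantics. It is not the i965 vec4 code, the function names and the clampf helper are hypothetical, and it rounds ties away from zero, which is an assumption (GLSL leaves the rounding of ties to the implementation).

```c
#include <stdint.h>

/* Hypothetical helper: clamp x to [lo, hi]; NaN maps to lo. */
static float clampf(float x, float lo, float hi)
{
    if (!(x >= lo)) return lo;
    if (x > hi) return hi;
    return x;
}

/* packUnorm4x8: round(clamp(c, 0, 1) * 255), component 0 in the low byte. */
uint32_t pack_unorm_4x8(const float v[4])
{
    uint32_t out = 0;
    for (int i = 0; i < 4; i++) {
        uint32_t b = (uint32_t)(clampf(v[i], 0.0f, 1.0f) * 255.0f + 0.5f);
        out |= b << (8 * i);
    }
    return out;
}

/* packSnorm4x8: round(clamp(c, -1, 1) * 127), stored as two's complement. */
uint32_t pack_snorm_4x8(const float v[4])
{
    uint32_t out = 0;
    for (int i = 0; i < 4; i++) {
        float s = clampf(v[i], -1.0f, 1.0f) * 127.0f;
        int32_t r = (int32_t)(s < 0.0f ? s - 0.5f : s + 0.5f);
        out |= ((uint32_t)r & 0xffu) << (8 * i);
    }
    return out;
}

/* unpackUnorm4x8: byte / 255. */
void unpack_unorm_4x8(uint32_t p, float out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (float)((p >> (8 * i)) & 0xffu) / 255.0f;
}

/* unpackSnorm4x8: clamp(signed byte / 127, -1, 1); -128 clamps to -1. */
void unpack_snorm_4x8(uint32_t p, float out[4])
{
    for (int i = 0; i < 4; i++) {
        int8_t s = (int8_t)(uint8_t)((p >> (8 * i)) & 0xffu);
        out[i] = clampf((float)s / 127.0f, -1.0f, 1.0f);
    }
}
```
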
Matt Turner cb0ba848d4 i965/vec4: Add vector float immediate infrastructure.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 17:29:02 -08:00
Matt Turner 5d23721c1d i965/fs: Add vector float immediate infrastructure.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 17:29:02 -08:00
Matt Turner 276075f864 i965: Disassemble vector float immediates properly.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
2014-11-25 17:29:02 -08:00
Matt Turner b2abf033e0 i965: Add unit test for float <-> VF conversions.
Using Eric's original VF -> float conversion code to initialize the
table.
2014-11-25 17:29:02 -08:00
Matt Turner c37d798e78 i965: Add functions to convert float <-> VF.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 17:29:01 -08:00
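
To make the float <-> VF conversion concrete, here is a minimal C sketch. It assumes the restricted 8-bit "vector float" layout of 1 sign bit, 3 exponent bits with a bias of 3, and 4 mantissa bits, with 0x00 and 0x80 reserved for +0.0 and -0.0. That layout and the function names are assumptions made for illustration; this is not the code added by the commit.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical: decode an 8-bit VF value into an IEEE-754 single. */
float vf_to_float(uint8_t vf)
{
    uint32_t bits;
    float f;

    if ((vf & 0x7f) == 0) {
        /* 0x00 and 0x80 encode +0.0 and -0.0. */
        bits = (uint32_t)vf << 24;
    } else {
        uint32_t sign = (vf >> 7) & 0x1;
        uint32_t exp  = (vf >> 4) & 0x7;   /* assumed bias: 3 */
        uint32_t man  = vf & 0xf;
        bits = (sign << 31) | ((exp + 127 - 3) << 23) | (man << 19);
    }
    memcpy(&f, &bits, sizeof f);
    return f;
}

/* Hypothetical: encode f as VF, or return -1 if it is not representable. */
int float_to_vf(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);

    if (f == 0.0f)
        return (int)(bits >> 24);          /* 0x00 or 0x80 */

    uint32_t sign = bits >> 31;
    uint32_t exp  = (bits >> 23) & 0xff;
    uint32_t man  = bits & 0x7fffff;

    if (man & 0x7ffff)                     /* needs more than 4 mantissa bits */
        return -1;
    if (exp < 124 || exp > 131)            /* outside the 3-bit exponent range */
        return -1;

    int vf = (int)((sign << 7) | ((exp - 124) << 4) | (man >> 19));
    if ((vf & 0x7f) == 0)                  /* 0.125 would collide with zero */
        return -1;
    return vf;
}
```

Under these assumptions every one of the 256 encodings round-trips, which is the kind of property a table-driven unit test can check.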
Chris Forbes 0008d0e59e i965/Gen6-7: Do not replace texcoords with point coord if not drawing points
Fixes broken rendering in Windows-based QtQuick2 apps run through Wine.
This library sets all texture units' GL_COORD_REPLACE, leaves point
sprite mode enabled, and then draws a triangle fan.

Will need a slightly different fix for Gen4-5, but I don't have my old
machines in a usable state currently.

V2: - Simplify patch -- the real changes are no longer duplicated across
      the Gen6 and Gen7 atoms.
    - Also don't clobber attr overrides -- which matters on Haswell too,
      and fixes the other half of the problem
    - Fix newly-introduced warnings
V3: - Use BRW_NEW_GEOMETRY_PROGRAM and brw->geometry_program rather than
      core flag and state; keep the state flags in order.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=84651
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-25 22:38:32 +13:00
Kenneth Graunke 60f011af1a glsl: Make lower_constant_arrays_to_uniforms require dereferences.
Ilia noticed that my lowering pass was converting the constant array
used by textureGatherOffsets' offsets parameter to a uniform.  This
broke textureGather for Nouveau, and is generally a horrible plan,
since it violates the GLSL constraint that offsets must be an
immediate constant.

When I wrote this pass, I neglected to consider whole array assignment.
I figured opt_array_splitting would handle constant indexing, so this
pass was really about fixing variable indexing.

textureGatherOffsets is an example of whole array access that we really
don't want to touch.  Whole array copies don't appear to benefit from
this either - they're most likely initializers for temporary arrays
which are going to be mutated anyway.  Since you're copying, you may
as well copy from immediates, not uniforms.

This patch makes the pass look for ir_dereference_arrays of
ir_constants, rather than looking for any ir_constant directly.
This way, it ignores whole array assignment.

No shader-db changes or Piglit regressions on Haswell.  Some Piglit
tests generate different code (fixing textureGatherOffsets on Nouveau).

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
2014-11-24 15:30:09 -08:00
Kenneth Graunke f0c91f32c0 i965: Precompile ARB programs.
We already precompile GLSL programs; it seems logical to precompile ARB
programs as well.  We just never hooked it up.

This also makes the programs compile even if no drawing occurs, which is
useful for shader-db.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-24 15:30:09 -08:00
Kenneth Graunke b55777f39d i965: Make precompile functions accessible from C.
Previously, the prototypes for brw_vs/gs/fs_precompile were scattered
between brw_vs.h (C), brw_gs.h (C), and brw_fs.h (C++ only).  Also,
brw_fs_precompile had C++ linkage, while the others were C.

This patch moves all the prototypes to a central location (brw_shader.h)
and makes brw_fs_precompile have C linkage.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-24 15:30:09 -08:00
Kenneth Graunke 62b425448c i965: Pass gl_program pointers into precompile functions.
We'd like to do precompiling for ARB vertex and fragment programs,
which only have gl_program structures - gl_shader_program is NULL.

This patch makes the various precompile functions take a gl_program
parameter directly, rather than accessing it via gl_shader_program.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-24 15:30:09 -08:00
Kenneth Graunke d54925df9c i965: Move brw->precompile checks out a level.
brw_shader_precompile should just do a precompile; it makes more sense
for the caller to decide whether we should do one.  Simpler.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-11-24 15:30:09 -08:00
Roland Scheidegger 880424b8ad llvmpipe: (trivial) remove redundant util_cpu_detect() call in lp_test_main
Already called earlier.
2014-11-25 00:29:29 +01:00
Roland Scheidegger 8148a06b8f llvmpipe: fix lp_test_arit denorm handling
llvmpipe disables denorms on purpose (on x86/sse only), because denorms are
generally neither required nor desired for graphic apis (and in case of d3d10,
they are forbidden).
However, this caused some arithmetic tests using denorms to fail on some
systems, because the reference did not generate the same results anymore.
(It did not fail on all systems - behavior of these math functions is sort
of undefined when called with non-standard floating point mode, hence the
result differing depending on implementation and in particular the sse
capabilities.)
So, for the reference, simply flush all (input/output) denorms manually
to zero in this case.

This fixes https://bugs.freedesktop.org/show_bug.cgi?id=67672.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-25 00:29:29 +01:00
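
Flushing denorms to zero by hand, as the commit above describes for the reference results, can be sketched as follows. This is an illustration, not the lp_test_arit code: the function name is hypothetical, and keeping the sign of the flushed zero is an assumption.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return x with denormal values replaced by a
 * same-signed zero. Normal numbers, infinities and NaNs pass through. */
float flush_denorm_to_zero(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);

    /* A zero exponent field means the value is zero or a denormal. */
    if ((bits & 0x7f800000u) == 0)
        bits &= 0x80000000u;   /* keep only the sign bit */

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

A reference would apply this to both the inputs and the outputs of the math function under test, so that it agrees with a pipeline that runs with denorms disabled.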
Eric Anholt 93d30ff5d6 nouveau: Fix build after STR/BRA opcode dropping.
I missed these while git grepping for users of the dead opcodes.  Sigh,
macros.
2014-11-24 15:22:25 -08:00
Eric Anholt a3688d686f mesa: Drop unused NV_fragment_program opcodes.
The extension itself was deleted 2 years ago.  There are still some
prog_instruction opcodes from NV_fp that exist because they're used by
ir_to_mesa.cpp, though.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-11-24 14:56:22 -08:00
Eric Anholt 868f95f1da mesa: Drop unused SFL/STR opcodes.
They're part of NV_vertex_program2, which I'm pretty sure we're never
going to support.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2014-11-24 14:56:22 -08:00
Eric Anholt 365a4a3f9a gallium: Drop the unused CND opcode.
Nothing in the tree generates it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt 00f7002c5c gallium: Drop unused BRA opcode.
Never generated, and implemented in only nvfx vertprog.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt ecfe9e2ad2 gallium: Drop the unused SFL/STR opcodes.
Nothing generated them.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt dc00b382b5 gallium: Drop the unused RFL opcode.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt 8c822b1e91 gallium: Drop unused X2D opcode.
Nothing in the tree generates it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt ff886c4955 gallium: Drop the unused ARA opcode.
Nothing in the tree generated it.

v2: Only drop ARA, not ARR as well.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com> (v2)
2014-11-24 14:56:22 -08:00
Eric Anholt de2f8d75db gallium: Drop the unused RCC opcode.
Nothing in the tree generated it.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt d4864cdf15 gallium: Drop the NRM and NRM4 opcodes.
They weren't generated in tree, and as far as I know all hardware had to
lower it to a DP, RSQ, MUL.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
2014-11-24 14:56:22 -08:00
Eric Anholt 7361d5ba63 ilo: Drop the explicit initialization of gaps in TGSI opcodes.
The nice thing about the good way of initializing arrays like this is that
you don't need to initialize everything in order, or even everything at
all.  Taking advantage of that only needs a tiny fixup to deal with the
default NULL value of the pointers.

I haven't dropped the initialization of opcodes that exist and are unsupported.
2014-11-24 14:56:22 -08:00
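
The point about array initialization in the commit above refers to C99 designated initializers: entries can be listed in any order, and any index left out is zero-initialized, which for function pointers means NULL. A minimal sketch follows; the opcode names and handler table are invented for illustration and are not ilo's.

```c
#include <stddef.h>

/* Hypothetical opcode numbering with gaps at 1, 3, 4, 6 and 7. */
enum { OP_ADD = 0, OP_MUL = 2, OP_NEG = 5, OP_COUNT = 8 };

typedef int (*opcode_fn)(int);

static int do_add(int x) { return x + x; }
static int do_mul(int x) { return x * x; }
static int do_neg(int x) { return -x; }

/* Out-of-order, sparse initialization; unlisted slots default to NULL. */
static const opcode_fn handlers[OP_COUNT] = {
    [OP_NEG] = do_neg,
    [OP_ADD] = do_add,
    [OP_MUL] = do_mul,
};

/* The "tiny fixup": treat a NULL slot as an unsupported opcode.
 * Returns 1 and stores the result on success, 0 otherwise. */
int dispatch(unsigned op, int x, int *result)
{
    if (op >= OP_COUNT || handlers[op] == NULL)
        return 0;
    *result = handlers[op](x);
    return 1;
}
```
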
Eric Anholt 386c3fcb14 r300: Drop the "/* gap */" notes.
This switch statement's code structure isn't dependent on the numbers of
the opcodes at all.
2014-11-24 14:56:22 -08:00
Eric Anholt 2f01cc8417 r600: Drop the "/* gap */" notes.
These are obviously the gaps already, due to the bare numbers with
unsupported implementations.

This makes inserting new gaps less irritating.
2014-11-24 14:56:22 -08:00
Jose Fonseca 925cb75f89 nine: Drop use of TGSI_OPCODE_CND.
This was the only state tracker emitting it, and hardware was just having
to lower it anyway (or failing to lower it at all).

v2: Extracted from a larger patch by Jose (which also dropped DP2A), fixed
    to actually not reference TGSI_OPCODE_CND.  Change by anholt.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2014-11-24 14:56:22 -08:00
Jose Fonseca 56fd7c6361 nine: Don't reference the dead TGSI_OPCODE_NRM.
The translation already lowers it without using TGSI_OPCODE_NRM, anyway.

v2: Extracted from a larger patch by Jose that also dropped DP2A usage.

Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2014-11-24 14:56:22 -08:00
Eric Anholt 7c0acd8535 nine: Don't use the otherwise-dead SFL opcode in an unreachable path.
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: David Heidelberg <david@ixit.cz>
2014-11-24 14:56:21 -08:00
Matt Turner 057e6e5251 i965/gen6/gs: Don't declare a src_reg with struct.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 14:09:23 -08:00
Matt Turner ff966aff99 i965/disasm: Fix all32h/any32h predicate disassembly.
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
2014-11-24 14:09:23 -08:00
Matt Turner b754e52532 glsl: Fix tautological comparison.
Caught by clang.

warning: comparison of constant -1 with expression of type
         'ir_texture_opcode' is always false
      [-Wtautological-constant-out-of-range-compare]
      if (op == -1)
          ~~ ^  ~~

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 14:09:23 -08:00
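
The warning quoted above fires because -1 is not a value the enum type can hold, so the comparison can never be true. One common way to avoid this class of bug is to report "not found" separately from the enum value. The sketch below shows that pattern with invented names; it is not necessarily the fix this commit applied.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical enum with no negative enumerators. */
enum tex_opcode { TEX_SAMPLE, TEX_FETCH, TEX_GATHER };

static const struct {
    const char *name;
    enum tex_opcode op;
} tex_opcode_table[] = {
    { "sample", TEX_SAMPLE },
    { "fetch",  TEX_FETCH  },
    { "gather", TEX_GATHER },
};

/* Returns true and writes *out on success; returns false for unknown
 * names, so no out-of-range sentinel such as -1 is ever compared
 * against the enum. */
bool lookup_tex_opcode(const char *name, enum tex_opcode *out)
{
    size_t n = sizeof(tex_opcode_table) / sizeof(tex_opcode_table[0]);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(name, tex_opcode_table[i].name) == 0) {
            *out = tex_opcode_table[i].op;
            return true;
        }
    }
    return false;
}
```
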
Matt Turner 024db256d4 util: Prefer atomic intrinsics to inline assembly.
Cuts a little more than 1k of .text size from i915g.

This was previously done in commit 5f66b340 and subsequently reverted in
commit 3661f757 after bug 30514 was filed. I believe the cause of bug
30514 wasn't anything related to cross compiling, but rather that the
toolchain used defaulted to -march=i386, and i386 doesn't have the
CMPXCHG or XADD instructions used to implement the intrinsics.

So we reverted a patch that improved things so that we didn't break
compilation for a platform that never could have worked anyway.
2014-11-24 14:09:23 -08:00
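
As an illustration of what "atomic intrinsics" means here, the sketch below wraps the GCC/Clang __sync builtins instead of hand-written inline assembly. The wrapper names are hypothetical and this is not the util code itself. On x86 the compiler typically lowers these to LOCK XADD and LOCK CMPXCHG, which is why a target that lacks those instructions cannot build them inline.

```c
#include <stdint.h>

/* Atomically increment *v and return the new value. */
static inline int32_t atomic_inc_return(int32_t *v)
{
    return __sync_add_and_fetch(v, 1);
}

/* Atomically decrement *v and return the new value. */
static inline int32_t atomic_dec_return(int32_t *v)
{
    return __sync_sub_and_fetch(v, 1);
}

/* If *v == expected, store desired. Returns the value *v held before
 * the operation, whether or not the swap happened. */
static inline int32_t atomic_cmpxchg(int32_t *v, int32_t expected,
                                     int32_t desired)
{
    return __sync_val_compare_and_swap(v, expected, desired);
}
```
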
Matt Turner 99cebffda9 util: Implement assume() for clang.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
2014-11-24 14:09:23 -08:00
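
An assume() macro tells the optimizer that a condition always holds. The sketch below shows one plausible shape for a portable version, using clang's __builtin_assume where available and falling back to __builtin_unreachable on GCC. The macro name ASSUME and the fallback chain are assumptions for illustration, not the definition added by this commit.

```c
/* Compilers without __has_builtin get a stub that always answers "no". */
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

#if __has_builtin(__builtin_assume)
/* clang: hand the condition straight to the optimizer. */
#define ASSUME(expr) __builtin_assume(expr)
#elif defined(__GNUC__)
/* GCC: mark the failing branch as unreachable. */
#define ASSUME(expr) ((expr) ? (void)0 : __builtin_unreachable())
#else
/* Unknown compiler: no hint at all. */
#define ASSUME(expr) ((void)0)
#endif

/* Example use: with x known to be non-negative, the compiler may
 * implement the signed division as a plain shift. */
int div_by_eight(int x)
{
    ASSUME(x >= 0);
    return x / 8;
}
```

Calling div_by_eight with a negative argument would be undefined behavior, so the hint is only safe where the condition really is guaranteed.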
Matt Turner 56ac25918a i965: Don't overwrite the math function with conditional mod.
Ben was asking about the undocumented restriction that the math
instruction cannot use the dependency control hints. I went to reconfirm
and disabled the is_math() check in opt_set_dependency_control() and saw
that the disassembled math instructions with dependency hints had a
bogus math function. We were mistakenly overwriting it by setting an
empty conditional mod.

Unfortunately, this wasn't the cause of the aforementioned problem (I
reproduced it). This bug is benign, since we don't set dependency hints
on math instructions -- but maybe some day.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 14:07:32 -08:00
Matt Turner f5bef2d2e5 i965: Assert that math instructions don't have conditional mod.
The math function field is at the same location as conditional mod.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 14:06:39 -08:00
Matt Turner 803a744507 glsl: Remove unused ast copy constructors.
These were added in commits a760c738 and 43757135 to be used in
implementing C-style aggregate initializers (commit 1b0d6aef). Paul
rewrote that code in commit 0da1a2cc to use GLSL types, rather than
AST types, leaving these copy constructors unused.

Tested by making them private and providing no definition.
2014-11-24 14:06:39 -08:00
Matt Turner baff470823 glapi: Remove dead gl_offsets.py.
Dead since commit 07b85457.
2014-11-24 14:02:54 -08:00
Matt Turner 76ef547be7 glapi: Remove dead extension_helper.py.
Dead since commit 3d16088f.
2014-11-24 14:02:54 -08:00
Eric Anholt 52a7cb2ec4 vc4: Fix some inconsistent indentation.
2014-11-24 12:37:33 -08:00
Eric Anholt 6f4adb7483 vc4: Don't forget to actually connect the fence code.
I thought I'd tested this.
2014-11-24 12:37:33 -08:00
Eric Anholt fa74ec7e98 vc4: Add a note about a piece of errata I've learned about.
Right now in my environment I've only got a small CMA area, so this
constraint ends up holding.
2014-11-24 12:37:33 -08:00
Chris Forbes 2b4fe85f0e mesa: Fix Get(GL_TRANSPOSE_CURRENT_MATRIX_ARB) to transpose
This was just returning the same value as GL_CURRENT_MATRIX_ARB.
Spotted while investigating something else in apitrace.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.3 10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 21:55:47 +13:00
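
The difference between the two queries in the commit above is a plain 4x4 transpose of the 16 returned floats. A minimal sketch, with a hypothetical function name and not taken from the Mesa source:

```c
/* Write the transpose of the 4x4 matrix src into dst. Both are flat
 * arrays of 16 floats; dst must not alias src. */
void transpose_4x4(float dst[16], const float src[16])
{
    for (int i = 0; i < 4; i++)
        for (int j = 0; j < 4; j++)
            dst[4 * i + j] = src[4 * j + i];
}
```

Applied to the matrix as stored for GL_CURRENT_MATRIX_ARB, this would give the layout expected from the transposed query.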
Chris Forbes 129178893b glsl: Generate unique names for each const array lowered to uniforms
Uniform names (even for hidden uniforms) are required to be unique; some
parts of the compiler assume they can be looked up by name.

Fixes the piglit test: tests/spec/glsl-1.20/linker/array-initializers-1

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: "10.4" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-11-24 21:07:56 +13:00