Commit Graph

63977 Commits

Author SHA1 Message Date
Brian Paul 282b783a15 st/mesa: add some missing MESA/PIPE_FORMAT_R10G10B10A2_UNORM switch cases
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-09 15:06:46 -06:00
Carl Worth 0e12cd7954 glsl/glcpp: Don't choke on an empty pragma
The lexer was insisting that there be at least one character after "#pragma"
and before the end of the line. This caused an error for a line consisting
only of "#pragma" which volates at least the following sentence from the GLSL
ES Specification 3.00.4:

	The scope as well as the effect of the optimize and debug pragmas is
	implementation-dependent except that their use must not generate an
	error. [Page 12 (Page 28 of PDF)]

and likely the following sentence from that specification and also in
GLSLangSpec 4.30.6:

	If an implementation does not recognize the tokens following #pragma,
	then it will ignore that pragma.

Add a "make check" test to ensure no future regressions.

This change fixes at least part of the following Khronos GLES3 CTS test:

	preprocessor.pragmas.pragma_vertex

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-09 12:05:14 -07:00
Carl Worth 43047384c3 glsl/glcpp: Promote "extra token at end of directive" from warning to error
We've always warned about this case, but a recent confromance test expects
this to be an error that causes compilation to fail. Make it so.

Also add a "make check" test to ensure these errors are generated.

This fixes the following Khronos GLES3 conformance tests:

	invalid_conditionals.tokens_after_ifdef_vertex
	invalid_conditionals.tokens_after_ifdef_fragment
	invalid_conditionals.tokens_after_ifndef_vertex
	invalid_conditionals.tokens_after_ifndef_fragment

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-09 12:05:14 -07:00
Carl Worth dac3c986c5 glsl/glcpp: Once again report undefined macro name in error message.
While writing the previous commit message, I just felt bad documenting the
shortcoming of the change, (that undefined macro names would not be reported
in error messages).

Fix this by preserving the first-encounterd undefined macro name and reporting
that in any resulting error message.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-09 12:05:13 -07:00
Carl Worth ec6222ef01 glsl/glcpp: Add short-circuiting for || and && in #if/#elif for OpenGL ES.
The GLSL ES Specification 3.00.4 says:

	#if, #ifdef, #ifndef, #else, #elif, and #endif are defined to operate
        as for C++ except for the following:
	...
	• Undefined identifiers not consumed by the defined operator do not
	  default to '0'. Use of such identifiers causes an error.

	[Page 11 (page 127 of the PDF file)]

as well as:

	The semantics of applying operators in the preprocessor match those
	standard in the C++ preprocessor with the following exceptions:

	• The 2nd operand in a logical and ('&&') operation is evaluated if
	  and only if the 1st operand evaluates to non-zero.

	• The 2nd operand in a logical or ('||') operation is evaluated if
	  and only if the 1st operand evaluates to zero.

	If an operand is not evaluated, the presence of undefined identifiers
	in the operand will not cause an error.

(Note that neither of these deviations from C++ preprocessor behavior apply to
non-ES GLSL, at least as of specfication version 4.30.6).

The first portion of this, (generating an error for an undefined macro in an
(short-circuiting to squelch errors), was not implemented previously, but is
implemented in this commit.

A test is added for "make check" to ensure this behavior.

Note: The change as implemented does make the error message a bit less
precise, (it just states that an undefined macro was encountered, but not the
name of the macro).

This commit fixes the following Khronos GLES3 conformance test:

	undefined_identifiers.valid_undefined_identifier_1_vertex
	undefined_identifiers.valid_undefined_identifier_1_fragment
	undefined_identifiers.valid_undefined_identifier_2_vertex
	undefined_identifiers.valid_undefined_identifier_2_fragment

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-09 12:05:13 -07:00
Carl Worth 9794f8f245 glsl/glcpp: Fix glcpp to properly lex entire "preprocessing numbers"
The preprocessor defines a notions of a "preprocessing number" that
starts with either a digit or a decimal point, and continues with zero
or more of digits, decimal points, identifier characters, or the sign
symbols, ('-' and '+').

Prior to this change, preprocessing numbers were lexed as some
combination of OTHER and IDENTIFIER tokens. This had the problem of
causing undesired macro expansion in some cases.

We add tests to ensure that the undesired macro expansion does not
happen in cases such as:

	#define e +1
	#define xyz -2

	int n = 1e;
	int p = 1xyz;

In either case these macro definitions have no effect after this
change, so that the numeric literals, (whether valid or not), will be
passed on as-is from the preprocessor to the compiler proper.

This fixes the following Khronos GLES3 CTS tests:

	preprocessor.basic.correct_phases_vertex
	preprocessor.basic.correct_phases_fragment

v2. Thanks to Anuj Phogat for improving the original regular expression,
(which accepted a '+' or '-', where these are only allowed after one of
[eEpP]. I also expanded the test to exercise this.

v3. Also fixed regular expression to require at least one digit at the
beginning (after an optional period). Otherwise, a string such as ".xyz" was
getting sucked up as a preprocessing number, (where obviously this should be a
field access). Again, I expanded the test to exercise this.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-07-09 12:05:13 -07:00
Carl Worth 98c0e3c783 glsl/glcpp: Fix glcpp to catch garbage after #if 1 ... #else
Previously, a line such as:

	#else garbage

would flag an error if it followed "#if 0", but not if it followed "#if 1".

We fix this by setting a new bit of state (lexing_else) that allows the lexer
to defer switching to the <SKIP> start state until after the NEWLINE following
the #else directive.

A new test case is added for:

	#if 1
	#else garbage
	#endif

which was untested before, (and did not generate the desired error).

This fixes the following Khronos GLES3 CTS tests:

	tokens_after_else_vertex
        tokens_after_else_fragment

Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-07-09 12:05:13 -07:00
Carl Worth 1d862a0b39 glsl/glcpp: Fixup glcpp tests for redefining a macro with whitespace changes.
Previously, the test suite was expecting the compiler to allow a redefintion
of a macro with whitespace added, but gcc is more strict and allows only for
changes in the amounts of whitespace, (but insists that whitespace exist or
not in exactly the same places).

See: https://gcc.gnu.org/onlinedocs/cpp/Undefining-and-Redefining-Macros.html:

 These definitions are effectively the same:

      #define FOUR (2 + 2)
      #define FOUR         (2    +    2)
      #define FOUR (2 /* two */ + 2)

 but these are not:

      #define FOUR (2 + 2)
      #define FOUR ( 2+2 )
      #define FOUR (2 * 2)
      #define FOUR(score,and,seven,years,ago) (2 + 2)

This change adjusts the existing "redefine-macro-legitimate" test to work with
the more strict understanding, and adds a new "redefine-whitespace" test to
verify that changes in the position of whitespace are flagged as errors.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-07-09 12:05:13 -07:00
Anuj Phogat a6e9cd14ca glsl/glcpp: Fix preprocessor error condition for macro redefinition
This patch specifically fixes redefinition condition for white space
changes. #define and #undef functionality in GLSL follows the standard
for C++ preprocessors for macro definitions.

From https://gcc.gnu.org/onlinedocs/cpp/Undefining-and-Redefining-Macros.html:

These definitions are effectively the same:

     #define FOUR (2 + 2)
     #define FOUR         (2    +    2)
     #define FOUR (2 /* two */ + 2)

but these are not:

     #define FOUR (2 + 2)
     #define FOUR ( 2+2 )
     #define FOUR (2 * 2)
     #define FOUR(score,and,seven,years,ago) (2 + 2)

Fixes Khronos GLES3 CTS tests;
invalid_object_whitespace_vertex
invalid_object_whitespace_fragment

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
2014-07-09 12:05:13 -07:00
Carl Worth 1a46dd6edd glsl/glcpp: Add test to ensure compiler won't allow #undef for some builtins
Currently verifying that an #undef of __FILE__, __LINE__, or __VERSION__ will
generate an error.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
2014-07-09 12:05:13 -07:00
Anuj Phogat 64b7fc2dd1 glsl/glcpp: Do not allow undefining the built-in macros
Fixes piglit tests in spec/glsl-es-3.00/compile:
undef-__FILE__.vert
undef-GL_ES.vert
undef-__LINE__.vert
undef-__VERSION__.vert

Also, fixes Khronos GLES3 CTS tests:
undefine_invalid_object_1_vertex
undefine_invalid_object_1_fragment
undefine_invalid_object_2_vertex
undefine_invalid_object_2_fragment

Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Carl Worth <cworth@cworth.org>
2014-07-09 12:05:13 -07:00
Brian Paul 378fa34c7b gallium/u_blitter: fix some shader memory leaks
The _msaa shaders weren't getting freed.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-09 12:15:35 -06:00
Ilia Mirkin e924bb32f4 tgsi: properly parse indirect dimension references (e.g. for UBOs)
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-07-09 12:40:07 -04:00
Christian König c8011c1885 radeonsi: fix order of r600_need_dma_space and r600_context_bo_reloc
Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-09 15:08:22 +02:00
Brian Paul d10204930f st/mesa: fix geometry shader memory leak
Spotted by Charmaine Lee.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2014-07-09 06:43:26 -06:00
Brian Paul 176b64b811 mesa: fix geometry shader memory leaks
Spotted by Charmaine Lee.
Cc: "10.2" <mesa-stable@lists.freedesktop.org>

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-09 06:43:26 -06:00
Brian Paul 971122a9c0 st/mesa: minor simplification of some state atom assignments 2014-07-09 06:43:25 -06:00
Brian Paul 301ffe7b26 st/mesa: minor fix-up in st_GetSamplePosition()
If the driver doesn't implement get_sample_position(), let's return
some non-garbage values.
2014-07-09 06:43:25 -06:00
Brian Paul 91affc8b32 mesa: use float to silence MSVC warning in _mesa_GetMultisamplefv() 2014-07-09 06:43:25 -06:00
Samuel Pitoiset 50bbe49c33 nvc0: allocate more space before a counter is configured
On nvc0, a counter can have up to 6 sources instead of only one
for nve4+. This fixes a crash when a counter uses more than
one source.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-08 19:41:00 -04:00
Tobias Klausmann a9b21015f5 nv50/ir: use unordered_set instead of list to keep track of var uses
The set of variable uses does not need to be ordered in any way, and
removing/adding elements is a fairly common operation in various
optimization passes.

This shortens runtime of piglit test fp-long-alu to ~22s from ~4h

Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-08 19:41:00 -04:00
Kenneth Graunke 503391b46f i965/disasm: Fix disassembly of the any16h/all16h predicates.
BRW_PREDICATE_ALIGN1_ANY16H was incorrectly being disassembled as
"all16h", and ALL16H would probably print as "(null)".

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-08 12:31:01 -07:00
Kenneth Graunke e13a6406c3 glsl: Fix the foreach_in_list_reverse macro.
We clearly don't want to start at the head and walk backwards; we want
to start at the last real element before the tail sentinel.  If the list
is empty, tail_pred will be the head sentinel, and we'll stop.

Nothing uses this function, so I guess nobody noticed it was broken.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-08 12:31:01 -07:00
Marek Olšák be536efe20 radeonsi: mark MSAA config state as dirty at the beginning of CS
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81020

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2014-07-08 20:46:23 +02:00
Marek Olšák fe6be9926f gallium: fix u_default_transfer_inline_write for textures
This doesn't fix any known issue. In fact, radeon drivers ignore all
the discard flags for textures and implicitly do "discard range"
for any write transfer.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2014-07-08 20:46:23 +02:00
Matt Turner cf430408c4 i965: Remove artificial dependency between math instructions.
... on Gen6+. I'm not actually sure which class Gen6 fits into.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-08 11:12:02 -07:00
Matt Turner 099cbc1477 i965/fs: Track dependencies in instruction scheduling per reg offset.
Previously instruction scheduling tracked dependencies on a per-register
basis. This meant that there was an artificial dependency between
interpolation instructions writing into the same virtual register.

Instruction scheduling would insert a number of instructions between the
two instructions in this example, when they are actually independent.

   linterp vgrf8+0.0:F, hw_reg2:F, hw_reg3:F, hw_reg6:F
   linterp vgrf8+1.0:F, hw_reg2:F, hw_reg3:F, hw_reg6+16:F

This lead to cases where the first texture coordinate is interpolated at
the beginning of the shader, but the second is done immediately before
the texture operation that uses it as a source.

After this change, the artificial dependency is removed and the
interpolation instructions are scheduled together.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-08 11:12:02 -07:00
Jon TURNEY 7a641dd58d configure: Don't special case Cygwin to use gnu99, define _XOPEN_SOURCE instead
Revert "build: Build on Cygwin with gnu99 instead of c99." and define
_XOPEN_SOURCE appropriately.

This reverts commit 53e36d333c.

Since Cygwin 1.7.18 (April 2013), it's headers correctly prototype strtoll()
when using -std=c99, and correctly prototype strdup() when _XOPEN_SOURCE is
defined appropriately, so this workaround is no longer needed.

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Cc: Vinson Lee <vlee@freedesktop.org>
2014-07-08 14:25:21 +01:00
Chia-I Wu 8ff16111ee ilo: fix fence reference counting
The old code was complicated, and was wrong when *ptr is NULL.
2014-07-08 15:00:36 +08:00
Kristian Høgsberg bbefb15e01 i965: Extend compute-to-mrf pass to understand blocks of MOVs
The current compute-to-mrf pass doesn't handle blocks of MOVs.  Shaders
that end with a texture fetch follwed by an fb write are left like this:

0x00000000: pln(8)          g6<1>F          g4<0,1,0>F      g2<8,8,1>F      { align1 WE_normal 1Q compacted };
0x00000008: pln(8)          g7<1>F          g4.4<0,1,0>F    g2<8,8,1>F      { align1 WE_normal 1Q compacted };
0x00000010: send(8)         g2<1>UW         g6<8,8,1>F
                            sampler (1, 0, 0, 1) mlen 2 rlen 4              { align1 WE_normal 1Q };
0x00000020: mov(8)          g113<1>F        g2<8,8,1>F                      { align1 WE_normal 1Q compacted };
0x00000028: mov(8)          g114<1>F        g3<8,8,1>F                      { align1 WE_normal 1Q compacted };
0x00000030: mov(8)          g115<1>F        g4<8,8,1>F                      { align1 WE_normal 1Q compacted };
0x00000038: mov(8)          g116<1>F        g5<8,8,1>F                      { align1 WE_normal 1Q compacted };
0x00000040: sendc(8)        null            g113<8,8,1>F
                            render ( RT write, 0, 4, 12) mlen 4 rlen 0      { align1 WE_normal 1Q EOT };

This patch lets compute-to-mrf recognize blocks of MOVs and match them to
instructions (typically SEND) that writes multiple registers.  With this,
the above shader becomes:

0x00000000: pln(8)          g6<1>F          g4<0,1,0>F      g2<8,8,1>F      { align1 WE_normal 1Q compacted };
0x00000008: pln(8)          g7<1>F          g4.4<0,1,0>F    g2<8,8,1>F      { align1 WE_normal 1Q compacted };
0x00000010: send(8)         g113<1>UW       g6<8,8,1>F
                            sampler (1, 0, 0, 1) mlen 2 rlen 4              { align1 WE_normal 1Q };
0x00000020: sendc(8)        null            g113<8,8,1>F
                            render ( RT write, 0, 20, 12) mlen 4 rlen 0     { align1 WE_normal 1Q EOT };

which is the bulk of the shader db results:

total instructions in shared programs: 987040 -> 986720 (-0.03%)
instructions in affected programs:     844 -> 524 (-37.91%)
GAINED:                                0
LOST:                                  0

The optimization also applies to MRT shaders that write the same
color value to multiple RTs, in which case we can eliminate four MOVs in
a similar fashion.  See fbo-drawbuffers2-blend in piglit for an example.

No measurable performance impact.  No piglit regressions.

Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
2014-07-07 23:39:40 -07:00
Ilia Mirkin 8aa34dc9cb nvc0/ir: fill offset in properly for TXD
Apparently TXD wants its offset differently than TEX, accepting it in
the upper bits of the layer index. Unclear what happens when this is
combined with indirect sampler indexing.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-08 00:14:33 -04:00
Ilia Mirkin 114d46829d nvc0/ir: use manual TXD when offsets are involved
Something about how we're implementing offsets for TXD is wrong, just
flip to the generic quadop-based implementation in that case.

This is the minimal fix appropriate for backporting.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
2014-07-08 00:14:33 -04:00
Ilia Mirkin afea9bae67 nvc0/ir: do quadops on the right texture coordinates for TXD
handleTEX moves the layer as the first argument. This makes sure that
the quadops deal with the texture coordinates.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
2014-07-08 00:14:33 -04:00
Ilia Mirkin 1065aa92f4 nv50/ir: ignore bias for samplerCubeShadow on nv50
Unfortunately there's no good way to do this on the nv50 shader isa.
Dropping the bias seems preferable to doing the compare post-filtering.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
2014-07-08 00:14:33 -04:00
Ilia Mirkin 30d91e0eec nv50/ir: retrieve shadow compare from first arg
This can only happen with texture(samplerCubeShadow, bias), where the
compare will be in the first argument.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: <mesa-stable@lists.freedesktop.org>
2014-07-08 00:14:33 -04:00
Carl Worth 9007c4f9f4 docs: Import 10.2.3 release notes
And add a news item.
2014-07-07 16:28:37 -07:00
Matt Turner f6db414f3c i965/fs: Disable unlit_centroid_workaround on Haswell.
Although the HSW PRM shows it, the BSpec lists this workaround as being
for Ivybridge only.

total instructions in shared programs: 1994951 -> 1993675 (-0.06%)
instructions in affected programs:     27325 -> 26049 (-4.67%)
2014-07-06 18:19:17 -07:00
Matt Turner 6f7c4a8d05 i965/vec4: Perform CSE on CMP(N) instructions.
Port of commit b16b3c87 to the vec4 code.

No shader-db improvements, but might as well. The fs backend saw an
improvement because it's scalar and multiple identical CMP instructions
were generated by the SEL peepholes.
2014-07-06 18:19:15 -07:00
Matt Turner 7921bf0062 i965/vec4: Don't emit null MOVs in CSE.
Port of commit 219b43c6 to the vec4 code.
2014-07-06 18:18:52 -07:00
Matt Turner 949991cc99 i965/vec4: Improve CSE performance by expiring some available expressions.
Port of commit 5daf867f to the vec4 code.
2014-07-06 18:18:52 -07:00
Kenneth Graunke 3c8dc48ad1 i965/vec4: Add basic common subexpression elimination.
[mattst88]: Modified to perform CSE on instructions with
            the same writemask. Offered no improvement before.

total instructions in shared programs: 1995633 -> 1995185 (-0.02%)
instructions in affected programs:     14410 -> 13962 (-3.11%)

Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-06 18:18:51 -07:00
Matt Turner 848fc7f710 i965: Fix warnings introduced in commit e24ef5ab.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-06 18:15:36 -07:00
Christian König 042b061fef gallium/radeon: use PRIX64 instead of PRIu64
We want hex values here, not decimals.

Signed-off-by: Christian König <christian.koenig@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-07-06 13:28:04 +02:00
Matt Turner 1580865a8c i965: Move assembly annotation functions to intel_asm_annotation.c.
It's C. Compile it as such.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-05 22:42:30 -07:00
Matt Turner 423932791d i965: Rename intel_asm_printer -> intel_asm_annotation.
The #ifndef include guards already said the right thing :)

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-05 22:42:30 -07:00
Matt Turner 6d3e24a5c2 i965: Make backend_instruction usable from C.
With a hack to place an exec_node in the struct in C to be at the same
location as the inherited exec_node in C++.

Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-05 22:42:30 -07:00
Matt Turner 0db30fcf89 i965/cfg: Make cfg_t usable from C.
Acked-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-05 22:42:30 -07:00
Matt Turner 857c06236c i965: Repack backend_instruction struct.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-05 22:42:30 -07:00
Matt Turner ce706b4a9b i965: Make a brw_predicate enum.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-05 22:42:30 -07:00
Matt Turner 46e5b2a497 i965: Make a brw_conditional_mod enum.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
2014-07-05 22:42:30 -07:00