68129 Commits

Author SHA1 Message Date
Chia-I Wu e8455128aa ilo: update ilo_dsa_state and related functions for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu 9aeee99e4d ilo: update multisample related states for Gen8 2015-02-12 07:56:12 +08:00
Chia-I Wu 6366fbc1a8 ilo: update WM and PS related functions for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu 584d3369b6 ilo: update SBE related functions for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu 4cb592ec17 ilo: update SF related functions for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu 05e2eb57cd ilo: update CLIP related functions for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu 9ab0165375 ilo: update SF_CLIP_VIEWPORT for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu b64aeebbcc ilo: update streamout related functions for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu 6f77bd3bdc ilo: update 3DSTATE_{DS,HS,GS} for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu 3be0504399 ilo: update 3DSTATE_CONSTANT_x for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu 49306afe7b ilo: update 3DSTATE_URB_x for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu d43ae05d76 ilo: update 3DSTATE_PUSH_CONSTANT_ALLOC_x for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu f43332ca2f ilo: update render engine common helpers for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu 8d9f69bef2 ilo: update BLT helpers for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu 574f8d0229 ilo: update MI helpers for Gen8 2015-02-12 07:56:11 +08:00
Chia-I Wu bfc8a72609 ilo: add functions for Gen8 relocs
Extend ilo_builder_writer_reloc() for Gen8 memory addressing.  Add new
wrappers, ilo_builder_surface_reloc64() and ilo_builder_batch_reloc64().
2015-02-12 07:56:11 +08:00
Chia-I Wu a7911620f6 ilo: update the toy compiler for Gen8
Based on what we know from the classic driver.
2015-02-12 07:56:11 +08:00
Chia-I Wu 0066c22c40 ilo: update genhw headers
Accumulated changes for various renames and additions, including Gen8
definitions.  Some of the dynamic state __SIZE no longer means the size of an
element, but the size of an array of elements.  The changes can be seen in
ilo_render_dynamic.c.
2015-02-12 07:56:10 +08:00
Chia-I Wu 5933d84ad6 ilo: clean up ilo_gpe_init_dsa()
Add dsa_get_stencil_enable_gen6(), dsa_get_depth_enable_gen6(), and
dsa_get_alpha_enable_gen6() to be called from ilo_gpe_init_dsa().
2015-02-12 07:56:10 +08:00
Chia-I Wu aa354b92d2 ilo: clean up ilo_gpe_init_blend()
Make ilo_blend_state more space efficient and forward-looking.
2015-02-12 07:56:10 +08:00
Chia-I Wu 1d07055b50 ilo: clean up sample patterns
Use signed int for sample positions and add helpers to access them.  Call them
patterns instead of positions.
2015-02-12 07:56:10 +08:00
Matt Turner 69ad5fd4ce glsl: Optimize (f2i(trunc x)) into (f2i x).
total instructions in shared programs: 5950326 -> 5949286 (-0.02%)
instructions in affected programs:     88264 -> 87224 (-1.18%)
helped:                                692
2015-02-11 13:50:19 -08:00
Matt Turner c262b2b582 glsl: Optimize round-half-up pattern.
Hurts some Psychonauts shaders, but after the next patch (which this
enables) they're fewer instructions than before this patch.
2015-02-11 13:50:19 -08:00
Matt Turner a5455ab1ca glsl: Add trunc() to ir_builder. 2015-02-11 13:50:19 -08:00
Matt Turner d91390634f i965: Add LINTERP/CINTERP to can_do_cmod().
LINTERP is implemented as a PLN instruction or a LINE+MAC. PLN and MAC
can do conditional mod. CINTERP is just a MOV.

total instructions in shared programs: 5952103 -> 5950284 (-0.03%)
instructions in affected programs:     324573 -> 322754 (-0.56%)
helped:                                1819

We lose the SIMD16 in one Unigine Heaven shader which appears six times
in shader-db.
2015-02-11 13:50:19 -08:00
Matt Turner 245c7848fc program: Remove _mesa_nop_vertex_program/_mesa_nop_fragment_program.
Dead since

   commit 284ce20901
   Author: Eric Anholt <eric@anholt.net>
   Date:   Fri Aug 20 10:52:14 2010 -0700

       Remove remnants of the old glsl compiler.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-11 13:50:19 -08:00
Matt Turner 4c42e1116b nir: Recognize open-coded fmin/fmax.
And unfortunately other shaders do the same thing but with >=/<= which
we can't apply this optimization to because of NaNs.

instructions in affected programs:     23309 -> 22938 (-1.59%)

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-11 13:50:19 -08:00
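
The NaN caveat in the fmin/fmax commit above can be illustrated with two open-coded minimums that agree on ordinary numbers but not when an operand is NaN. This is only a sketch of why the comparison direction matters; the helper names are hypothetical and the code does not claim to mirror the exact patterns the NIR pass matches:

```c
#include <math.h>

/* Open-coded minimum written with a strict comparison. */
static float min_lt(float a, float b)
{
    return a < b ? a : b;
}

/* Looks equivalent, but uses >= with the result arms swapped. */
static float min_ge(float a, float b)
{
    return a >= b ? b : a;
}
```

Every comparison involving NaN is false, so min_lt(NAN, 1.0f) falls through to b and yields 1.0f, while min_ge(NAN, 1.0f) falls through to a and yields NaN. A rewrite that treats both forms as the same operation would therefore change results for NaN inputs.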
Eric Anholt 56e21647e2 nir: Add algebraic opt for int comparisons with identical operands.
No change on shader-db on i965.

v2: Reword the comment due to feedback from Erik Faye-Lund

Reviewed-by: Connor Abbott <cwabbott0@gmail.com> (v1)
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com> (v1)
2015-02-11 11:52:38 -08:00
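
For integers, comparing a value with itself has a result that does not depend on the value, which is what makes the algebraic optimization above safe. A small sketch of the idea, using a hypothetical cmp_op enum rather than the real NIR opcodes:

```c
#include <stdbool.h>

/* Hypothetical comparison kinds; these are not NIR opcode names. */
typedef enum { CMP_EQ, CMP_NE, CMP_LT, CMP_GE } cmp_op;

/* Evaluate an integer comparison the ordinary way. */
static bool compare(cmp_op op, int a, int b)
{
    switch (op) {
    case CMP_EQ: return a == b;
    case CMP_NE: return a != b;
    case CMP_LT: return a <  b;
    case CMP_GE: return a >= b;
    }
    return false;
}

/* Fold `a op a` to a constant without looking at a. */
static bool fold_self_compare(cmp_op op)
{
    return op == CMP_EQ || op == CMP_GE;
}
```

The same folding is not valid for floating-point comparisons, because NaN == NaN is false.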
Eric Anholt 2919bdf466 nir: Fix load_const comparisons for CSE.
We want the size of a float per component, not the size of a whole vec4.

NIR instructions on i965:
total instructions in shared programs: 1261937 -> 1261929 (-0.00%)
instructions in affected programs:     114 -> 106 (-7.02%)

Looking at one of these examples (tesseract), it's from vec4 load_consts
for a MRT solid fill, which do get CSEed now that we don't memcmp off the
end of the const value and into the SSA def.  For the 1-component loads
that are common in i965, we were only memcmping off into the rest of the
usually zero-filled const_value.

Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-11 11:52:38 -08:00
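
The load_const fix above comes down to how many bytes the equality check compares. The sketch below uses a simplified, hypothetical struct (not the real NIR instruction layout) to show a comparison sized by component count rather than by the whole vec4:

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for a constant-load instruction. */
struct load_const {
    unsigned num_components;  /* 1 to 4 */
    float value[4];           /* only the first num_components are meaningful */
};

static bool load_const_equal(const struct load_const *a,
                             const struct load_const *b)
{
    assert(a->num_components <= 4 && b->num_components <= 4);
    if (a->num_components != b->num_components)
        return false;

    /* One float per component; bytes past that are not part of the value. */
    return memcmp(a->value, b->value,
                  a->num_components * sizeof(float)) == 0;
}
```

With this sizing, two 2-component constants with matching leading values compare equal even if the unused trailing slots differ, which is the behavior CSE needs.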
Matt Turner 09d6ea9ae3 i965/fs: Remove conditional mod when optimizing a SEL into a MOV.
Missed in commit ca675b73, but got right in the companion commit 3c28b2c0.
2015-02-11 10:26:49 -08:00
Jeremy Huddleston Sequoia e68b67b53f darwin: build fix
xfont.c:237:14: error: implicit declaration of function 'GetGLXDRIDrawable' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
   glxdraw = GetGLXDRIDrawable(CC->currentDpy, CC->currentDrawable);
             ^
Fixes regression from 291be28476

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2015-02-10 22:22:33 -08:00
Jeremy Huddleston Sequoia 1c67a5687a darwin: build fix
../../../src/mesa/main/compiler.h:47:10: fatal error: 'util/macros.h' file not found

Signed-off-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
2015-02-10 20:35:10 -08:00
Matt Turner ea0f0eb6c0 glsl: Optimize 1/exp(x) into exp(-x).
Lots of shaders divide by exp2(...) which we turn into a multiplication
by the reciprocal. We can avoid the reciprocal by simply negating exp2's
argument.

total instructions in shared programs: 5947154 -> 5946695 (-0.01%)
instructions in affected programs:     118661 -> 118202 (-0.39%)
helped:                                380

Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2015-02-10 17:48:44 -08:00
Matt Turner a9065cef48 nir: Remove casts from void*.
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-10 17:48:42 -08:00
Matt Turner bb1e007157 nir: Replace assert(0) with unreachable().
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
2015-02-10 17:48:31 -08:00
Matt Turner 942b56ad05 nir: Remove unused has_indirect variable.
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-02-10 17:48:16 -08:00
Matt Turner fff0b2eab5 i965/vec4: Emit MADs from (x + abs(y * z)).
Same as commit 3654b6d4 to the fs backend.

total instructions in shared programs: 5945788 -> 5945787 (-0.00%)
instructions in affected programs:     36 -> 35 (-2.78%)
helped:                                1

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 17:48:15 -08:00
Matt Turner 3d581f9996 i965/vec4: Emit MADs from (x + -(y * z)).
Same as commit c4fab711 to the fs backend.

total instructions in shared programs: 5945998 -> 5945788 (-0.00%)
instructions in affected programs:     74665 -> 74455 (-0.28%)
helped:                                399
HURT:                                  180

It hurts some programs because we make no attempts in the vec4 backend
to avoid MADs if they have constant (or vector uniform) arguments.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 17:47:37 -08:00
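
The (x + -(y * z)) pattern above can be turned into a single multiply-add because the negation can move onto one of the multiplicands. The sketch below models MAD as a + b * c; the function names, and the choice of which operand carries the negation, are assumptions made for illustration rather than details of the vec4 backend:

```c
/* Hypothetical model of a MAD instruction: a + b * c. */
static float mad(float a, float b, float c)
{
    return a + b * c;
}

/* The original expression: an add of a negated multiply. */
static float add_neg_mul(float x, float y, float z)
{
    return x + -(y * z);
}

/* The same value as one MAD, with the negation folded into a source. */
static float as_mad(float x, float y, float z)
{
    return mad(x, -y, z);
}
```

For example, with x = 10, y = 2, z = 3 both forms evaluate to 4.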
Neil Roberts 5b29b2922a i965/skl: Implement WaDisable1DDepthStencil
Skylake+ doesn't support setting a depth buffer to a 1D surface but it
does allow pretending it's a 2D texture with a height of 1 instead.

This fixes the GL_DEPTH_COMPONENT_* tests of the copyteximage piglit
test (and also seems to avoid a subsequent GPU hang).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89037
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 18:00:21 +00:00
Francisco Jerez 1b224290fb i965/gen7-8: Implement glMemoryBarrier().
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 19:09:25 +02:00
Francisco Jerez 46b03d5400 i965: Generalize the update_null_renderbuffer_surface vtbl hook to non-renderbuffers.
Null surfaces are going to be useful to have something to point
unbound image units to, as the ARB_shader_image_load_store extension
requires us to behave deterministically in cases where some shader
tries to access an unbound image unit: Invalid stores and atomics are
supposed to be discarded and invalid loads are supposed to return
zero, which is precisely what the null surface does.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 19:09:25 +02:00
Francisco Jerez 342b7ce7d4 i965: Allocate binding table space for shader images.
v2: Bump the number of supported image uniforms to 32 (Ken).

Reviewed-by: Paul Berry <stereotype441@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 19:09:25 +02:00
Francisco Jerez 36a17f0f99 i965: Don't tile 1D miptrees.
It doesn't really improve locality of texture fetches; quite the
opposite, it's a waste of memory bandwidth and space due to tile
alignment.

v2: Check mt->logical_height0 instead of mt->target (Ken).  Add short
    comment explaining why they shouldn't be tiled.

Reviewed-by: Neil Roberts <neil@linux.intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-02-10 19:09:25 +02:00
Francisco Jerez b40bcd24e0 i965/vec4: Don't set any dependency control bits for F32TO16 on Gen8.
It's expanded to several instructions.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:25 +02:00
Francisco Jerez aef83957e1 i965: Handle negated unsigned immediate values in constant propagation.
Negation of UD/UW sources behaves the same as for D/W sources, taking
the two's complement of the source, except for bitwise logical
operations on Gen8 and up which take the one's complement.  Fixes
crash in a GLSL shader with subtraction of two unsigned values.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:25 +02:00
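
The two negation behaviors described in the constant propagation commit above differ by exactly one. A minimal sketch on 32-bit unsigned values, with hypothetical helper names that are not part of the i965 code:

```c
#include <stdint.h>

/* Arithmetic negation: two's complement, as for D/W sources. */
static uint32_t negate_twos_complement(uint32_t v)
{
    return 0u - v;
}

/* Negation on bitwise logical operations (Gen8 and up): one's complement. */
static uint32_t negate_ones_complement(uint32_t v)
{
    return ~v;
}
```

For v = 1 the first form gives 0xFFFFFFFF and the second gives 0xFFFFFFFE, so a propagated immediate has to be folded with the rule that matches the instruction using it.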
Francisco Jerez 64fde7b31c i965/vec4: Take into account non-zero reg_offset during register allocation.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:25 +02:00
Francisco Jerez 78e9043475 i965/vec4: Add register classes up to MAX_VGRF_SIZE.
In preparation for some send from GRF instructions that will require
larger payloads.

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:25 +02:00
Francisco Jerez 530445330b i965/vec4: Init mlen for several send from GRF instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:25 +02:00
Francisco Jerez 5f878d1b47 i965/vec4: Don't infer MRF dependencies for send from GRF instructions.
Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:24 +02:00
Francisco Jerez de666fc102 i965/vec4: Fix the scheduler to take into account reads and writes of multiple registers.
v2: Avoid nested ternary operators in vec4_instruction::regs_read(). (Matt)

Reviewed-by: Matt Turner <mattst88@gmail.com>
2015-02-10 19:09:24 +02:00