Commit Graph

63965 Commits

Author SHA1 Message Date
Kenneth Graunke 9e47ed2f77 glsl: Make the tree rebalancer use vector_elements, not components().
components() includes matrix columns, so if this code encountered a
matrix, it would ask for something like a vec9 or vec16.  This is
clearly not what you want.

Earlier code now prevents this from seeing matrices, but we should still
use vector_elements, for clarity.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-16 15:43:13 -07:00
Kenneth Graunke 7db75927ca glsl: Guard against error_type in the tree rebalancer.
This helped me track down the bug fixed in the previous commit.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-16 15:43:13 -07:00
Kenneth Graunke 9697f8088f glsl: Make the tree rebalancer bail on matrix operands.
It doesn't handle things like (vector * matrix) correctly, and
apparently Matt's intention was to bail.

Fixes shader compilation in Natural Selection 2.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-16 15:43:13 -07:00
Kenneth Graunke 99f8ea295f Revert "i965: Implement GL_PRIMITIVES_GENERATED with non-zero streams."
This reverts commit 3178d2474a.

This caused GPU hangs on Ivybridge for some users and huge (80%)
performance regressions across the board on multiple platforms.

We need to find a better solution.  I've made several attempts, but none
of them have worked yet.  In the meantime, we should revert this.

Reverting it breaks GL_PRIMITIVES_GENERATED for non-zero streams, but
that's okay, since we don't expose GL_ARB_gpu_shader5 yet.

Fixes Piglit's EXT_transform_feedback/generatemipmap prims_generated
test case on Haswell.
2014-07-16 14:19:29 -07:00
Chia-I Wu 1661f7559b ilo: add some missing formats
Map more pipe formats to hardware formats.  Enable more VB formats on Haswell.
2014-07-16 14:31:59 +08:00
Chia-I Wu 69cd3ebd6f ilo: update and tailor the surface format table
Recreate the table from scratch with the help of a pdf-table-to-csv converter.
Switch to a form that is more suitable for ilo.
2014-07-16 14:31:59 +08:00
Kenneth Graunke a2de656278 i965: Don't copy propagate abs into Broadwell logic instructions.
It's not clear what abs on logical instructions means on Broadwell, and
it doesn't appear to do anything sensible.

Fixes 270 Piglit tests (the bitand/bitor/bitxor tests with abs).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81157
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
2014-07-15 22:12:15 -07:00
Kenneth Graunke cf1b5eee7f i965/fs: Use WE_all for gl_SampleID header register munging.
This code should execute without regard to the currently executing
channels.  Asking for gl_SampleID inside control flow might break in
strange ways.  It appears to break even at the top of the program in
SIMD16 mode occasionally as well.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2014-07-15 22:10:10 -07:00
Kenneth Graunke e5adc560cc i965/fs: Set force_uncompressed and force_sechalf on samplepos setup.
gen8_fs_generator uses these to decide whether to set the execution size
to 8 or 16, so we incorrectly made both of these MOVs the full width in
SIMD16 shaders.  (It happened to work out on Gen4-7.)

Setting them should also help inform optimization passes what's really
going on, which could help avoid bugs.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2014-07-15 22:10:06 -07:00
Kenneth Graunke 2eaf3f670f i965: Set execution size to 8 for instructions with force_sechalf set.
Both inst->force_uncompressed and inst->force_sechalf mean that the
generated instruction should be uncompressed and have an execution size
of 8.  We don't require the visitor to set both flags - setting
inst->force_sechalf by itself is supposed to be enough.

On Gen4-7, guess_execution_size() demoted instructions to 8-wide based
on the default compression state.  On Gen8+, we instead set a default
execution size, which worked great...except that we forgot to check
inst->force_sechalf when deciding whether to use 8 or 16.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Chris Forbes <chrisf@ijw.co.nz>
Cc: mesa-stable@lists.freedesktop.org
2014-07-15 22:09:49 -07:00
Christoph Bumiller 4198711006 nvc0: fix translate path for PRIM_RESTART_WITH_DRAW_ARRAYS
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-15 17:57:45 -04:00
Christoph Bumiller a284a0afa2 nvc0: add support for indirect drawing
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-15 17:57:45 -04:00
Ilia Mirkin bbc4a7bd31 nouveau: check if a fence has already been signalled
nouveau_fence_update does real work unconditionally. Avoid doing that if
the fence we're checking on has already been signalled.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-15 17:57:45 -04:00
Matt Turner c11096c749 glsl: Don't declare variables in for-loop declaration.
Reported-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
2014-07-15 12:17:48 -07:00
Connor Abbott 58270c2fac exec_list: Make various places use the new length() method.
Instead of hand-rolling it.

v2 [mattst88]: Rename get_size to length. Expand comment in ir_reader.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
2014-07-15 11:16:16 -07:00
Connor Abbott 7b0f69225a exec_list: Add a function to give the length of a list.
v2 [mattst88]: Remove trailing whitespace. Rename get_size to length.
               Mark as const.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> [v1]
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
2014-07-15 11:16:16 -07:00
Connor Abbott 28c4fd4bc6 exec_list: Add a prepend function.
This complements the existing append function. It's implemented in a
rather simple way right now; it could be changed if performance is a
concern.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Connor Abbott <connor.abbott@intel.com>
2014-07-15 11:16:16 -07:00
Ian Romanick 9a723b970e mesa: Don't allow GL_TEXTURE_{LUMINANCE,INTENSITY}_* queries outside compat profile
There are no queries for GL_TEXTURE_LUMINANCE_SIZE,
GL_TEXTURE_INTENSITY_SIZE, GL_TEXTURE_LUMINANCE_TYPE, or
GL_TEXTURE_INTENSITY_TYPE in any version of OpenGL ES or desktop OpenGL
core profile.

NOTE: Without changes to piglit, this regresses
required-sized-texture-formats.

v2: Rebase on different initial change.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2 <mesa-stable@lists.freedesktop.org>
2014-07-15 10:46:33 -07:00
Ian Romanick 750286600b mesa: Don't allow GL_TEXTURE_BORDER queries outside compat profile
There are no texture borders in any version of OpenGL ES or desktop
OpenGL core profile.

Fixes piglit's gl-3.2-texture-border-deprecated.

v2: Rebase on different initial change.

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Cc: "10.2 <mesa-stable@lists.freedesktop.org>
2014-07-15 10:46:33 -07:00
Ian Romanick ee58c71a65 mesa: Handle uninitialized textures like other textures in get_tex_level_parameter_image
Instead of catching the special case early, handle it by constructing a
fake gl_texture_image that will cause the values required by the OpenGL
4.0 spec to be returned.

Previously, calling

    glGenTextures(1, &t);
    glBindTexture(GL_TEXTURE_2D, t);
    glGetTexLevelParameteriv(GL_TEXTURE_2D, 0, 0xDEADBEEF, &value);

would not generate an error.

Anuj: Can you verify this does not regress proxy_textures_invalid_size?

Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Suggested-by: Brian Paul <brianp@vmware.com>
Cc: "10.2" <mesa-stable@lists.freedesktop.org>
Cc: Anuj Phogat <anuj.phogat@gmail.com>
2014-07-15 10:46:33 -07:00
Matt Turner 83214edf8a i965/fs: Relax interference check in register coalescing.
A similar attempt was made in commit 5ff1e446 and was reverted in commit
a39428cf after causing a regression in an ES 3 conformance test. The
test still passes after this commit.

total instructions in shared programs: 1994827 -> 1992858 (-0.10%)
instructions in affected programs:     128247 -> 126278 (-1.54%)
GAINED:                                0
LOST:                                  1

Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-15 10:12:29 -07:00
Matt Turner 1d97212007 i965/fs: Perform CSE on sends-from-GRF rather than textures.
Should potentially allow a few more cases, while avoiding doing CSE on
texture operations on Gen <= 6 with the MRF.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80211
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: lu hua <huax.lu@intel.com>
2014-07-15 10:12:29 -07:00
Matt Turner 103716a862 glsl: Update expression types after rebalancing the tree.
If we saw a tree that looked like

            vec3
           /   \
         vec3 float
        /   \
      vec3 float
     /   \
   vec3 float

We would see that all of the expression types were vec3, and then
rebalance to

           vec3
        /        \
      vec3       vec3 <-- should be float
     /   \      /    \
   vec3 float float float

This patch adds code to visit the rebalanced tree and update the
expression types from the bottom up.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80880
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-15 10:12:29 -07:00
Matt Turner 7b962a4e6b glsl: Add callback_leave to ir_hierarchical_visitor. 2014-07-15 10:12:29 -07:00
Matt Turner 76caaedd7e i965: Initialize new chunks of realloc'd memory.
Otherwise we'd compare uninitialized pointers with NULL and dereference,
leading to crashes.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-15 10:12:29 -07:00
Tom Stellard 0d711e719e radeon/llvm: Fix LLVM diagnostic error reporting
We were trying to print the error message after disposing the
message object.

Tested-by and Reviewed-by: Aaron Watry <awatry@gmail.com>
2014-07-15 11:55:26 -04:00
José Fonseca 20b431fd9e util/tgsi: Fix ureg_EMIT/ENDPRIM prototype.
0cbefc1bea added a source argument to
EMIT/ENDPRIM, but it did not update tgsi_ureg accordingly, causing all
users of ureg_EMIT/ENDPRIM to fail at runtime with an assertion failure.

Trivial.
2014-07-15 14:56:31 +01:00
Vinson Lee e945a19b35 glapi: Use GetProcAddress instead of dlsym on Windows.
This patch fixes this MinGW build error.

glapi_gentable.c: In function '_glapi_create_table_from_handle':
glapi_gentable.c:123:9: error: implicit declaration of function 'dlsym' [-Werror=implicit-function-declaration]
         *procp = dlsym(handle, symboln);
         ^

Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Acked-by: Brian Paul <brianp@vmware.com>
2014-07-14 22:21:10 -07:00
Chia-I Wu c25fe88ebf ilo: raise texture size limits
Report the hardware limits now that max-texture-size piglit test has been
fixed.
2014-07-15 12:00:15 +08:00
Chia-I Wu 81d7f33e30 ilo: move away from drm_intel_bo_alloc_tiled
We want to know the exact sizes of the BOs, and the driver has the knowledge
to do so.  Refactoring of the resource allocation code is needed though.
2014-07-15 12:00:10 +08:00
Marek Olšák d859bdb4b5 radeonsi: partially revert "switch descriptors to i32 vectors"
It indeed breaks LLVM 3.4.2.
2014-07-14 21:40:19 +02:00
Matt Turner 130c99ca15 i965/vec4: Invalidate live intervals in opt_cse, not _local.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-14 11:27:52 -07:00
Matt Turner aba15d93a6 i965/vec4: Move aeb list into opt_cse_local.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-14 11:27:52 -07:00
Matt Turner 1ca6b5d2e8 i965/fs: Invalidate live intervals in opt_cse, not _local.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-14 11:27:52 -07:00
Matt Turner bdbaa9ab5b i965/fs: Move aeb list into opt_cse_local.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-14 11:27:52 -07:00
Cody Northrop 0f679f0ab5 glsl: Fix aggregates with dynamic initializers.
Vectors are falling in to the ir_dereference_array() path.

Without this change, the following glsl aborts the debug driver,
or gets the wrong answer in release:

mat2x2 a = mat2( vec2( 1.0, vertex.x ), vec2( 0.0, 1.0 ) );

Also submitting piglit tests, will reference in bug.

v2: Rebase on Mesa master.

v3: Remove unneeded check for arrays, which are covered by
    process_array_constructor(), recommended by Timothy Arceri.

Signed-off-by: Cody Northrop <cody@lunarg.com>
Reviewed-by: Courtney Goeltzenleuchter <courtney@lunarg.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79373
2014-07-14 08:36:36 -07:00
Jon TURNEY 923f78440c Avoid mesa_dri_drivers import lib being installed
On Cygwin and MinGW, linking a shared library also generates an import library

Use a wildcard which also matches the name of the megadriver import lib,
mesa_dri_drivers.dll.a, so that is also removed after megadriver symlinks are
created

(This then matches src/gallium/targets/dri/Makefile.am, which already does
things this way)

Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
2014-07-13 16:06:46 +01:00
Chris Forbes 5899a45a5b i965/vec4: Silence warnings about unhandled interpolation ops
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-07-13 11:13:23 +12:00
Chris Forbes 1e4068ca45 docs: Mark off ARB_gpu_shader5 interpolation functions for i965
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
2014-07-13 10:04:25 +12:00
Chris Forbes 9c0bddf735 i965/fs: add support for ir_*_interpolate_at_* expressions
SIMD8-only for now.

V5: - Fix style complaints
    - Move prototype to be with other oddball emit functions
    - Use unreachable() instead of assert() where possible

V6: - Describe what is happening with the clamping
    - Add reg_width to make some expressions clearer

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-13 10:01:24 +12:00
Chris Forbes 5ed147c26f i965/fs: Skip channel expressions splitting for interpolation
The backend will have to do a message send, so we want to keep these in
one piece, just like texture ops.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-13 10:01:22 +12:00
Chris Forbes 6e91f2df95 i965/fs: add generator support for pixel interpolator query
V5: - Split into separate opcodes
    - Pass message data in src1 immediate
    - Put noperspective bit in fs_inst rather than adding any junk to
      backend_instruction

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-13 10:01:18 +12:00
Chris Forbes d732598b63 i965: add low-level support for send to pixel interpolator
Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-13 10:01:17 +12:00
Chris Forbes 0b0572a2ad i965/disasm: add support for pixel interpolator messages
V3: Rework for brw_inst changes

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-13 10:01:16 +12:00
Chris Forbes 1b6163bdf5 i965: Add message descriptor bit definitions for pixel interpolator
These got lost in the big brw_inst shakeup.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2014-07-13 10:01:13 +12:00
Chris Forbes f55e9a7c75 i965/disasm: Disassemble indirect sends more properly
- Don't try to disassemble send's src1 as a descriptor if it's not an
  immediate.

- In the same case, show src1 as an operand (makes it easier to see
  bogus register regions, etc -- the hardware is very fussy)

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-12 11:29:17 +12:00
Chris Forbes 1854ead64c i965: Avoid crashing while dumping vec4 insn operands
We'd otherwise go looking into virtual_grf_sizes for things that aren't
in there at all.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-12 11:29:17 +12:00
Chris Forbes 1499619fe6 i965: Fix two broken asserts in brw_eu_emit
These were looking in the wrong field.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2014-07-12 11:29:09 +12:00
Chris Forbes b45d417108 glsl: add new interpolateAt* builtin functions
V2: - Don't assume everyone wants interpolateAtSample() lowered to
      interpolateAtOffset. It turns out this isn't what we want most
      of the time for i965. Lowering can be added later in an ir pass
      which drivers opt into, rather than bolting it straight into the
      builtin definition.
    - Only expose the interpolateAt* builtins in the fragment language.

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-12 11:20:02 +12:00
Chris Forbes 1d5b06664f glsl: add new expression types for interpolateAt*
Will be used to implement interpolateAt*() from ARB_gpu_shader5

Signed-off-by: Chris Forbes <chrisf@ijw.co.nz>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2014-07-12 11:20:00 +12:00