72341 Commits

Author SHA1 Message Date
Marek Olšák 16e5d8ad38 radeonsi: add IB parser support for CP DMA packets
If the packet encoding is defined in the same format as register definitions,
the python script can process them automatically and the parser support
becomes trivial.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák 2c14a6d3b1 radeonsi: add IB tracing support for debug contexts
This adds trace points to all IBs and the parser prints them and also
prints which trace points were reached (executed) by the CP.
This can help pinpoint a problematic packet, draw call, etc.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák 189953ee13 radeonsi: remove old CS tracing code
Some of it is left there and it will be re-used in the next commit.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák df6a5666b6 radeonsi: parse and dump status registers on GPU hang
GPU hang detection must be enabled by setting: GALLIUM_DDEBUG=[timeout in ms]

This may print too much information that we might not understand yet,
but some of the bits are very useful.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák 61df4f0cd3 radeonsi: add an IB parser
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák be6dc87776 radeonsi: save the contents of indirect buffers for debug contexts
This will be used by the IB parser.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák a6a6c68955 radeonsi: generate register and packet tables for an IB parser from sid.h
This makes writing a good IB parser a lot easier.

It generates 2 tables:
- packet3 table
- register table with all registers, fields, and named values

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
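To illustrate the two generated tables described in a6a6c68955, here is a minimal sketch of what a register table with fields and named values could look like, and how a parser might use it. The struct layout, the `EXAMPLE_CNTL` register, its fields, and all function names are hypothetical and are not taken from sid.h or from the generated tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One bit field of a register: name, mask, and optional named values. */
struct reg_field {
   const char *name;
   uint32_t mask;              /* contiguous, non-zero bit mask */
   const char *const *values;  /* named values, or NULL if none */
   unsigned num_values;
};

/* One register: offset, name, and its fields. */
struct reg_entry {
   uint32_t offset;
   const char *name;
   const struct reg_field *fields;
   unsigned num_fields;
};

/* Hypothetical table contents; a generator script would emit these. */
static const char *const example_mode_names[] = {
   "MODE_OFF", "MODE_ON", "MODE_AUTO"
};
static const struct reg_field example_fields[] = {
   { "ENABLE", 0x00000001, NULL, 0 },
   { "MODE",   0x00000030, example_mode_names, 3 },
};
static const struct reg_entry reg_table[] = {
   { 0x1000, "EXAMPLE_CNTL", example_fields, 2 },
};

/* Extract a field by shifting mask and value down to bit 0. */
static uint32_t field_get(const struct reg_field *f, uint32_t reg_value)
{
   uint32_t mask = f->mask;
   uint32_t value = reg_value & f->mask;

   assert(mask != 0);
   while (!(mask & 1)) {
      mask >>= 1;
      value >>= 1;
   }
   return value;
}

/* Linear search by register offset; returns NULL if unknown. */
static const struct reg_entry *find_reg(uint32_t offset)
{
   for (size_t i = 0; i < sizeof(reg_table) / sizeof(reg_table[0]); i++) {
      if (reg_table[i].offset == offset)
         return &reg_table[i];
   }
   return NULL;
}

/* Print every field of a register write in human-readable form. */
static void dump_reg(FILE *out, uint32_t offset, uint32_t value)
{
   const struct reg_entry *reg = find_reg(offset);

   if (!reg) {
      fprintf(out, "0x%05x <- 0x%08x\n", (unsigned)offset, (unsigned)value);
      return;
   }
   for (unsigned i = 0; i < reg->num_fields; i++) {
      const struct reg_field *fld = &reg->fields[i];
      uint32_t v = field_get(fld, value);

      if (fld->values && v < fld->num_values)
         fprintf(out, "%s.%s = %s\n", reg->name, fld->name, fld->values[v]);
      else
         fprintf(out, "%s.%s = %u\n", reg->name, fld->name, (unsigned)v);
   }
}
```

With this hypothetical table, `dump_reg(stdout, 0x1000, 0x21)` prints `EXAMPLE_CNTL.ENABLE = 1` followed by `EXAMPLE_CNTL.MODE = MODE_AUTO`.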
Marek Olšák d15b71b4bd radeonsi: remove duplicated register definitions and instruction definitions
Instruction encoding isn't needed in Mesa.

The border color address registers were duplicated.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:19 +02:00
Marek Olšák c59ad265df r600g,radeonsi: remove unused ill-formed register field definitions
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Marek Olšák 110873ed11 radeonsi: add an initial dump_debug_state implementation dumping shaders
This is usually called after a draw call.

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Marek Olšák 93d97db349 radeonsi: allow si_dump_key to write to a file
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Marek Olšák 525921ed51 gallium/ddebug: new pipe for hang detection and driver state dumping (v2)
v2: lots of improvements

This is like identity or trace, but simpler. It doesn't wrap most states.

Run with:
  GALLIUM_DDEBUG=1000 [executable]
where "executable" is the app and "1000" is in milliseconds, meaning that
the context will be considered hung if a fence fails to signal in 1000 ms.

If that happens, all shaders, context states, bound resources, draw
parameters, and driver debug information (if any) will be dumped into:
  /home/$username/dd_dumps/$processname_$pid_$index.

Note that the context is flushed after every draw/clear/copy/blit operation
and then waited on, in order to find the exact call that hangs.

You can also do:
  GALLIUM_DDEBUG=always
to do the dumping after every draw/clear/copy/blit operation without
flushing and waiting.

Examples of driver states that can be dumped are:
- Hardware status registers saying which hw block is busy (hung).
- Disassembled shaders in a human-readable form.
- The last submitted command buffer in a human-readable form.

v2: drop pipe-loader changes, drop SConscript
    rename dd.h -> dd_pipe.h

Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
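The two GALLIUM_DDEBUG forms described in 525921ed51 (a timeout in milliseconds, or "always") could be distinguished along the following lines. This is only a sketch: the enum, the struct, and `dd_parse_option` are hypothetical names, and the actual option parsing in the ddebug pipe may differ.

```c
#include <assert.h>
#include <ctype.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical modes matching the two documented forms of the variable. */
enum dd_mode {
   DD_DISABLED,      /* variable unset or not understood */
   DD_DETECT_HANGS,  /* flush and wait after every operation, dump on timeout */
   DD_DUMP_ALWAYS    /* dump after every operation, no flush and wait */
};

struct dd_options {
   enum dd_mode mode;
   unsigned timeout_ms;  /* only meaningful for DD_DETECT_HANGS */
};

/* Parse the value of GALLIUM_DDEBUG, e.g. the result of getenv(). */
static struct dd_options dd_parse_option(const char *value)
{
   struct dd_options opts = { DD_DISABLED, 0 };

   if (!value || !*value)
      return opts;

   if (strcmp(value, "always") == 0) {
      opts.mode = DD_DUMP_ALWAYS;
      return opts;
   }

   /* Accept only a plain positive decimal number as the timeout. */
   if (isdigit((unsigned char)value[0])) {
      char *end;
      unsigned long ms = strtoul(value, &end, 10);

      if (*end == '\0' && ms > 0 && ms <= UINT_MAX) {
         opts.mode = DD_DETECT_HANGS;
         opts.timeout_ms = (unsigned)ms;
      }
   }

   /* Hang detection never runs with a zero timeout. */
   assert(opts.mode != DD_DETECT_HANGS || opts.timeout_ms > 0);
   return opts;
}
```

A caller would typically pass `getenv("GALLIUM_DDEBUG")` and fall back to the unwrapped context when the mode is `DD_DISABLED`.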
Marek Olšák 0fc21ecfc0 gallium: add flags parameter to pipe_screen::context_create
This allows creating compute-only and debug contexts.

Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Marek Olšák 7b5c92391f gallium: add an interface for dumping debug driver state
Reviewed-by: Brian Paul <brianp@vmware.com>
Acked-by: Christian König <christian.koenig@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
2015-08-26 19:25:18 +02:00
Ilia Mirkin a3b617a258 mesa: remove pointless es31 checks, fix indirect to only be in es31
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-08-26 12:37:38 -04:00
Ilia Mirkin 332fb341dd mesa: uncomment checks in es31 computation, add texture_ms
Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
2015-08-26 12:37:17 -04:00
Marek Olšák f432ae899f mesa: create multisample fallback textures like normal textures
This works if drivers upsample on upload (like all radeon ones do).
The alternative is an unexpected GL error from anything calling
_mesa_update_state and possibly other issues.

Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Dave Airlie <airlied@redhat.com>
2015-08-26 15:42:26 +02:00
Grazvydas Ignotas f8b01ae47c radeonsi: mark unreachable paths to avoid warnings
Otherwise we get:
warning: 'num_user_sgprs' may be used uninitialized in this function
...

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-08-26 15:42:26 +02:00
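The warning quoted in f8b01ae47c typically appears when a switch assigns a variable in every valid case but the compiler cannot prove that the cases are exhaustive. A minimal sketch of the pattern follows; the enum, the `UNREACHABLE` macro, and the SGPR counts are hypothetical and are not the radeonsi code.

```c
#include <assert.h>

/* Hypothetical stand-in for a shader-stage enum. */
enum stage { STAGE_VERTEX, STAGE_FRAGMENT, STAGE_COMPUTE };

/* Mark a path that cannot be taken. In debug builds the assert fires;
 * with GCC or Clang the builtin also tells the optimizer that control
 * never gets past this point, which silences the warning. */
#if defined(__GNUC__)
#define UNREACHABLE(msg) do { assert(0 && (msg)); __builtin_unreachable(); } while (0)
#else
#define UNREACHABLE(msg) assert(0 && (msg))
#endif

static unsigned num_user_sgprs_for(enum stage s)
{
   unsigned num_user_sgprs;

   switch (s) {
   case STAGE_VERTEX:
      num_user_sgprs = 12;  /* made-up value */
      break;
   case STAGE_FRAGMENT:
      num_user_sgprs = 8;   /* made-up value */
      break;
   case STAGE_COMPUTE:
      num_user_sgprs = 4;   /* made-up value */
      break;
   default:
      /* Without this, the compiler assumes the function may fall through
       * and return num_user_sgprs uninitialized. */
      UNREACHABLE("invalid shader stage");
   }
   return num_user_sgprs;
}
```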
Tapani Pälli e0c2ea0337 mesa: GetTexLevelParameter{if}v changes for OpenGL ES 3.1
Patch refactors existing parameters check to first check common enums
between desktop GL and GLES 3.1 and modifies get_tex_level_parameter_image
to be compatible with enums specified in 3.1.

v2: remove extra is_gles31() checks (suggested by Ilia)

Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> (v1)
Reviewed-by: Marta Lofstedt <marta.lofstedt@intel.com> (v1)
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
2015-08-26 08:38:25 +03:00
Marta Lofstedt ae8d0e7abe mesa/es3.1: Allow GL_COMPUTE_WORK_GROUP_SIZE for OpenGL ES 3.1
According to OpenGL ES specification section 7.12,
GL_COMPUTE_WORK_GROUP_SIZE is supported by the
glGetProgramiv function.

Signed-off-by: Marta Lofstedt <marta.lofstedt@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-08-26 08:25:07 +03:00
Marta Lofstedt c2a766880d mesa/es3.1: Enable getting MAX_COMPUTE_WORK_GROUP_ values for OpenGL ES 3.1
According to the OpenGL ES 3.1 specification chapter 17, the
MAX_COMPUTE_WORK_GROUP_COUNT and MAX_COMPUTE_WORK_GROUP_SIZE
are available for glGetIntegeri_v.

Signed-off-by: Marta Lofstedt <marta.lofstedt@linux.intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2015-08-26 08:25:07 +03:00
Dave Airlie 73e5adc4b2 mesa/formats: pass correct parameter to _mesa_is_format_compressed
commit 26c549e69d
Author: Nanley Chery <nanley.g.chery@intel.com>
Date:   Fri Jul 31 10:26:36 2015 -0700

    mesa/formats: remove compressed formats from matching function

caused a regression in my CTS testing; this looks like a clear
thinko.

Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-26 14:13:27 +10:00
Roland Scheidegger 48e6404c04 gallium/auxiliary: optimize rgb9e5 helper some more
I used this as some testing ground for investigating some compiler
bits initially (e.g. lrint calls etc.), figured I could do much better
in the end just for fun...
This is mathematically equivalent, but uses some tricks to avoid
doubles and also replaces some float math with ints. Good for another
performance doubling or so. As a side note, some quick tests show that
llvm's loop vectorizer would be able to properly vectorize this version
(which it failed to do earlier due to doubles, producing a mess), giving
another 3 times performance increase with sse2 (more with sse4.1), but this
may not apply to mesa.
No piglit change.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-08-26 02:57:38 +02:00
Roland Scheidegger 941346a803 gallium/auxiliary: optimize rgb9e5 helper a bit
This code (lifted straight from the extension) was doing things the most
inefficient way you could think of.
This drops some of the more expensive float operations, in particular
- int-cast floors (pointless, values always positive)
- 2 raised to (signed) integers (replace with simple exponent manipulation),
  getting rid of a misguided comment in the process (implement with table...)
- float division (replace with multiplication by the reciprocals of those powers)
This is about 3 times faster (measured for float3_to_rgb9e5), though it depends
on the compiler and CPU (e.g. llvm is clever enough to replace exp2 with ldexp
whereas gcc is not, and division is not too bad on cpus with early-exit divs).
Note that this keeps the double math for now (float x + 0.5), as the results may
otherwise differ.

Acked-by: Marek Olšák <marek.olsak@amd.com>
2015-08-26 02:57:37 +02:00
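Two of the tricks named in 941346a803 can be sketched in isolation: building 2^e by writing the exponent field of a float directly, and replacing a division by 2^e with a multiplication by 2^-e. This is an illustration only, assuming IEEE-754 single-precision floats; `pow2_int` and `div_by_pow2` are hypothetical names and this is not the code from the rgb9e5 helper.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Build 2^e from raw IEEE-754 binary32 bits: sign 0, biased exponent
 * e + 127, mantissa 0. Only normal floats are covered, so e must be in
 * the range [-126, 127]. */
static float pow2_int(int e)
{
   uint32_t bits;
   float f;

   assert(e >= -126 && e <= 127);
   bits = (uint32_t)(e + 127) << 23;
   memcpy(&f, &bits, sizeof f);  /* type-pun without aliasing issues */
   return f;
}

/* x / 2^e rewritten as x * 2^-e. Powers of two and their reciprocals are
 * exactly representable, so both forms round the same real value. */
static float div_by_pow2(float x, int e)
{
   return x * pow2_int(-e);
}
```

Because no call to a math library function is involved, both helpers reduce to a few integer and multiply instructions.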
Dave Airlie c1452983b4 mesa/texgetimage: fix missing stencil check
GetTexImage can read to stencil8, but only from
stencil or depth-stencil textures.

This fixes a bunch of failures in CTS
GL33-CTS.gtf32.GL3Tests.packed_pixels

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Cc: "11.0" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-26 10:22:09 +10:00
Nanley Chery 1d2a844e7d mesa/teximage: Add GL error parameter to _mesa_target_can_be_compressed
Enables _mesa_target_can_be_compressed to return the appropriate GL error
depending on its inputs. Use the parameter to return the appropriate GL error
for ETC2 formats on GLES3.

Suggested-by: Chad Versace <chad.versace@intel.com>
Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-25 15:53:46 -07:00
Nanley Chery 26c549e69d mesa/formats: remove compressed formats from matching function
All compressed formats return GL_FALSE and there isn't any evidence to
support that this behaviour would change. Remove all switch cases for
compressed formats.

v2. Since the exhaustive switch is removed, add a gtest to ensure
    all formats are handled.
v3. Ensure that GL_NO_ERROR is set before returning.
v4. Fix an arg to _mesa_uncompressed_format_to_type_and_comps();
    fix formatting and misc improvements (Chad).

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-25 15:45:17 -07:00
Nanley Chery 8e581747d2 mesa/formats: make format testing a gtest
We currently check that our format info table is sane during context
initialization in debug builds. Perform this check during
`make check` instead. This enables format testing in release builds
and removes the requirement of an exhaustive switch for
_mesa_uncompressed_format_to_type_and_comps().

v2. indentation and conditional inclusion fixes (Chad).
    allow tests to continue running if any format fails
    and display the failing format name.

Reviewed-by: Chad Versace <chad.versace@intel.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-25 15:45:13 -07:00
Kenneth Graunke 1bec29d04d gallium/ttn: Use nir_builder_insert() rather than poking at cf_list.
I intend to remove nir_builder::cf_node_list, so I can't have this code
poking at it directly.  The proper way is to set the insertion point and
then simply insert things there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-08-25 11:12:35 -07:00
Kenneth Graunke 78856194c1 prog_to_nir: Use nir_builder_insert() rather than poking at cf_list.
I intend to remove nir_builder::cf_node_list, so I can't have this code
poking at it directly.  The proper way is to set the insertion point and
then simply insert things there.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-08-25 11:12:35 -07:00
Kenneth Graunke 5f14c417c8 nir: Use nir_shader::stage rather than passing it around.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-08-25 11:12:35 -07:00
Kenneth Graunke d4d5b430a5 nir: Store gl_shader_stage in nir_shader.
This makes it easy for NIR passes to inspect what kind of shader they're
operating on.

Thanks to Michel Dänzer for helping me figure out where TGSI stores the
shader stage information.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2015-08-25 11:12:35 -07:00
Jason Ekstrand dfacae3a56 i965/fs: Combine assign_constant_locations and move_uniform_array_access_to_pull_constants
The comment above move_uniform_array_access_to_pull_constants was
completely bogus because it has nothing to do with lowering instructions.
Instead, it's assigning locations of pull constants.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand c999a58f50 nir/lower_io: Remove assign_var_locations_direct_first
This is no longer used so we might as well get rid of it.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand 259f7291de i965/fs: Rework uniform handling
Previously, we treated the entire UNIFORM file as if it had two elements:
One for direct things and one for indirect.  This is substantially
different from how the old visitor code handled it where each element was
effectively its own uniform.  This commit makes the NIR path more like the
old ir_visitor path where each uniform is separate.  This should allow us
to more easily make decisions about what to push.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand cfa056c6a5 i965/vec4_nir: Get rid of the uniform_driver_location tracking
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand ce5e9139aa nir/lower_io: Separate driver_location and base offset for uniforms
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand 0db8e87b4a nir/intrinsics: Add a second const index to load_uniform
In the i965 backend, we want to be able to "pull apart" the uniforms and
push some of them into the shader through a different path.  In order to do
this effectively, we need to know which variable is actually being referred
to by a given uniform load.  Previously, it was completely flattened by
nir_lower_io which made things difficult.  This adds more information to
the intrinsic to make this easier for us.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Kenneth Graunke 6c33d6bbf9 nir: Pass a type_size() function pointer into nir_lower_io().
Previously, there were four type_size() functions in play - the i965
compiler backend defined scalar and vec4 type_size() functions, and
nir_lower_io contained its own similar functions.

In fact, the i965 driver used nir_lower_io() and then looped over the
components using its own type_size - meaning both were in play.  The
two are /basically/ the same, but not exactly in obscure cases like
subroutines and images.

This patch removes nir_lower_io's functions, and instead makes the
driver supply a function pointer.  This gives the driver ultimate
flexibility in deciding how it wants to count things, reduces code
duplication, and improves consistency.

v2 (Jason Ekstrand):
 - One side-effect of passing in a function pointer is that nir_lower_io is
   now aware of and properly allocates space for image uniforms, allowing
   us to drop hacks in the backend

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
v2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
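The idea in 6c33d6bbf9, letting the driver supply the size function instead of hard-coding one in the lowering pass, can be sketched as follows. The `struct var` type, both size callbacks, and `assign_locations` are hypothetical simplifications; they are not the NIR data structures or the real nir_lower_io() signature.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily simplified variable description. */
struct var {
   unsigned components;  /* 1 to 4 */
   unsigned array_len;   /* 1 for non-arrays */
   unsigned location;    /* filled in by assign_locations() */
};

/* Driver-supplied callback that says how many slots a variable takes. */
typedef unsigned (*type_size_fn)(const struct var *);

/* A scalar backend counts individual components. */
static unsigned type_size_scalar(const struct var *v)
{
   return v->components * v->array_len;
}

/* A vec4 backend counts one slot per array element, whatever the width. */
static unsigned type_size_vec4(const struct var *v)
{
   return v->array_len;
}

/* The shared pass only walks the variables; the counting policy comes
 * from the caller. Returns the total number of slots used. */
static unsigned assign_locations(struct var *vars, size_t count,
                                 type_size_fn type_size)
{
   unsigned next = 0;

   assert(type_size != NULL);
   for (size_t i = 0; i < count; i++) {
      vars[i].location = next;
      next += type_size(&vars[i]);
   }
   return next;
}
```

The same pass then serves both backends: for a vec3, a vec4[2], and a float, `type_size_scalar` yields locations 0, 3, 11 and `type_size_vec4` yields 0, 1, 3.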
Kenneth Graunke a23f82053d prog_to_nir: Don't allocate nir_variable with type vec4[0] for uniforms.
If there are no parameters, we don't need to create a nir_variable to
hold them...and allocating an array of length 0 is pretty bogus.

Should avoid i965 backend assertions in future patches Jason and I are
working on.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-25 10:18:27 -07:00
Kenneth Graunke 640c472fd0 i965: Move type_size() methods out of visitor classes.
I want to use C function pointers to these, and they don't use anything
in the visitor classes anyway.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-25 10:18:27 -07:00
Jason Ekstrand c56899f41a i965: Make setup_vec4_uniform_value and _image_uniform_values take an offset
This way they don't implicitly increment the uniforms variable and don't
have to be called in-sequence during uniform setup.

Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Jason Ekstrand 8d8b8f5854 i965: Rename setup_vector_uniform_values to setup_vec4_uniform_value
The new name more accurately represents what it does: Set up a single vec4
uniform value.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-25 10:18:27 -07:00
Rob Clark 0ab29751b6 freedreno/ir3: fix compile break after splitting out nir_control_flow.h
The commit:

  commit b49371b8ed
  Author:     Connor Abbott <cwabbott0@gmail.com>
  AuthorDate: Tue Jul 21 19:54:18 2015 -0700

      nir: move control flow modification to its own file

split out some control flow related APIs into a separate header, but did
not update drivers.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-08-25 08:17:30 -04:00
Rob Clark 8b2d0bb844 freedreno/ir3: fix compile break after fxn->start_block removal
The commit:

  commit 8e0d4ef341
  Author:     Kenneth Graunke <kenneth@whitecape.org>
  AuthorDate: Thu Aug 6 18:18:40 2015 -0700

      nir: Delete the nir_function_impl::start_block field.

removed the start_block field without fixing up drivers.

Signed-off-by: Rob Clark <robclark@freedesktop.org>
2015-08-25 08:13:04 -04:00
Dave Airlie 529acab22a mesa: enable texture stencil8 for multisample
This fixes GL45-CTS.gtf44.GL31Tests.texture_stencil8.texture_stencil8_gl44
from the ogl conform suite.

Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Cc: 10.6 11.0 <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-08-25 11:06:58 +10:00
Brian Paul e089ca26e1 mesa: make _mesa_bind_texture_unit() static
It's only called from the file it's defined in.

Reviewed-by: Timothy Arceri <t_arceri@yahoo.com.au>
2015-08-24 18:23:19 -06:00
Nanley Chery 8f378d1083 mesa/formats: store whether or not a format is sRGB in gl_format_info
v2: remove extra newline.
v3: use bool instead of GLboolean.

Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
2015-08-24 16:08:01 -07:00
Kenneth Graunke 4f2cdd8497 nir: Use !block_ends_in_jump() in a few places rather than open-coding.
Connor introduced this helper recently; we should use it here too.

I had to move the function earlier in the file for it to be available.

Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
2015-08-24 15:10:55 -07:00
Connor Abbott d7971b41ce nir/cf: reimplement nir_cf_node_remove() using the new API
This gives us some testing of it. Also, the old nir_cf_node_remove()
wasn't handling phi nodes correctly and was calling cleanup_cf_node()
too late.

Signed-off-by: Connor Abbott <connor.w.abbott@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2015-08-24 13:31:42 -07:00