Commit Graph

99228 Commits

Author SHA1 Message Date
Dave Airlie 05f5282d63 r600/sb: add tess/compute initial state registers.
This stops them being optimised out.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:35:12 +00:00
Dave Airlie 68b976bd91 r600/sb: fix a bug emitting ar load from a constant.
Some tess shaders were doing MOVA_INT _, c0.x on cayman, and then
hitting an assert in sb_bc_finalize.cpp:translate_kcache.

This makes sure the toplevel kcache tracker gets updated,
and the clause gets fixed up.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:34:46 +00:00
Dave Airlie 7efcafce7c r600/shader: only emit add instruction if param has a value.
Just saves a pointless a = a + 0;

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:34:43 +00:00
Dave Airlie 2bd01adf14 r600: emit 0 gds_op for tf write.
This field is ignored for tf writes so should be 0.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 03:34:36 +00:00
Dave Airlie 9041730d1c r600: add support for ARB_shader_clock.
Reviewed-by: Gert Wollny <gw.fossedev@gmail.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 13:25:59 +10:00
Dave Airlie 6785034a70 radv/ws: get rid of useless return value
This also used boolean, so nice to kill that.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2018-01-18 01:57:53 +00:00
Bas Nieuwenhuizen 2ce11ac11f radv: Initialize DCC on transition from preinitialized.
Looks like the decompress does not handle invalid encodings well,
which happens with random memory. Of course apps should not use it
with random memory, but they are allowed to ....

Fixes: 44fcf58744 "radv: Disable DCC for GENERAL layout and compute transfer dest."
Reviewed-by: Dave Airlie <airlied@redhat.com>
2018-01-18 01:57:52 +01:00
Timothy Arceri e2b9296146 ac: fix buffer overflow bug in 64bit SSBO loads
Fixes: 441ee1e65b "radv/ac: Implement Float64 SSBO loads"

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-01-18 10:26:58 +11:00
Timothy Arceri 409e15f26f ac: fix nir_intrinsic_get_buffer_size for radeonsi
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-18 10:25:20 +11:00
Kenneth Graunke d139b5e4cc i965: Pass brw_growing_bo to grow_buffer().
Cleaner.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-17 13:13:26 -08:00
Kenneth Graunke 81ca8e69e3 i965: Make a helper for recreating growing buffers.
Now that we have two of these, we're duplicating a bunch of this logic.
The next commit will add more logic, which would make the duplication
seem worse.

This ends up setting EXEC_OBJECT_CAPTURE on the batch, which isn't
necessary (it's already captured), but it should be harmless.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-17 13:13:26 -08:00
Kenneth Graunke 02c1c25b1a i965: Replace cpu_map pointers with a "use_shadow_copy" boolean.
Having a boolean for "we're using malloc'd shadow copies for all
buffers" is cleaner than having a cpu_map pointer for each.  It was
okay when we had one buffer, but this is more obvious.

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
2018-01-17 13:13:26 -08:00
Francisco Jerez 11674dad8a intel/fs: Optimize and simplify the copy propagation dataflow logic.
Previously the dataflow propagation algorithm would calculate the ACP
live-in and -out sets in a two-pass fixed-point algorithm.  The first
pass would update the live-out sets of all basic blocks of the program
based on their live-in sets, while the second pass would update the
live-in sets based on the live-out sets.  This is incredibly
inefficient in the typical case where the CFG of the program is
approximately acyclic, because it can take up to 2*n passes for an ACP
entry introduced at the top of the program to reach the bottom (where
n is the number of basic blocks in the program), until which point the
algorithm won't be able to reach a fixed point.

The same effect can be achieved in a single pass by computing the
live-in and -out sets in lock-step, because that makes sure that
processing of any basic block will pick up the updated live-out sets
of the lexically preceding blocks.  This gives the dataflow
propagation algorithm effectively O(n) run-time instead of O(n^2) in
the acyclic case.

The time spent in dataflow propagation is reduced by 30x in the
GLES31.functional.ssbo.layout.random.all_shared_buffer.5 dEQP
test-case on my CHV system (the improvement is likely to be of the
same order of magnitude on other platforms).  This more than reverses
an apparent run-time regression in this test-case from my previous
copy-propagation undefined-value handling patch, which was ultimately
caused by the additional work introduced in that commit to account for
undefined values being multiplied by a huge quadratic factor.

According to Chad this test was failing on CHV due to a 30s time-out
imposed by the Android CTS (this was the case regardless of my
undefined-value handling patch, even though my patch substantially
exacerbated the issue).  On my CHV system this patch reduces the
overall run-time of the test by approximately 12x, getting us to
around 13s, well below the time-out.

v2: Initialize live-out set to the universal set to avoid rather
    pessimistic dataflow estimation in shaders with cycles (Addresses
    performance regression reported by Eero in GpuTest Piano).
    Performance numbers given above still apply.  No shader-db changes
    with respect to master.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104271
Reported-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:56:08 -08:00
Marek Olšák 63b231309e gallium: remove PIPE_CAP_USER_CONSTANT_BUFFERS
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:18:00 +01:00
Marek Olšák 85bbcdda34 st/mesa: assume that user constant buffers are always supported
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:17:59 +01:00
Marek Olšák 5981a5226e nine: assume that user constant buffers are always supported
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:17:59 +01:00
Marek Olšák e871abe452 gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:17:59 +01:00
Marek Olšák e411d2572b st/mesa: expose ARB_sync unconditionally
All drivers support it.

Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:17:59 +01:00
Marek Olšák 3778a0a533 gallium: remove PIPE_CAP_TWO_SIDED_STENCIL
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-01-17 20:17:59 +01:00
Brian Paul f631a6ae8c glsl: remove unneeded extern "C" {} bracketing around Mesa includes
The two headers already have the right extern "C" annotations.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul 7421e34dd6 mesa: move gl_external_samplers() to program.[ch]
The function is only called from a couple places.  It doesn't make
sense to have it in mtypes.h

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul 6dc8896726 st/mesa: include util/bitscan.h in st_glsl_to_tgsi_temprename.cpp
And use "" instead of <> for including Mesa headers, as we do elsewhere.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul 741d423478 glsl: include util/bitscan.h in serialize.cpp
Instead of relying on indirect inclusion of the header.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul 2f14146200 util: include string.h in u_dynarray.h
To get memset() prototype.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul 02c0734adc mesa: remove unneeded #includes of main/compiler.h
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul 7845397183 st/mesa: remove unneeded #includes of main/compiler.h
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul bafa33befd st/mesa: include main/compiler.h in st_cb_queryobj.c
To get CPU_TO_LE32() macro.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul 7de0262f8f mesa: include util/macros.h in format_fallback.c
To get definition of unreachable() macro.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul 484ac243f6 mesa: include compiler.h in disk_cache.c
Instead of indirect inclusion to get CPU_TO_LE32() macro.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-01-17 11:17:56 -07:00
Brian Paul ad00a78993 mesa/program: change validate_inputs() local var 'inputs' to GLbitfield64
Both state->prog->info.inputs_read and state->InputsBound are GLbitfield64
so it seems that the OR of those values should be of the same type.
I'm not sure this fixes any actual issues though.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
2018-01-17 11:17:56 -07:00
Brian Paul d8306de4ac vbo: reindent vbo_attrib.h to use 3 spaces
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul ef01d911ee vbo: whitespace, formatting fixes in vbo_exec_api.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul 18f4241b89 vbo: add assertions, comments in vbo_exec_api.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul 5e8962b58f vbo: whitespace, formatting fixes in vbo_exec_draw.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul f5140c1a6b vbo: use inputs_read var to simplify code
v2: add some const qualifiers, per Ian.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul e6e78d98a2 vbo: whitespace, formatting fixes in vbo_split_copy.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul 380165d399 vbo: use a new local 'array' variable in bind_vertex_list() loop
Make the code a bit more concise.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul 37863c38f3 vbo: remove unneeded #includes in vbo_context.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul 76a08eeeec vbo: whitespace, formatting fixes in vbo_context.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul 95dec9097f vbo: change vbo_context attribute map arrays to GLubyte
The values will never be larger than VBO_ATTRIB_MAX (currently 44).

v2: add STATIC_ASSERT to be sure VBO_ATTRIB_MAX can fit in ubyte,
per Emil.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul 5d78440d58 vbo: lift common code out of switch cases
Both switch cases began with the same code.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
2018-01-17 11:17:56 -07:00
Brian Paul 8e4efdc895 vbo: optimize some display list drawing (v2)
The vbo_save_vertex_list structure records one or more glBegin/End
primitives which all have the same vertex format.

To draw these primitives, we setup the vertex array state, then
issue the drawing command.  Before, the 'start' vertex was typically
zero and we used the vertex array pointer to indicate where the
vertex data starts.

This patch checks if the vertex buffer offset is an exact multiple of
the vertex size.  If so, that means we can use zero-based vertex array
pointers and use the draw's start value to indicate where the vertex
data starts.

This means a series of display list drawing commands may have
identical vertex array state.  This will get filtered out by the
Gallium CSO module so we can issue a tight series of drawing commands
without state changes to the device.

Note that this also works for a series of glCallList commands (not
just one list that contains multiple glBegin/End pairs).

No Piglit or conform changes.

v2: minor fixes suggested by Ian.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul 4edc8fdcdc vbo: rewrite some code in playback_copy_to_current()
I think this is a little easier to understand.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul ad3b162b81 vbo: add some comments in vbo_save_api.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul ee86c34664 vbo: rename some functions in vbo_save_api.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul efb5ce1c5e vbo: rename some functions in vbo_save_draw.c
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul 2064c4e814 vbo: add comment that vbo_save_vertex_list::buffer_offset is in bytes
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul 8b06acf642 vbo: minor code simplification in _save_compile_vertex_list()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul 4acc8a0de3 vbo: rename prim to prims
Using a plural name makes it easier to see that this is an array and
not a pointer to a single object.

Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00
Brian Paul 841d1839e2 vbo: removed unused ctx parameter for alloc_prim_store()
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2018-01-17 11:17:56 -07:00