Commit Graph

92841 Commits

Author SHA1 Message Date
Dave Airlie c2464271a0 radv: introduce perf test env var and allow to enable chaining
We have some features that seem to slow things down or cause other
possible undesireable side effects, but it would be nice to test
games etc with them easily.

I forsee multisample DCC and maybe some shader opt changes using this.

For now use it for batch chaining.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2017-06-09 02:15:25 +01:00
Timothy Arceri d0a26edc25 mesa: add KHR_no_error support to glDrawRangeElements*()
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-09 09:58:07 +10:00
Timothy Arceri 87cb44d9b0 mesa: rework _ae_invalidate_state() so that it just sets a dirty flag
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri b57bc7473b mesa: remove redundant _ae_invalidate_state() call
The FLUSH_VERTICES(ctx, _NEW_ARRAY) above this will already cause
this to be called.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri bc70bad59b mesa: inline vbo_exec_invalidate_state() and call from mesa core
Rather than calling it indirectly in each driver.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri 99987fe92e mesa: rework vbo_exec_init()
Here we make some assumptions about the AEcontext and set the
recalculate bools directly.

Some formating fixes are also made while we are here.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri f77740f14b mesa: stop passing state bitfield to UpdateState()
The code comment which seems to have been added in cab974cf6c
(from year 2000) says:

   "Set ctx->NewState to zero to avoid recursion if
   Driver.UpdateState() has to call FLUSH_VERTICES().  (fixed?)"

As far as I can tell nothing in any of the UpdateState() calls
should cause it to be called recursively.

V2: add a wrapper around the osmesa update function so it can still
    be used internally.

Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri f627ac6e35 st/mesa: add st_invalidate_buffers() helper
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-09 09:13:46 +10:00
Timothy Arceri df27aba422 r200/radeon: stop calling _ae_invalidate_state() directly
It is already called via _vbo_InvalidateState().

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Ian Romanick <ian.d.romanick@intel.com>
2017-06-09 09:13:46 +10:00
Tim Rowley 0b80b02502 swr: relax c++ requirement from c++14 to c++11
Remove c++14 generic lambda to keep compiler requirement at c++11.

No regressions on piglit or vtk test suites.

Tested-by: Chuck Atkins <chuck.atkins@kitware.com>
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>

CC: mesa-stable@lists.freedesktop.org
2017-06-08 18:07:52 -05:00
Juan A. Suarez Romero a625d58ee1 radeonsi: call LLVMAddEarlyCSEMemSSAPass only for LLVM >= 4.0
LLVMAddEarlyCSEMemSSAPass() is defined in LLVM 4.0.

Fixes: 257b538 ("radeonsi: do EarlyCSEMemSSA LLVM pass)

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2017-06-08 23:32:32 +02:00
Marek Olšák 6940361796 gallium/radeon: don't allocate HTILE in a separate buffer
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Marek Olšák c6451b1209 radeonsi: rename depth decompress functions
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Marek Olšák d8a577d96e radeonsi: rename shader resource decompress masks to their true meaning
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Marek Olšák da26de5ff7 radeonsi: rename is_compressed_colortex -> color_needs_decompression
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Marek Olšák 391673af7a radeonsi: disable the patch ID workaround on SI when the patch ID isn't used (v2)
The workaround causes a massive performance decrease on 1-SE parts.
(Cape Verde, Hainan, Oland)

The performance regression is already part of 17.0 and 17.1.

v2: check tess_uses_prim_id

Cc: 17.0 17.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Marek Olšák 4b8d0c2b1d radeonsi: don't update dependent states if it has no effect (v2)
This and the previous clip_regs commit decrease IB sizes and the number of
si_update_shaders invocations as follows:

                 IB size   si_update_shaders calls
Borderlands 2      -10%            -27%
Deus Ex: MD         -5%            -11%
Talos Principle     -8%            -30%

v2: always dirty cb_render_state in set_framebuffer_state

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 23:29:07 +02:00
Varad Gautam f804e0672e i965: Add format/modifier advertising
v2: Rebase and reuse tiling/modifier map. (Daniel Stone)
v3: bump DRIimageExtension to version 15, fill external_only array.
v4: Y-tiling works since gen 6

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Varad Gautam c303772e5b i965: Support dmabuf import with modifiers
Add support for createImageFromDmaBufs2, adding a modifier to the
original, and allow importing CCS resources with auxiliary data from
dmabufs.

v2: avoid DRIimageExtension version bump, pass single modifier to
    createImageFromDmaBufs2.

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Daniel Stone f58e6358bf i965: Improve same-buffer restriction for imports
Intel hardware requires that all planes of an image come from the same
buffer, which is currently implemented by testing that all FDs are
numerically the same.

However, when going through a winsys (e.g.) or anything which transits
FDs individually, the FDs may be different even if the underlying buffer
is the same.

Instead of checking the FDs for equality, we must check if they actually
point to the same buffer (Jason).

Reviewed-by: Varad Gautam <varad.gautam@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Ben Widawsky 37cdcaf386 i965: Allocate tile aligned height
This patch shouldn't actually do anything because the libdrm function
should already do this alignment. However, it preps us for a future
patch where we add in the CCS AUX size, and in the process it serves as
a good place to find bisectable issues if libdrm or kernel does
something incorrectly.

v2: Do proper alignment for X tiling, and make sure non-tiled case is
handled (Jason)
v3: Rebase (Daniel)

Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Daniel Stone 78703881ff i965: Move fallback size assignment out of bufmgr
The bufmgr took a mandatory size argument, which would only be used if
the kernel size query failed, i.e. an older kernel. It didn't actually
check that the BO size was sufficient for use.

Pull the check out of the bufmgr, and actually check that the BO is
sufficiently-sized for our import one level up. This also resolves a
chicken/egg we have when importing bufers without explicit modifiers,
namely that we need the tiling mode to calculate the size, but we need
the BO imported to query the tiling mode.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Daniel Stone 6b18d4aaec i965: Invert image modifier/tiling inference
When allocating images, we record a tiling mode and then work backwards
to infer the modifier. Unfortunately this is the wrong way around, since
it is a one:many mapping (e.g. TILING_Y can be plain Y-tiling, or
Y-tiling with CCS).

Invert the mapping, so we record a modifier first and then map this to a
tiling mode.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Daniel Stone 11e549ae3f egl/dri2: Avoid sign extension when building modifier
Since the EGL attributes are signed integers, a straight OR would
also perform sign extension,

Fixes: 6f10e7c37a ("egl/dri2: Create EGLImages with dmabuf modifiers")
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2017-06-08 22:27:30 +01:00
Vinson Lee 142536a0e3 i915g: Add blitter_context argument.
Fix build error.

  CC       i915_surface.lo
i915_surface.c:108:63: error: too few arguments to function call, expected 4, have 3
   util_blitter_default_src_texture(&src_templ, src, src_level);
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~                           ^
../../../../src/gallium/auxiliary/util/u_blitter.h:271:1: note: 'util_blitter_default_src_texture' declared here
void util_blitter_default_src_texture(struct blitter_context *blitter,
^

Fixes: a893c91697 ("gallium/u_blitter: use 2D_ARRAY for cubemap blits if possible")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101340
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-06-08 13:47:39 -07:00
Lucas Stach 978e6876f1 etnaviv: flush resource when binding as sampler view
As TS is also allowed on sampler resources, we need to make sure to resolve
to self when binding the resource as a texture, to avoid stale content
being sampled.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
2017-06-08 18:29:36 +02:00
Lucas Stach f25390afa4 etnaviv: don't flush resource to self without TS
A resolve to self is only necessary if the resource is fast cleared, so
there is never a need to do so if there is no TS allocated.

Signed-off-by: Lucas Stach <dev@lynxeye.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Lucas Stach 0f888ad4be etnaviv: upgrade DISCARD_RANGE to DISCARD_WHOLE_RESOURCE if possible
Stolen from VC4. As we don't do any fancy reallocation tricks yet, it's
possible to upgrade also coherent mappings and shared resources.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Lucas Stach d4e6de9e38 etnaviv: simplify transfer tiling handling
There is no need to special case compressed resources, as they are already
marked as linear on allocation. With that out of the way, there is room to
cut down on the number of if clauses used.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Lucas Stach 6e628ee3f3 etnaviv: don't read back resource if transfer discards contents
Reduces bandwidth usage of transfers which discard the buffer contents,
as well as skipping unnecessary command stream flushes and CPU/GPU
synchronization.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Lucas Stach c3b2c7a75f etnaviv: honor PIPE_TRANSFER_UNSYNCHRONIZED flag
This gets rid of quite a bit of CPU/GPU sync on frequent vertex buffer
uploads and I haven't seen any of the issues mentioned in the comment,
so this one seems stale.

Ignore the flag if there exists a temporary resource, as those ones are
never busy.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Lucas Stach a276c32a08 etnaviv: slim down resource waiting
cpu_prep() already does all the required waiting, so the only thing that
needs to be done is flushing the commandstream, if a GPU write is pending.

Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Wladimir J. van der Laan <laanwj@gmail.com>
2017-06-08 18:29:36 +02:00
Rob Herring ada3c3aa3d glsl: Fix gl_shader_stage enum unsigned comparison
Replace -1 with MESA_SHADER_NONE enum value to fix sign related warning:

external/mesa3d/src/compiler/glsl/link_varyings.cpp:1415:25: warning: comparison of constant -1 with expression of type 'gl_shader_stage' is always true [-Wtautological-constant-out-of-range-compare]
        (consumer_stage != -1 && consumer_stage != MESA_SHADER_FRAGMENT))) {
         ~~~~~~~~~~~~~~ ^  ~~

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-06-08 07:26:04 -05:00
Rob Herring 6150ea794b Android: vulkan: fix build error due to extra )
Commit 621b3410f5 ("util/vulkan: Move Vulkan utilities to
src/vulkan/util") broke the Android build with the following error:

build/core/binary.mk:1427: error: external/mesa3d/src/vulkan/Android.mk: libmesa_vulkan_util: Unused source files: util/vk_util.h).

Fixes: 621b3410f5 ("util/vulkan: Move Vulkan utilities to src/vulkan/util")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Cc: Alex Smith <asmith@feralinteractive.com>
Signed-off-by: Rob Herring <robh@kernel.org>
2017-06-08 07:26:04 -05:00
Iago Toral Quiroga ce53e8e61b Fix glcpp test expectations
With commit f7741985be we have changed some preprocessor
error messages and warnings. Adapt related glcpp tests
expectations accordingly.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101336
Tested-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2017-06-08 09:46:36 +02:00
Vlad Golovkin f4df2a196e util: make set's deleted_key_value declaration consistent with hash table one
This also silences following clang warnings:
no previous extern declaration for non-static variable 'deleted_key' [-Werror,-Wmissing-variable-declarations]
const void *deleted_key = &deleted_key_value;
            ^
no previous extern declaration for non-static variable 'deleted_key_value'
      [-Werror,-Wmissing-variable-declarations]
uint32_t deleted_key_value;
         ^

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-06-08 09:26:44 +02:00
Jason Ekstrand f1ba51b940 i965: Delete intel_resolve_map
Now that we've moved over to the new array mechanism, it's no longer
needed.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand 641405f797 i965: Use the new tracking mechanism for HiZ
This is similar to the previous commit only for HiZ.  For HiZ, apart
from everything looking different, there is really only one functional
change:  We now track the ISL_AUX_STATE_COMPRESSED_NO_CLEAR state.
Previously, if you rendered to a resolved slice of the miptree and then
did a fast-clear with a different clear color, that slice would get
resolved even though it hadn't been fast-cleared.  Now that we can track
COMPRESSED_NO_CLEAR, we know that it doesn't have any blocks in the
"clear" state so we can skip the resolve.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand e6c69264ed i965/miptree: Make level_has_hiz take a const miptree
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand b2c6290b01 i965: Wholesale replace the color resolve tracking code
This commit reworks the resolve tracking for CCS and MCS to use the new
isl_aux_state enum.  This should provide much more accurate and easy to
reason about tracking.  In order to understand, for instance, the
intel_miptree_prepare_ccs_access function, one only has to go look at
the giant comment for the isl_aux_state enum and follow the arrows.
Unfortunately, there's no good way to split this up without making a
real mess so there are a bunch of changes in here:

 1) We now do partial resolves.  I really have no idea how this ever
    worked before.  So far as I can tell, the only time the old code
    ever did a partial resolve was when it was using CCS_D where a
    partial resolve and a full resolve are the same thing.

 2) We are now tracking 4 states instead of 3 for CCS_E.  In particular,
    we distinguish between compressed with clear and compressed without
    clear.  The end result is that you will never get two partial
    resolves in a row.

 3) The texture view rules are now more correct.  Previously, we would
    only bail if compression was not supported by the destination
    format.  However, this is not actually correct.  Not all format
    pairs are supported for texture views with CCS even if both support
    CCS individually.  Fortunately, ISL has a helper for this.

 4) We are no longer using intel_resolve_map for tracking aux state but
    are instead using a simple array of enum isl_aux_state indexed by
    level and layer.  This is because, now that we're tracking 4
    different states, it's no longer clear which should be the "default"
    and array lookups are faster than linked list searches.

 5) The new code is very assert-happy.  Incorrect transitions will now
    get caught by assertions rather than by rendering corruption.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand 46fd924899 i965: Delete most of the old resolve interface
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand f296c22989 i965: Use the new get/set_aux_state functions for color clears
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand 38563e95d5 i965: Move blorp to the new resolve functions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Acked-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand 554f7d6d02 i965: Move depth to the new resolve functions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand 170e4b366a i965: Move images to the new resolve functions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand 8cb3b4a586 i965: Move framebuffer fetch to the new resolve functions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand 79df134d56 i965: Remove an unneeded render_cache_set_check_flush
This is only needed to fix rendering corruptions caused by not flushing
after doing a resolve operation.  The resolve now does all the needed
flushing so this is unnecessary.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand 49e4d8cce2 i965: Move color rendering to the new resolve functions
This also removes an unneeded brw_render_cache_set_check_flush() call.
We were calling it in the case where the surface got resolved to satisfy
the flushing requirements around resolves.  However, blorp now does this
itself, so the extra is just redundant.

Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand c0f5225264 i965: Move texturing to the new resolve functions
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00
Jason Ekstrand 421d713eec i965: Use the new resolve function for several simple cases
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
2017-06-07 22:18:53 -07:00