Commit Graph

111426 Commits

Author SHA1 Message Date
Kenneth Graunke 7acc88a47c iris: Move some field setting after we drop the lock.
It's not much, but we may as well hold the lock for a bit less time.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-05-29 19:42:04 -07:00
Kenneth Graunke 76c5a19668 iris: Move cached BO allocation into a helper function.
There's enough going on here to warrant a helper.  This also simplifies
the control flow and eliminates the last non-error-case goto.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-05-29 19:41:52 -07:00
Kenneth Graunke cea6671395 iris: Fall back to fresh allocations if mapping for zero-memset fails.
It is unlikely that we would fail to map a cached BO in order to zero
its contents.  When we did, we would free the first BO in the cache and
try again with the second.  It's possible that this next BO already had
a map setup, in which case we'd succeed.  But if it didn't, we'd likely
fail again in the same manner.

There's not much point in optimizing this case (and frankly, if we're
out of CPU-side VMA we should probably dump the cache entirely)...so
instead, just fall back to allocating a fresh BO from the kernel which
will already be zeroed so we don't have to try and map it.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-05-29 19:41:50 -07:00
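The fallback described in cea6671395 can be sketched roughly as follows. This is a toy model, not iris code: `toy_bo`, `toy_map`, `toy_alloc_fresh`, and `toy_alloc_zeroed` are hypothetical names, `calloc` stands in for a kernel allocation that arrives zeroed, and the `map_ok` flag simulates whether mapping succeeds.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer object; 'from_cache' records which path produced it. */
struct toy_bo {
   void *data;
   size_t size;
   bool from_cache;
};

/* Stand-in for mapping a cached BO; 'map_ok' simulates success or failure. */
static void *toy_map(struct toy_bo *bo, bool map_ok)
{
   return map_ok ? bo->data : NULL;
}

/* Stand-in for a fresh kernel allocation, which is already zeroed. */
static bool toy_alloc_fresh(struct toy_bo *out, size_t size)
{
   out->data = calloc(1, size);
   out->size = size;
   out->from_cache = false;
   return out->data != NULL;
}

/* Try to reuse 'cached' and zero it. If mapping fails, do not try further
 * cache entries; allocate fresh, already-zeroed memory instead. */
static bool toy_alloc_zeroed(struct toy_bo *cached, bool map_ok, size_t size,
                             struct toy_bo *out)
{
   if (cached) {
      void *map = toy_map(cached, map_ok);
      if (map) {
         memset(map, 0, cached->size);
         *out = *cached;
         out->from_cache = true;
         return true;
      }
      /* Mapping failed: fall through to the fresh path. */
   }
   return toy_alloc_fresh(out, size);
}
```

The point of the shape is that a failed map leads straight to the fresh path, so the zeroing step never has to be retried against another cached buffer.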
Kenneth Graunke 042f8514e6 iris: Move fresh BO allocation into a helper function.
There's enough going on here to warrant a helper.  More cleaning coming.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-05-29 19:41:22 -07:00
Kenneth Graunke 06421e5be7 iris: Do SET_TILING at a single point rather than in two places.
Both the from-cache and fresh-from-GEM cases were calling SET_TILING.
In the cached case, we would retry the allocation on failure, pitching
one BO from the cache each time.  This is silly, because the only time
it should fail is if the tiling or stride parameters are unacceptable,
which has nothing to do with the particular BO in question.  So there's
no point in retrying - we should simply fail the allocation.

This patch moves both calls to bo_set_tiling_internal() below the
cache/fresh split, so we have it at a single point in time instead
of two.

To preserve the ordering between SET_TILING and SET_DOMAIN, we move
that below as well.  (I am unsure if the order matters.)

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-05-29 19:41:08 -07:00
Kenneth Graunke 43d835cb0f iris: Use the BO cache even for coherent buffers on non-LLC.
We mark snooped BOs as non-reusable, so we never return them to the
cache.  This means that we'd need to call I915_GEM_SET_CACHING to make
any BO we find in the cache snooped.  But then again, any BO we freshly
allocate from the kernel will also be non-snooped, so it has the same
issue.  There's really no reason to skip the cache - we may as well use
it to avoid the I915_GEM_CREATE overhead.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-05-29 19:40:18 -07:00
Kenneth Graunke 78003014d0 iris: Fix locking around vma_alloc in iris_bo_create_userptr
util_vma needs to be protected by a lock.  All other callers of
vma_alloc and vma_free appear to be holding a lock already.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-05-29 19:40:16 -07:00
Kenneth Graunke 5fc11fd988 iris: Fix lock/unlock mismatch for non-LLC coherent BO allocation.
The goto jumped over the mtx_lock, but proceeded to hit the mtx_unlock.
We can simply set the bucket to NULL and it will skip the cache without
goto, and without messing up locking.

Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-05-29 19:40:15 -07:00
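The lock/unlock mismatch fixed in 5fc11fd988 can be illustrated with a minimal sketch. Everything here is hypothetical: `toy_lock`, `alloc_with_goto`, and `alloc_with_null_bucket` are stand-ins rather than iris code, and the counter-based lock only tracks balance instead of providing mutual exclusion.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a mutex: 'held' counts acquires minus releases. */
struct toy_lock { int held; };

static void toy_lock_acquire(struct toy_lock *l) { l->held++; }
static void toy_lock_release(struct toy_lock *l) { l->held--; }

/* Buggy shape: the goto skips the acquire but still reaches the release. */
static int alloc_with_goto(struct toy_lock *l, bool skip_cache)
{
   if (skip_cache)
      goto skip;

   toy_lock_acquire(l);
   /* ... search the cache bucket here ... */
skip:
   toy_lock_release(l);
   return l->held; /* 0 when balanced, negative after the mismatch */
}

/* Fixed shape: always take the lock, and let a NULL bucket mean
 * "do not look in the cache". */
static int alloc_with_null_bucket(struct toy_lock *l, bool skip_cache)
{
   int cache_bucket = 0; /* hypothetical bucket storage */
   int *bucket = skip_cache ? NULL : &cache_bucket;

   toy_lock_acquire(l);
   if (bucket) {
      /* ... search the cache bucket here ... */
   }
   toy_lock_release(l);
   return l->held; /* balanced on every path */
}
```

With a real mutex, the unbalanced release in the first function would be undefined behavior rather than a negative counter; the counter is only there to make the imbalance visible.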
Marek Olšák 2285b93032 radeonsi: fix timestamp queries for compute-only contexts
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Tested-by: Jan Vesely <jan.vesely@rutgers.edu>
2019-05-29 21:13:35 -04:00
Marek Olšák b5697c311b Change a few frequented uses of DEBUG to !NDEBUG
debugoptimized builds don't define NDEBUG, but they also don't define
DEBUG. We want to enable cheap debug code for these builds.
I only chose those occurrences that I care about.

Reviewed-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2019-05-29 21:13:35 -04:00
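The difference between the two guards in b5697c311b can be shown with a small sketch. The function names are hypothetical; only the `DEBUG` and `NDEBUG` macros come from the commit message.

```c
#include <stdbool.h>

/* True when cheap debug checks are compiled in. Keyed on !NDEBUG, so it
 * is active in any build that keeps assert() enabled. */
static bool cheap_checks_enabled(void)
{
#ifndef NDEBUG
   return true;
#else
   return false;
#endif
}

/* True only in builds that explicitly define DEBUG. */
static bool full_debug_enabled(void)
{
#ifdef DEBUG
   return true;
#else
   return false;
#endif
}
```

In a build that defines neither macro, which is how the message describes debugoptimized builds, the first function returns true and the second returns false.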
Kenneth Graunke 0f1b68ebee iris: Re-emit Surface State Base Address when context is lost.
When we hit a GPU hang, we failed to reset Surface State Base Address
right away, and would keep hanging until we filled up the binder.  Then
we'd finally get it right after a lot of repeated stumbles.  Update it
right away so we hopefully hang fewer times before succeeding.
2019-05-29 16:35:02 -07:00
Jason Ekstrand e459d6d6df iris: Enable nir_opt_large_constants
Shader-db results on Kaby Lake:

    total instructions in shared programs: 15306230 -> 15304726 (<.01%)
    instructions in affected programs: 4570 -> 3066 (-32.91%)
    helped: 16
    HURT: 0

    total cycles in shared programs: 361703436 -> 361680041 (<.01%)
    cycles in affected programs: 129388 -> 105993 (-18.08%)
    helped: 16
    HURT: 0

    LOST:   0
    GAINED: 2

The helped programs were in XCom 2, Deus Ex: Mankind Divided, and Kerbal
Space Program

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-29 21:09:16 +00:00
Jason Ekstrand 9dc57eebd5 iris: Don't assume UBO indices are constant
It will be true for the constant/system value buffer because they use a
constant zero but it's not true in general.  If we ever got here when
the source wasn't constant, nir_src_as_uint would assert.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: mesa-stable@lists.freedesktop.org
2019-05-29 21:09:16 +00:00
Jason Ekstrand 744f93f5c1 iris: Move upload_ubo_ssbo_surf_state to iris_program.c
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-29 21:09:16 +00:00
Brian Paul e584fd894e nir: silence three compiler warnings seen with MinGW
Silence two unused var warnings.  And init elem_size, elem_align to
zero to silence "maybe uninitialized" warnings.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-05-29 13:59:24 -06:00
Brian Paul c71ca65405 svga: clamp max_const_buffers to SVGA_MAX_CONST_BUFS
In case the device reports 15 (or more) buffers.

Reviewed-by: Charmaine Lee <charmainel@vmware.com>
2019-05-29 13:59:23 -06:00
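The clamp in c71ca65405 amounts to taking the minimum of the reported count and the driver limit. A minimal sketch, assuming an illustrative limit of 14 that is not taken from the svga headers, and with hypothetical names throughout:

```c
/* Illustrative limit only; not taken from the svga driver headers. */
#define EXAMPLE_MAX_CONST_BUFS 14u

/* Clamp a device-reported count to what the driver can track. */
static unsigned clamp_const_buffers(unsigned reported)
{
   return reported < EXAMPLE_MAX_CONST_BUFS ? reported
                                            : EXAMPLE_MAX_CONST_BUFS;
}
```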
Kenneth Graunke 6892d2b94a iris: Clone before calling nir_strip and serializing
This is non-destructive and leaves the debugging information in place.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-05-29 18:16:32 +00:00
Kenneth Graunke e1409aead5 iris: Only store the SHA1 of the NIR in iris_uncompiled_shader
Jason pointed out that we don't need to keep an entire copy of the
serialized NIR around, we just need the SHA1.  This does change our
disk cache key to be taking a SHA1 of a SHA1, which is a bit odd,
but should work out and be faster and use less memory.

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-05-29 18:16:32 +00:00
Caio Marcelo de Oliveira Filho e45bf01940 spirv: Change spirv_to_nir() to return a nir_shader
spirv_to_nir() returned the nir_function corresponding to the
entrypoint, as a way to identify it.  There's now a bool is_entrypoint
in nir_function and also a helper function to get the entry_point from
a nir_shader.

The return type reflects better what the function name suggests.  It
also helps drivers avoid the mistake of reusing internal shader
references after running NIR_PASS on it.  When using NIR_TEST_CLONE or
NIR_TEST_SERIALIZE, those would be invalidated right in the first pass
executed.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-29 10:34:35 -07:00
Caio Marcelo de Oliveira Filho a3bfdacb6c radv: Don't re-use entry_point pointer from spirv_to_nir
Replace its uses with checking for is_entrypoint and calling
nir_shader_get_entrypoint().

This is a preparation to change spirv_to_nir() return type.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-29 10:34:35 -07:00
Caio Marcelo de Oliveira Filho ee59bac9f4 glspirv: Don't re-use entry_point pointer from spirv_to_nir
Replace its use with checking for is_entrypoint.

This is a preparation to change spirv_to_nir() return type.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-29 10:34:30 -07:00
Caio Marcelo de Oliveira Filho c92d002982 turnip: Don't re-use entry_point pointer from spirv_to_nir
Replace its uses with nir_shader_get_entrypoint(), and change the
helper function to return nir_shader *.

This is a preparation to change spirv_to_nir() return type.

Reviewed-by: Rob Clark <robdclark@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-29 10:26:22 -07:00
Chia-I Wu 0a0be7aee0 virgl: fix readback with pending transfers
When readback is true, and there are pending writes in the transfer
queue, we should flush to avoid reading back outdated data.  This
fixes piglit arb_copy_buffer/dlist and a subtest of
arb_copy_buffer/data-sync.

Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Alexandros Frantzis <alexandros.frantzis@collabora.com>
2019-05-29 16:47:04 +00:00
Caio Marcelo de Oliveira Filho 8bdf5a008b nir: Allow derefs to be used as phi sources
It is possible and valid for a pointer to be selected based on a
conditional before being used, and depending on the mode, those cases will
result in a phi with derefs as sources.

To achieve this, we don't rematerialize derefs that are used by phis.
As a consequence, when converting from SSA to regs, we may have derefs
that come from different blocks and are used by phis.  We now convert
those to regs too.

Validation was added to ensure only derefs of certain modes can be
used as phi sources.  No extra validation is needed for the presence
of a cast, since any instruction that uses derefs will validate that the
deref-chain is complete (ending in a cast or a var).

Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-05-29 08:19:15 -07:00
Connor Abbott ee2a92bcde radeonsi: Fix editorconfig
At least on vim, indenting doesn't work without this. Copied from
src/amd/vulkan.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-29 15:55:40 +02:00
Erik Faye-Lund 551b61528f mesa/main: clean up extension-check for GL_SAMPLE_MASK
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-05-29 10:54:09 +02:00
Erik Faye-Lund 426e896515 mesa/main: clean up extension-check for GL_SAMPLE_SHADING
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-05-29 10:54:09 +02:00
Erik Faye-Lund b9e9d701dc mesa/main: correct extension-checks for GL_PRIMITIVE_RESTART_FIXED_INDEX
This shouldn't be allowed in GLES 1/2.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-05-29 10:54:09 +02:00
Erik Faye-Lund 34ade0dc7c mesa/main: correct extension-checks for GL_BLEND_ADVANCED_COHERENT_KHR
KHR_blend_equation_advanced_coherent isn't exposed on OpenGL ES 1.x, so
we shouldn't allow its enums there either.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-05-29 10:54:09 +02:00
Erik Faye-Lund c0dabc6192 mesa/main: correct extension-checks for GL_FRAMEBUFFER_SRGB
This enum shouldn't be allowed on OpenGL ES 1.x, so let's instead
use the extension-helpers, and check for desktop and gles extensions
separately.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-05-29 10:54:09 +02:00
Erik Faye-Lund a33ff7876f mesa/main: correct extension-checks for MESA_tile_raster_order
This extension isn't enabled for GLES 1.x, so we shouldn't allow the
state there. Let's use the extension-helpers instead of CHECK_EXTENSION
for this.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-05-29 10:54:09 +02:00
Erik Faye-Lund bf91d6ae4a mesa/main: make the CONSERVATIVE_RASTERIZATION_NV checks consistent
This just makes the logic of the checks for this enum the same for
gl{Enable,Disable} and for glIsEnabled. They are already functionally
the same, so this is just a minor code-cleanup.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-05-29 10:54:09 +02:00
Erik Faye-Lund 00c683bc8e mesa/main: make the PRIMITIVE_RESTART_NV checks consistent
{En,Dis}ableClientState(PRIMITIVE_RESTART_NV) should only work on
compatibility contexts. While we're at it, modernize the code a bit,
by using the extension helpers instead of open-coding.

Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
2019-05-29 10:54:09 +02:00
Samuel Pitoiset d3771ccaa3 radv: use view format when selecting the resolve path for subpasses
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-29 08:53:48 +02:00
Samuel Pitoiset 017170a785 radv: always use view format when performing subpass resolves
It makes sense to use the image view formats when resolving
inside subpasses, while we have to use the image formats for
normal resolves.

Original patch by Philip Rebohle.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110348
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-29 08:53:46 +02:00
Samuel Pitoiset eaeaad25f7 radv: sync before resetting a pool if there are active pending queries
Make sure to sync all previous work if the given command buffer
has pending active queries. Otherwise the GPU might write query
data after the reset operation.

This fixes a bunch of new dEQP-VK.query_pool.* CTS failures.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-29 08:47:54 +02:00
Kenneth Graunke bc273dece2 intel/decoder: Use get_state_size() over guessed counts in more cases
This makes the following packets use actual driver provided sizes rather
than guessing an arbitrary number:

  - CC_VIEWPORT
  - SF_CLIP_VIEWPORT
  - BLEND_STATE
  - COLOR_CALC_STATE
  - SCISSOR_RECT

Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
2019-05-28 13:44:16 -07:00
Mike Lothian 29ea92e6a1 meson: Link Gallium drivers with ld_args_build_id
Link all Gallium drivers with ld_args_build_id to prevent failures in
Iris, which uses GNU_BUILD_ID.

Bugs: https://bugs.freedesktop.org/show_bug.cgi?id=110757
Fixes: 4756864cdc "iris: Start wiring up on-disk shader cache"

Signed-off-by: Mike Lothian <mike@fireburn.co.uk>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
2019-05-28 13:37:36 -07:00
Lionel Landwerlin 366811bedb nir/lower_non_uniform: safely iterate over blocks
This fixes a problem where the same instruction gets replaced twice.
This was happening when the replaced instruction would be at the end
of a block.

Replacement of:

   if ssa_8 {
                ....
      intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
   }

Would be:

   if ssa_8 {
      loop {
         vec1 32 ssa_47 = intrinsic read_first_invocation (ssa_44) ()
         vec1 1 ssa_48 = ieq ssa_47, ssa_44
         if ssa_48 {
            loop {
               vec1 32 ssa_49 = intrinsic read_first_invocation (ssa_44) ()
               vec1 1 ssa_50 = ieq ssa_49, ssa_44
               if ssa_50 {
                  intrinsic bindless_image_store (ssa_44, ssa_16, ssa_0, ssa_15) (5, 0, 34836, 32) /* image_dim=Buf */ /* image_array=false */ /* format=34836 */ /* access=32 */
                  break
               } else {
        ....
   }

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bd5457641 ("nir: Add a lowering pass for non-uniform resource access")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
2019-05-28 20:23:16 +01:00
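The hazard behind 366811bedb, walking a list while the walk itself inserts new elements, can be sketched with a toy linked list. This is not NIR code: `toy_block`, `toy_lower`, and both walkers are hypothetical, and the model assumes the moved instruction still looks like it needs lowering. The real pass replaced the instruction twice, whereas this toy keeps going until its pool of spare blocks runs out.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical block; 'needs_lowering' marks an instruction to rewrite. */
struct toy_block {
   struct toy_block *next;
   bool needs_lowering;
};

/* Lowering moves the instruction into a new block inserted right after
 * 'b'. In this toy the moved instruction still looks unlowered. */
static void toy_lower(struct toy_block *b, struct toy_block *fresh)
{
   fresh->needs_lowering = true;
   fresh->next = b->next;
   b->needs_lowering = false;
   b->next = fresh;
}

/* Unsafe walk: reads b->next after lowering, so it visits the new block
 * and lowers the same instruction again. Returns the lowering count. */
static int toy_walk_unsafe(struct toy_block *head, struct toy_block *pool,
                           size_t pool_len)
{
   int lowered = 0;
   size_t used = 0;
   for (struct toy_block *b = head; b; b = b->next) {
      if (b->needs_lowering && used < pool_len) {
         toy_lower(b, &pool[used++]);
         lowered++;
      }
   }
   return lowered;
}

/* Safe walk: remembers the successor before touching the list, so blocks
 * created during the walk are never visited. */
static int toy_walk_safe(struct toy_block *head, struct toy_block *pool,
                         size_t pool_len)
{
   int lowered = 0;
   size_t used = 0;
   struct toy_block *next;
   for (struct toy_block *b = head; b; b = next) {
      next = b->next;
      if (b->needs_lowering && used < pool_len) {
         toy_lower(b, &pool[used++]);
         lowered++;
      }
   }
   return lowered;
}
```

Saving the successor before modifying the list is the usual meaning of a "safe" iterator in C list macros.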
Samuel Pitoiset 47a10edefb radv: allocate more space in the CS when emitting events
If the driver waits for CP DMA to be idle and emits an EOP event,
we need more space.

This fixes a crash with Quake Champions.

Cc: <mesa-stable@lists.freedesktop.org>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-28 16:56:17 +02:00
Kenneth Graunke 6a9e39d44b iris: Ask st to vectorize our IO.
(Technically this is common code, but it doesn't affect i965 or anv.)

Improves performance of GFXBench5/gl_tess_off on Skylake GT4e at 1080p
by 9.3933% +/- 0.0305157% by eliminating all spilling in the GS.

Improves performance of GFXBench5/gl_4_off (Car Chase) on Skylake GT4e
at 1080p by 0.325208% +/- 0.0842233% (n=18).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-28 01:06:48 -07:00
Kenneth Graunke c31b4420e7 st/nir: Re-vectorize shader IO
We scalarize IO to enable further optimizations, such as propagating
constant components across shaders, eliminating dead components, and
so on.  This patch attempts to re-vectorize those operations after
the varying optimizations are done.

Intel GPUs are a scalar architecture, but IO operations work on whole
vec4's at a time, so we'd prefer to have a single IO load per vector
rather than 4 scalar IO loads.  This re-vectorization can help a lot.

Broadcom GPUs, however, really do want scalar IO.  radeonsi may want
this, or may want to leave it to LLVM.  So, we make a new flag in the
NIR compiler options struct, and key it off of that, allowing drivers
to pick.  (It's a bit awkward because we have per-stage settings, but
this is about IO between two stages...but I expect drivers to globally
prefer one way or the other.  We can adjust later if needed.)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-28 01:06:48 -07:00
Mathias Fröhlich 1d0a8cf40d mesa: Prevent classic swrast crash on a surfaceless context v2.
This fixes the egl_mesa_platform_surfaceless piglit test as well
as the new egl_ext_device_base piglit test on classic swrast.

v2: Fix swrast surfaceless contexts on the driver side.

Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Mathias Fröhlich <Mathias.Froehlich@web.de>
2019-05-28 08:27:16 +02:00
Samuel Pitoiset 15cb19ed6f radv: add radv_get_resolve_pipeline() in the compute path
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-28 08:17:26 +02:00
Samuel Pitoiset 469258c3b1 radv: cleanup the compute resolve path for subpass
This makes use of radv_meta_resolve_compute_image() by filling
a VkImageResolve region instead of duplicating code.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-05-28 08:17:23 +02:00
Timothy Arceri d2b0246741 radeonsi: add drirc workaround for American Truck Simulator
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110711
2019-05-28 08:47:44 +10:00
Timothy Arceri 11e16ca7ce Revert "st/mesa: expose 0 shader binary formats for compat profiles for Qt"
This reverts commit 55376cb31e.

It's been over a year and both Qt 5.9.5 and 5.11.0 contained a fix for the
original issue. It seems i965 only ever applied this workaround to the
18.0 branch.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-05-28 08:46:50 +10:00
Lionel Landwerlin 2042f22e28 anv: fix apply_pipeline_layout pass for arrays of YCbCr descriptors
When using the binding tables to access arrays of YCbCr descriptors we
did not consider the offset of the accessed element. We can't do a
simple multiplication because the binding table entries are tightly packed.

For example element 0 of the array could use 2 entries/planes and
element 1 could use 2 entries/planes.

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 3bb8768b9d ("anv: toggle on support for VK_EXT_ycbcr_image_arrays")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
2019-05-27 22:47:53 +01:00
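The reasoning in 2042f22e28 about tightly packed entries can be sketched as a running sum. This is an illustration, not the anv pass: `packed_entry_offset` and the plane counts used with it are hypothetical.

```c
#include <stddef.h>

/* Binding table offset of array element 'index' when element i occupies
 * planes[i] consecutive entries with no padding between elements. */
static unsigned packed_entry_offset(const unsigned *planes, size_t count,
                                    size_t index)
{
   unsigned offset = 0;
   for (size_t i = 0; i < index && i < count; i++)
      offset += planes[i];
   return offset;
}
```

For plane counts of 2, 3, and 1, element 2 starts at entry 5. Multiplying the index by a single per-element plane count cannot produce that, because the elements do not all have the same size.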
Marek Olšák fccced57cf radeonsi: clean up winsys creation
- unify the code
- choose radeon or amdgpu based on the DRM version, not based on which one
  succeeds first
2019-05-27 15:26:06 -04:00
Marek Olšák bb5d82bd06 radeonsi: allow query functions for compute-only contexts
2019-05-27 15:26:06 -04:00