Back when SSBOs were first enabled in i965, we tried to work around
issues where the CPU and GPU were incoherently writing to the same
cacheline by forcing an alignment such that different sections of
data would fall in different cachelines. This seems wrong.
On integrated GPUs with LLC, CPU and GPU writes should be coherent.
On integrated GPUs without LLC, we either enable snooping (so they
are again coherent), or we use WC maps (so the CPU cache isn't used).
Discrete GPUs always use WC maps (so the CPU cache isn't used).
This should work. In other words, I think the increased alignment
was just working around coherency problems on atoms that have been
fixed in the intervening 6 year time period.
Untyped surface messages require 4B alignment.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5016
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11727>
This is rarely useful, but after the next patch removes tiling tracking,
this would literally be the only difference between iris_bo_alloc and
iris_bo_alloc_tiled, so we may as well add it.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11169>
Based on a patch by Rafael Antognolli.
We already had a flags parameter, but omitted it from the simple alloc
interface because most callers were passing 0. However, we'll want to
use it for selecting between device local memory and system memory, and
possibly mmap cacheability modes, in the future. At that point, many
more callers will want to specify, so I think we should include flags
in iris_bo_alloc() as well.
A few places used the iris_bo_alloc_tiled() function simply to pass
flags, so this patch converts them to use iris_bo_alloc() instead now
it does everything they want.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11169>
This is enabled by enabling gallium's memobj capability.
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4337>
Enabling this fast path, while making CPU side simpler
(and thus reducing driver's CPU overhead), forces us to
generate multiple VERTEX_BUFFER_STATE packets pointing to
the same buffer, but with slightly shifted BufferStartingAddress.
This is fine on GFX version >= 11 and < 8, but on version
8 and 9, thanks to internal details of the VF cache, vertex
data from each VERTEX_BUFFER_STATE is cached separately
and this results in a lot of cache misses.
Disabling this fast path restores previous performance levels.
On my SKL GT2 machine this improves performance in:
- GLB27 Egypt offscreen by 37%
- GLB27 TRex offscreen by 22%
- gfxbench5 Manhattan offscreen by 1.75%
- gfxbench5 Manhattan31 offscreen by 1.9%
- gfxbench5 Aztec Ruins high by 2.3%
In Intel performance CI on GFX version 9 it improves performance in:
- gfxbench5 Manhattan offscreen by 1.65%
- gfxbench5 Aztec Ruins normal offscreen by 1.33%
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3277
Fixes: 42842306d3 ("mesa,st/mesa: add a fast path for non-static VAOs")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9857>
gen_device_info will try to use the most recent uAPI to get this
number and will fallback to this get_param. So just use what was
queries by gen_device_info_from_fd().
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9052>
This patch renames functions, structures, enums etc. with "gen_"
prefix defined in common code.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
Changes in this patch include:
- Rename all files in src/intel/common path
- Update the filenames used in source and build files
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
We should be exposing it in every driver, since it's required eventually
to reduce jank. Make drivers have to explicitly opt out instead of opt
in.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9088>
These hooks were written in the initial IRIS_MEASURE implementation.
Minor changes by Mark Janes <markjanes@swizzler.org> to adapt to the
INTEL_MEASURE reimplementation.
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7354>
Now that we store shader variants in the objects themselves rather
than a per-context hash table, they are actually global across
contexts. We can enable this feature.
This makes shaders shared across contexts, so apps can compile in
one and use it in another. This has always been allowed by GL,
but in the past we've simply recompiled the shaders in every context,
which is slow and painful.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7668>
It's hooked up in all the pipe wrapper drivers, and all the
frontends except a couple places in glx/xlib.
This enables a more efficient path for drivers which use
swrast's Present, but hardware rendering (e.g. d3d12, zink).
Reviewed-by: Dave Airlie <airlied@redhat.com>
Acked-by: Marek Olák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8045>
Screen is shared among contexts, other context might be already using
vtbl while another initializes it again.
==45872== Possible data race during write of size 8 at 0x5DDAE78 by thread #549
==45872== Locks held: 1, at address 0x5D1B6F8
==45872== at 0x6D66D91: gen9_init_state (iris_state.c:7816)
==45872== by 0x6BA0A31: iris_create_context (iris_context.c:342)
==45872== by 0x621F390: st_api_create_context (st_manager.c:917)
==45872== by 0x620E6F9: dri_create_context (dri_context.c:163)
==45872== by 0x6A40DB1: driCreateContextAttribs (dri_util.c:480)
==45872== by 0x540B963: dri2_create_context (egl_dri2.c:1583)
==45872== by 0x53FB84E: eglCreateContext (eglapi.c:821)
==45872==
==45872== This conflicts with a previous read of size 8 by thread #544
==45872== Locks held: 1, at address 0x5F6E0E0
==45872== at 0x6CB779E: blorp_alloc_binding_table (iris_blorp.c:167)
==45872== by 0x6CAEF70: blorp_emit_surface_states (blorp_genX_exec.h:1540)
==45872== by 0x6CB67F9: blorp_exec (blorp_genX_exec.h:2016)
==45872== by 0x6CB7AFE: iris_blorp_exec (iris_blorp.c:307)
==45872== by 0x70F5916: try_blorp_blit (blorp_blit.c:2145)
==45872== by 0x70F5FCA: do_blorp_blit (blorp_blit.c:2273)
==45872== by 0x70F778F: blorp_copy (blorp_blit.c:2803)
==45872== by 0x6BB9EB6: iris_copy_region (iris_blit.c:725)
v2: move as genX(init_screen_state) (Lionel)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7544>
This enables GL_EXT_semaphore feature.
v2:
* reversed previous commit that was conditionally setting the signal
fence capability if the syncobj was present
* reversed previous commit that was introducing a bool has_syncobj that
is not necessary anymore
v3:
* changed the signal function to use fence->seqno due to recent changes
to master
v4:
* changed the signal callback to use the new structs of the fences
backend (iris_fine_fence)
v5:
* removed check for ctx == NULL in iris_fence_signal and await functions
as at the time they are called we always have a context
* splitted a line to not exceed width
v6:
* put back the if(ctx) check in iris_fence_await, if this is an error
the fix should be in a different MR
Signed-off-by: Eleni Maria Stea <estea@igalia.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7042>
Now that we have the HDC, using the data cache for UBO pulls seems to
help things quite a bit:
GTA V DXVK 104.0%
Talos Principle GL 102.8%
Rise of Tomb Raider VK 102.8%
Dark Souls 3 DXVK 101.4%
Witcher3 DXVK 101.3%
Bioshock Infinite GL 100.5%
Doom 2016 VK 97.7%
Doom is a bit of a loss but it helps enough other stuff, it's probably
worth the hit.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7230>
This commit enables clover support for iris. It is intended as a
compiler developer tool and not as a new OpenCL implementation from
Intel. If you want competent OpenCL, we have a different open-source
driver for that built on our LLVM-based IGC compiler stack. However,
using clover with iris is becoming increasingly useful as a compiler
development tool and I'm getting tired of carrying the patches in a
private branch.
By default, clover will not initialize on iris. To enable clover, set
the IRIS_ENABLE_CLOVER environment variable to "1" or "true". As we've
done with the semi-sketchy platform support in ANV, it dumps a very loud
WARNING to stderr when enabled. Use at your own risk.
NOTE: To anyone intending to benchmark this, the performance is going to
be terrible and that is expected. This is in no way representative of
the Intel/NIR compiler stack. As it currently stands, clover passes
-O0 to clang when compiling OpenCL C to make SPIRV-LLVM-Transator work.
When compiling the SPIR-V, clover currently doesn't run any NIR
optimizations before it lowers memory access so any NIR optimizations
iris attempts to do are severely hampered. One day, clover will get a
NIR optimization loop or the ability to hand things off to the driver
per-lowering but today is not that day.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7047>
Use the same generators as used in anv driver so both Vulkan and OpenGL
drivers can share the same external memory objects.
v2: removed extra parameter from function gen_uuid_compute_device_id
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Signed-off-by: Eleni Maria Stea <estea@igalia.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7025>
It's included in declaration of INTEL_DEBUG.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6732>
Both of these are clover-only caps. We don't really support clover and,
even if we did, the number of address bits is wrong and we definitely
don't support the CL path for images.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6405>
All drivers that support mediump lowering should support 16BIT_TEMPS,
but some do not also want 16b consts to be lowered. Replace the pipe
cap in preperation to remove LowerPrecisionTemporaries.
Note: also updates reference checksums for the arm64_a630_traces job,
due to lowering more to 16b
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6189>
Adds PIPE_CAP_PRIMITIVE_RESTART_FIXED_INDEX which is a subset of the
primitive restart cap for when the hardware can only support the fixed
indices specified in GLES.
The switch statements were automatically modified with this command:
find \( \( -name \*.cpp -o -name \*.c \) \! -type l \) \
-exec sed -i -r \
's/^(\s*case\s+PIPE_CAP_PRIMITIVE_RESTART)\s*:.*$/\0\n\1_FIXED_INDEX:/' \
{} \;
v2: Add a note in screen.rst
Reviewed-by: Eric Anholt <eric@anholt.net> (v1)
Reviewed by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5559>
We reuse DRM file descriptors internally. Therefore when we export a
GEM handle we must do so in the file descriptor used externally.
This change also fixes a file descriptor leak of the FD given at
screen creation.
v2: Don't bother checking fd equals, they're always different
Fix dmabuf leak
Fix GEM handle leaks by tracking exported handles
v3: Check os_same_file_description error (Michel)
Don't create multiple exports for a given GEM table
v4: Add WARN_ONCE (Ken)
Rename external_fd to winsys_fd
v5: Remove export lock in favor of bufmgr's
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2882
Fixes: 7557f16059 ("iris: share buffer managers accross screens")
Tested-by: Eric Engestrom <eric@engestrom.ch>
Tested-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4861>
The CS compiler now produces multiple SIMD variants, so the previous
trade-off between "always using SIMD32" and "having a smaller max
invocations" is now gone.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5142>
A buffer added to all execbufs so that we can attribute a batch that
caused a hang to a particular driver.
v2: Reuse workaround BO
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3203>
We will later use the devinfo from iris_bufmgr, where we don't have
access to the screen pointer. And since we are moving it, we can reuse
it in Anv and i965.
v2: return error code and check for it on Anv (Lionel).
v3: Remove anv_gem_get_aperture() from anv_private.h and stubs (Lionel).
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5043>
this allows passing scissored clear calls through the driver where it can
be handled by a repclear shader
fixkwg/mesa#61
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4310>
It was implemented in 1df871f8ff, but to
really enable it we need to enable PIPE_CAP_DEPTH_BOUNDS_TEST.
v2: Add release notes (Ian).
Suggested-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4540>
Because St creates resources from a screen and attach them onto
another we need to ensure the resources associated to a screen &
bufmgr stay around until we don't need them anymore.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1373
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4086>
St happilly uses pipe_resources created with one screen with other
screens. Unfortunately our resources have a single identifier that
related to a given screen and its associated DRM file descriptor.
To workaround this, let's share the buffer manager between screens for
a given DRM device. That way handles are always valid.
v2: Don't forget to close the fd that bufmgr now owns
Take a copy of the fd to ensure it stays alive even if the dri
layer closes it
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: <mesa-stable@lists.freedesktop.org>
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1373
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4086>