If we lower mem constants first, then direct array accesses to
constants never get lowered, so do a constant fold pass first to
remove direct const array accesses.
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7209>
Thanks to Felix Degrood for discovering that we missed enabling this
additional caching on Tigerlake! Felix also benchmarked the changes.
We now use MOCS 48 (HDC:L1 + L3 + LLC) for render targets, textures,
and pull constant buffers. We leave storage buffers & images, as well
as stateless messages, using the previous MOCS 2 value. We can't use
HDC:L1 with atomics, and we don't know a priori whether storage buffers
will be used with atomics or not. Similarly, the Vulkan buffer device
address feature allows atomics to be performed on buffers via stateless
messages, and we only can control MOCS at the base address level, so
we can't do much there.
This is closer to what the Windows Vulkan and OpenGL drivers do,
though it isn't quite the same - they also disable LLC in some cases,
but we observed this to have noticable performance regressions when
we tried (though a couple titles benefited). We may try experiment
with that in the future.
Improves performance in a number of titles:
- Unreal Engine 4 Shooter Demo [VK]: 11.8%
- Witcher 3 [DXVK]: 3.9%
- Rise of the Tomb Raider [VK]: 1.5%
- Shadow of the Tomb Raider [VK]: 1.0%
- Grand Theft Auto V [DXVK]: 0.8%
We did not observe any performance regressions.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
On Gen12+, we can enable additional caches in certain usage situations.
This routes that decision making to a central place in ISL, based on
surface usage flags, and updates both drivers to use it. (i965 doesn't
need to change because it doesn't support Gen12.)
We continue handling the "external" decision via an anv_mocs() wrapper
for now, since we store that flag in anv_bo, which isl doesn't know
about. (We could introduce an ISL_SURF_USAGE_EXTERNAL, but I'm not
actually sure that would be cleaner.)
This patch should not have any functional nor performance effects, as
we continue selecting the exact same MOCS values for now.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
Most uses of this function deal with destination buffers, but for
copy_buffer_to_image, the buffer is the source, and isn't rendered
to. We should avoid setting ISL_SURF_USAGE_RENDER_TARGET_BIT.
Also, we should avoid setting ISL_SURF_USAGE_TEXTURE_BIT for the
destination, which isn't sampled from.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7104>
Given that we already link to Android's libsync, use it instead of using a
build-time dependency on libdrm for the KGSL path. This also would help
for older kernel compat with KGSL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6821>
The downstream Android kernels had a different API than was merged
upstream, and libsync on Android abstracts over that for us. Use their
sync_merge() and sync_wait(), at the cost of linking against libsync
(which Android.mk and meson both do).
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6821>
libsync.h is one of the common dependencies on libdrm from drivers that
otherwise don't want libdrm. Given that it's header-only code, just
import it to Mesa instead of forcing library dependencies on our users.
Copied from libdrm commit a84caff71be9 (" intel: Add PCI ID support to RKL
platform")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6821>
In SDK 25 (Android 7, 2016), the kernel only went up to around 4.4, so
the upstreamed sync API didn't exist, and libsync didn't expose
sync_merge().
In SDK 26 (Android 8, 2017), the kernel is sometimes bumped to 4.9 or
4.14, but libsync has a sync_merge() that operates on either the
downstream or the upstream API.
Since our android-targeting drivers in general use sync objects (requiring
a kernel newer than 7 got) and sync_merge() (suggesting interaction with
external fencing that started in Android 8), make CI build against an SDK
version with the sync API. I think really doing SDK 25 right at this
point would involve backporting mesa with some #ifdefs to not expose sync
APIs that wouldn't work.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Acked-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6821>
This header is used in anv and radv, and soon turnip. Since the script
just checks out master, this also bumps the headers to upstream
02dfcc7c1562 ("Merge "Merge Android R"")
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6821>
loader_dr3_helper.c uses xcb_xfixes_create_region() that requires dep_xcb_xfixes to link. This is dependent on with_platform_x11 and with_dri3.
But the source meson file does not set this up dependent on with_dri3.
The build was initialsed using platforms=x11 and gallium-drivers=zink,swrast.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7164>
It's a bit hard to exactly map our implementation to the limits
described by Vulkan. The bindless surface handle in the extended
message descriptors is 20 bits and it's an index into the table of
RENDER_SURFACE_STATE structs that starts at bindless surface base
address. This means that we can have at must 1M surface states
allocated at any given time. Since most image views take two
descriptors, this means we have a limit of about 500K image views.
However, since we allocate surface states at vkCreateImageView time,
this means our limit is actually something on the order of 500K image
views allocated at any time. The actual limit describe by Vulkan, on
the other hand, is a limit of how many you can have in a descriptor set.
Assuming anyone using 1M descriptors will be using the same image view
twice a bunch of times (or a bunch of null descriptors), we can safely
advertise a larger limit. 1M is what's required by D3D12, so let's
advertise that.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3335
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7180>
Zink doesn't support forwarding DRM modifiers yet, so whenever those are
used, we end up ignoring them. That's not going to do the right thing in
most cases, so let's reject them instead.
Since d686835171, the dri2 code tries to create a 0x0 surface without
any format when trying to import. This makes this go from
rendering-issues to asserting in debug builds, making things even worse.
Fixes: d686835171 ("gallium/dri2: Support I915_FORMAT_MOD_Y_TILED_GEN12_MC_CCS import")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3654
Reviewed-by: Daniel Stone <daniels@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7214>
According to the Vulkan specification, the
VK_COMMAND_BUFFER_USAGE_RENDER_PASS_CONTINUE_BIT flag will be ignored if
included in a VkCommandBufferBeginInfo for a primary command buffer.
This also implies pBeginInfo->pInheritanceInfo should not be read even
if the flag is present, and makes it legal to include the flag knowing
it will be ignored.
Signed-off-by: Ricardo Garcia <rgarcia@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7128>
There is a barrier between lds_pos_cull_* use and lds_pos_* use, so they
can occupy the same LDS space.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7172>
The system values are written into LDS after the new thread ID is known,
so it removes pointer indirection with the old thread ID.
Also, the LDS stores are skipped entirely if vertices are culled.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7172>
If we store the position into LDS after we know the new thread ID,
we don't need to remember the old thread ID.
The culling code only needs W, X/W, Y/W, so we have to keep those.
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7172>
Fix defect reported by Coverity Scan.
Uninitialized scalar variable (UNINIT)
uninit_use_in_call: Using uninitialized value time.tv_sec when calling free_stale_bos.
Fixes: f78c99f357 ("v3dv/bo: add a maximum size for the bo_cache and a envvar to configure it")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7154>
index is of type uint32_t.
Fix defect reported by Coverity Scan.
Macro compares unsigned to 0 (NO_EFFECT)
unsigned_compare: This greater-than-or-equal-to-zero comparison of
an unsigned value is always true. index >= 0U.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7153>
other than the vaguely gross case of primitive restart with incompatible
draw modes and/or restart index, this is no trouble since the buffer formats
are compatible
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7191>