now we have tracking for vbo binds and can automatically reapply the correct
barrier just before draw if the resource is modified after bind
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10849>
there's not actually any reason to do these during draw other than wasting
cpu cycles, and it has the side benefit of providing a bunch more info for rebinds
image resources for gfx shaders need to have their barriers deferred until draw time,
however, as it's unknown what layout they'll require until that point due to potentially
being bound to multiple descriptor types
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10849>
the descriptor layouts are the last component needed to create the pipeline
layout, so it makes sense to streamline setup by having the pipeline layout
created from a single place for both types of pipelines
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10849>
Enable EGL_KHR_mutable_render_buffer when below conditions are met:
1. Driver part implementation of this extension is available
2. Loader part implementation of this extension is available
3. ClientAPIs must be OpenGL ES bits (ES, ES2 or ES3)
4. Gralloc is cros gralloc and it supports front render usage query
(4) is optional as long as another gralloc supports similar query.
Upon window surface creation, if the surface type has mutable render
buffer, then append the cached front rendering usage to the existing
usage and properly set to the ANativeWindow.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10685>
Tested with low-lantency stylus apps with this extension enabled, no
regression on the cts.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10685>
Unlike front buffer used by big GL API for front rendering,
EGL_KHR_mutable_render_buffer together with ES redirects GL_BACK to the
front buffer.
This patch adds a fallback to use back buffer and ensures no behavior
change for unrelated frontends.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10685>
Note that CCS isn't ambiguated during a HiZ ambiguate. Dumping the CCS
surface after a HiZ ambiguate shows that the CCS is unchanged.
Fixes: 98dc7f56b7 ("intel/isl: Add a separate ISL_AUX_USAGE_HIZ_CCS_WT")
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9112>
Instead of installing the distribution package, build and install
locally, including the tests.
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10870>
Instead of storing the vertex mask per wave into LDS and then computing
the prefix sum, store 8-bit bitcounts (vertex counts) of the vertex masks
into LDS. This allows us to compute the sum using v_sad_u8, which computes
a sum of 4 i8vec4 components in one instruction.
Each i8vec4 of vertex counts is loaded in parallel threads (one dword
per thread) instead of all being loaded in thread 0, and readlane copies
them to SGPRs instead of readfirstlane.
LDS is no longer initialized before culling. Instead, the counts for
inactive waves are masked with AND later.
Incorrect old comments are also fixed.
This change removes 80 bytes from the code size, and it allows increasing
the workgroup size from 128 to 256. (which is the main motivation for this)
Now changing the workgroup size with wave64 has no effect on the code size.
Switching to wave32 with 8 waves even generates slightly smaller code than
wave64 with 4 waves.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
I don't know which change fixed this, but I can't reproduce the hang anymore.
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
Uncached access can be slow if the box is not aligned nicely.
Also, caching in L2 might enable bigger PCIe bursts.
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
If a shader has no param exports (no varyings), the pixel shader can start
after the VS position is written before the vertex shader finishes.
The fix is to wait for the memory stores before the position export.
The code needs to be restructured. First prepare param exports to get
nr_param_exports, then emit position exports with the wait, and then
emit param exports.
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>