I've read the papers on EWA filters and it seems like they calculate
DDQ = 2 * A after the scaling of A happens. Doing the same here seems
to make things less blurry and more like real aniso.
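A minimal sketch of the ordering described above, not the softpipe code
itself. The struct, the function name and LUT_SIZE are hypothetical; the
coefficient formulas are the usual EWA ellipse setup. The point is only
that ddq is derived from A after A has been scaled.

```c
#include <assert.h>

#define LUT_SIZE 1024 /* hypothetical weight-table size */

struct ewa_coeffs {
   float A, B, C;
   float ddq;
};

/* ux, vx, uy, vy are the texture-coordinate derivatives in texels. */
static struct ewa_coeffs
ewa_setup(float ux, float vx, float uy, float vy)
{
   struct ewa_coeffs e;

   e.A = vx * vx + vy * vy + 1.0f;
   e.B = -2.0f * (ux * vx + uy * vy);
   e.C = ux * ux + uy * uy + 1.0f;

   float F = e.A * e.C - e.B * e.B / 4.0f;
   assert(F > 0.0f);

   /* Scale the ellipse so that q indexes the weight table directly. */
   float scale = (float)(LUT_SIZE - 1) / F;
   e.A *= scale;
   e.B *= scale;
   e.C *= scale;

   /* Second difference of q along x, taken from the scaled A. */
   e.ddq = 2.0f * e.A;
   return e;
}
```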
Fixes: 2135aba8 ("softpipe: Constify variables")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11917>
We only actually use 4 bits, so we could shrink again. But this by
itself halves the memory usage for liveness analysis and halves the
copying/alloc/free.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11936>
Gets rid of the silly inheritance everywhere, which has caused _far_
more problems in practice than it has fixed. It was an idea I tried
before the pandemic. It didn't work. I'm finally cleaning it up.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11936>
Since p_elect doesn't have any operands, ACO's value numbering and/or
the pre-RA optimizer could currently recognize two p_elect instructions
in two different blocks as the same.
This patch adds exec as an operand to p_elect in order to achieve
correct behavior.
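A minimal sketch of why the missing operand matters, using a
hypothetical, heavily simplified instruction struct rather than ACO's
data structures. Value numbering treats two instructions as the same
when opcode and operands match, so an instruction with no operands
always matches itself. Modelling exec as a per-block id is an
assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_OPERANDS 4

struct instr {
   int opcode;
   unsigned num_operands;
   unsigned operands[MAX_OPERANDS]; /* ids of the values read */
};

/* Equality test of the kind a value-numbering pass relies on. */
static bool
instr_equal(const struct instr *a, const struct instr *b)
{
   assert(a->num_operands <= MAX_OPERANDS);
   assert(b->num_operands <= MAX_OPERANDS);

   if (a->opcode != b->opcode || a->num_operands != b->num_operands)
      return false;

   return memcmp(a->operands, b->operands,
                 a->num_operands * sizeof(a->operands[0])) == 0;
}
```

With zero operands the two instructions compare equal regardless of
which block they are in; once each one reads a different exec value
they no longer do.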
Fixes: e66f54e5c8
Closes: #5080
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11943>
This replaces some new/delete uses with malloc/free.
This is more consistent with most of the other glsl IR code but
more importantly it allows the game "Battle Block Theater" to
start working on some mesa drivers. The game overrides new and
ends up throwing an assert and crashing when it sees this
function calling new [0].
Note: The game still crashes with radeonsi due to similar conflicts
with LLVM.
CC: mesa-stable
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11907>
If ctx->scratch_buffer is NULL, there is no need to update the
SPI_TMPRING_SIZE register.
Signed-off-by: Yogesh mohan marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11900>
Whenever the scratch buffer is allocated, the current spi_tmpring_size
and the previous spi_tmpring_size cannot be the same, and hence
scratch_state will be set dirty as part of
"if (spi_tmpring_size != sctx->spi_tmpring_size)".
Remove the redundant dirty bit set while allocating the scratch buffer.
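A minimal sketch of the remaining check, with a hypothetical context
struct that only borrows the field names quoted above. It is not the
radeonsi code; it only shows that the comparison alone is enough to
mark the state dirty.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver context. */
struct scratch_ctx {
   uint32_t spi_tmpring_size;
   bool scratch_state_dirty;
};

static void
update_tmpring_size(struct scratch_ctx *sctx, uint32_t spi_tmpring_size)
{
   assert(sctx);

   /* A newly allocated buffer always changes the size, so this branch
    * already covers the allocation case.
    */
   if (spi_tmpring_size != sctx->spi_tmpring_size) {
      sctx->spi_tmpring_size = spi_tmpring_size;
      sctx->scratch_state_dirty = true;
   }
}
```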
Signed-off-by: Yogesh mohan marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11900>
We failed to translate the target type, which virgl needs for translation.
Also the read_only flag is for consts, shader inputs, and uniforms. The
access flag gives you the readonly qualifier.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11916>
The only reason we had to refcount semaphores was for the ancient
sync_file semaphores which we used for pre-syncobj kernels. Now that we
assume syncobj and that code is gone, we don't need reference counting
anymore either.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9777>
Sync object support for i915 has been in upstream Linux since 4.14, which
is 3.5 years old at this point and, as far as we can tell, it also
exists in all the ChromeOS kernels. Assuming it is present allows us to
drop some of our more gnarly synchronization fall-back paths.
At the time of merge, ChromeOS was on the following kernels:
- kernel 3.18: SKL
- kernel 4.4: BYT, KBL, APL
- kernel 4.14: BDW, GLK
All of the pre-4.14 kernels have had syncobj support back-ported.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9777>
This patch also adds a has_iadd3 bit so that a backend can say whether
it supports the ternary add instruction.
v2:
- Add patterns in late optimization (Connor Abbott)
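A minimal sketch of what gating a ternary-add rewrite on has_iadd3 can
look like. The expression struct, the enum and try_fuse_iadd3() are
hypothetical and much simpler than the real pattern machinery; only the
has_iadd3 name comes from this patch. The sketch also ignores whether
the inner add has other users.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum op { OP_VALUE, OP_IADD, OP_IADD3 };

struct expr {
   enum op op;
   struct expr *src[3];
};

struct compiler_options {
   bool has_iadd3;
};

/* Rewrite iadd(iadd(a, b), c) into iadd3(a, b, c) if the backend allows. */
static bool
try_fuse_iadd3(const struct compiler_options *opts, struct expr *e)
{
   assert(opts && e);

   if (!opts->has_iadd3 || e->op != OP_IADD)
      return false;

   struct expr *inner = e->src[0];
   if (!inner || inner->op != OP_IADD)
      return false;

   struct expr *c = e->src[1];
   e->op = OP_IADD3;
   e->src[0] = inner->src[0];
   e->src[1] = inner->src[1];
   e->src[2] = c;
   return true;
}
```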
Suggested-by: Alyssa/Jason
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>
This patch restructures the code a little to check whether a source can
be represented as an immediate operand. This is the foundation for the
next patch, which adds checks for integer operands as well.
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11596>
This requires more storage in the viewport struct, but it avoids
the need to repeatedly calculate the same transform if, e.g., a meta
operation occurs, which can save about 5% CPU in some cases.
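A minimal sketch of the idea, with hypothetical struct and function
names. The scale/translate formulas are the common viewport transform
and are an assumption here; any Y flip or depth-range variation is left
out. The transform is computed once when the viewport is set and then
read back wherever it is needed.

```c
#include <assert.h>

struct viewport {
   float x, y, width, height, min_depth, max_depth;
};

struct viewport_xform {
   float scale[3];
   float translate[3];
};

/* The extra storage: the transform lives next to the viewport. */
struct viewport_state {
   struct viewport vp;
   struct viewport_xform xform;
};

static void
set_viewport(struct viewport_state *state, const struct viewport *vp)
{
   assert(state && vp);

   state->vp = *vp;
   state->xform.scale[0] = vp->width * 0.5f;
   state->xform.translate[0] = vp->x + vp->width * 0.5f;
   state->xform.scale[1] = vp->height * 0.5f;
   state->xform.translate[1] = vp->y + vp->height * 0.5f;
   state->xform.scale[2] = vp->max_depth - vp->min_depth;
   state->xform.translate[2] = vp->min_depth;
}
```

Code that re-emits the viewport, for example around a meta operation,
can then copy state->xform instead of recomputing it.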
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11839>
This helps NGG GS and culling shaders.
No Fossil DB changes without NGG culling.
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11458>
Implement elect using a pseudo-op which is lowered during the
insert_exec_mask pass. This makes it possible to emit a more
optimal sequence when the exec mask is constant.
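A minimal sketch of the semantics, assuming the elected lane is the
lowest active one. elect_mask() is a hypothetical helper that models
the result as a lane mask; it is not the instruction sequence ACO
emits. It shows why a constant exec helps: with all lanes active the
result is simply the constant 1.

```c
#include <assert.h>
#include <stdint.h>

/* Return a mask with only the lowest active lane of exec set. */
static uint64_t
elect_mask(uint64_t exec)
{
   assert(exec != 0); /* at least one lane must be active */
   return exec & (~exec + 1);
}
```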
Fossil DB results on Sienna Cichlid:
Totals from 211 (0.16% of 128647) affected shaders:
CodeSize: 2254356 -> 2240468 (-0.62%); split: -0.62%, +0.00%
Instrs: 438471 -> 434996 (-0.79%); split: -0.80%, +0.01%
Latency: 2717082 -> 2709400 (-0.28%); split: -0.28%, +0.00%
InvThroughput: 566987 -> 566342 (-0.11%); split: -0.11%, +0.00%
Copies: 40058 -> 40162 (+0.26%)
Branches: 31209 -> 31211 (+0.01%)
PreSGPRs: 9927 -> 10125 (+1.99%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11458>
This allows our optimizer to recognize this and eliminate it when
it can prove that the s_and with exec is unneeded.
Fossil DB changes on Sienna Cichlid:
Totals from 1969 (1.53% of 128647) affected shaders:
CodeSize: 9468228 -> 9469348 (+0.01%); split: -0.00%, +0.01%
Instrs: 1773566 -> 1773581 (+0.00%); split: -0.01%, +0.01%
Latency: 19504042 -> 19503385 (-0.00%); split: -0.00%, +0.00%
InvThroughput: 3617406 -> 3617333 (-0.00%)
Copies: 108998 -> 110592 (+1.46%)
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11458>
Error handling with DRM_IOCTL_I915_QUERY is tricky and we got it wrong
in one of the two calls here. Use the common helper instead. This also
fixes a theoretical bug where calloc() fails. While we're here, inline
iris_bufmgr_update_meminfo because we're not really benefiting from
having it separate anymore.
Fixes: e60114b2ae "iris/bufmgr: Query memory region info."
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
Error handling with DRM_IOCTL_I915_QUERY is tricky and we got it wrong
in one of the two calls here. Use the common helper instead. This also
fixes a theoretical bug where calloc() fails. While we're here, inline
anv_track_meminfo because we're not really benefiting from having it
separate anymore.
Fixes: 65e8d72bc1 "anv: Query memory region info"
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
We also add a helper which contains the standard query+alloc+query
pattern used by anv_gem_get_engine_info(). The caller is required to
free the pointer.
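A minimal sketch of the query+alloc+query pattern, using a hypothetical
callback in place of the real ioctl so that it stays self-contained.
query_alloc() and fake_query() are illustration names, not the helpers
added by this patch. A length of 0 asks for the required size; the
second call fills the buffer.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the kernel query; returns 0 or a negative errno. */
typedef int (*query_fn)(void *data, int32_t *length);

/* Returns a calloc'ed buffer that the caller must free, or NULL. */
static void *
query_alloc(query_fn query, int32_t *out_length)
{
   assert(query && out_length);

   int32_t length = 0;
   if (query(NULL, &length) != 0 || length <= 0)
      return NULL;

   void *data = calloc(1, (size_t)length);
   if (!data)
      return NULL;

   if (query(data, &length) != 0) {
      free(data);
      return NULL;
   }

   *out_length = length;
   return data;
}

/* Fake query used only to exercise the helper. */
static int
fake_query(void *data, int32_t *length)
{
   static const char payload[] = "engines";

   if (*length == 0) {
      *length = (int32_t)sizeof(payload);
      return 0;
   }
   if (*length < (int32_t)sizeof(payload))
      return -EINVAL;

   memcpy(data, payload, sizeof(payload));
   *length = (int32_t)sizeof(payload);
   return 0;
}
```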
These are declared static inline not because we care about the
performance of these helpers but because we're going to use them in the
intel_device_info code and we don't want a link dependency.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
DRM_IOCTL_I915_QUERY is a multi-query. The most egregious errors are
returned via the usual ioctl error mechanism but there are also
per-query errors that are indicated by item.length < 0. We need to
handle those as well. While we're at it, scrape errno so we can return
a proper integer error.
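A minimal sketch of the two error layers, with a hypothetical item
struct and a callback standing in for the ioctl. It is not the anv
code and does not use the real uapi types; it only shows the ioctl
failure being turned into -errno and a negative item length being
passed through as the per-query error.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Stand-in for a query item, not the uapi struct. */
struct query_item {
   uint64_t query_id;
   int32_t length;
};

typedef int (*ioctl_fn)(struct query_item *item);

static int
query_checked(ioctl_fn do_ioctl, uint64_t query_id, int32_t *length)
{
   assert(do_ioctl && length);

   struct query_item item = { .query_id = query_id, .length = *length };

   if (do_ioctl(&item) < 0)
      return -errno;      /* whole-ioctl failure */
   if (item.length < 0)
      return item.length; /* per-query failure */

   *length = item.length;
   return 0;
}

/* Fake ioctls used only to exercise the three outcomes. */
static int
fake_ioctl_fail(struct query_item *item)
{
   (void)item;
   errno = ENODEV;
   return -1;
}

static int
fake_ioctl_item_error(struct query_item *item)
{
   item->length = -EINVAL;
   return 0;
}

static int
fake_ioctl_ok(struct query_item *item)
{
   item->length = 128;
   return 0;
}
```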
Fixes: c0d07c838a "anv: Support i915 query (DRM_IOCTL_I915_QUERY)..."
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
NGG culling is not compiled into shaders that can use multiple
viewports, so it's not necessary to check it here.
Fixes: 9a95f5487f
Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11910>