This fixes a longstanding bug in the interaction between TS and a write
mapping. The write does not update TS regardless of the way the update
is done. An update via etna_copy_resource would just set the target
ts_valid to false without actually writing back any dirty TS to the
resource. Writes via the CPU would update the resource, but keep
ts_valid at true even though the tile status may no longer match the
tiles actually written to the resource.
Fix this by writing back a dirty TS to the target resource if needed
before updating the level with the write data. Always invalidate TS,
even when the update is done by the CPU.
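The ordering described above can be sketched with a toy model. This is
only an illustration: the struct, the per-tile "cleared" flag and all
toy_* names are hypothetical and are not the etnaviv data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_TILES 4

/* Hypothetical toy resource, NOT the etnaviv structures. */
struct toy_resource {
   uint32_t tiles[NUM_TILES];   /* tile contents in memory */
   bool ts_cleared[NUM_TILES];  /* TS says: tile is cleared, memory is stale */
   uint32_t clear_value;
   bool ts_valid;
};

/* Write the state tracked by the TS back into tile memory. */
static void toy_resolve_ts(struct toy_resource *rsc)
{
   if (!rsc->ts_valid)
      return;

   for (int i = 0; i < NUM_TILES; i++) {
      if (rsc->ts_cleared[i]) {
         rsc->tiles[i] = rsc->clear_value;
         rsc->ts_cleared[i] = false;
      }
   }
}

/* Write to one tile, in the order the commit message describes. */
static void toy_write_tile(struct toy_resource *rsc, int tile, uint32_t data)
{
   assert(tile >= 0 && tile < NUM_TILES);

   toy_resolve_ts(rsc);      /* 1. write back a dirty TS first */
   rsc->tiles[tile] = data;  /* 2. update the level with the write data */
   rsc->ts_valid = false;    /* 3. always invalidate the TS */
}
```

In this model, skipping step 1 would lose the clear value of every tile
that was only tracked in the TS, and skipping step 3 would leave a TS
that no longer describes the written tile.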
Cc: mesa-stable
Signed-off-by: Lucas Stach <l.stach@pengutronix.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19846>
The source_root() function is deprecated as of Meson 0.56.0 because
it returns the source root of the parent project if called from a
subproject.
Why would anyone need Mesa as a Meson subproject?
It would be used as a subproject in a project generated by the command
buffer "decompiler" for Freedreno.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Eric Engestrom <eric@igalia.com>
Reviewed-by: David Heidelberg <david.heidelberg@collabora.com>
Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19901>
radeonsi uses a packed location base while radv uses un-packed locations.
So we adjust instance_rate_inputs in each driver to hide the difference.
Note the attribute slot number is less than 16, so we can shift
instance_rate_inputs in radv by VERT_ATTRIB_GENERIC0 which is 16.
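A minimal sketch of the two conventions, assuming the mask is 32 bits
wide and that the un-packed mask is obtained by shifting left. The
helper names are hypothetical; only VERT_ATTRIB_GENERIC0 == 16 comes
from the text above.

```c
#include <assert.h>
#include <stdint.h>

/* Value stated in the commit message. */
#define VERT_ATTRIB_GENERIC0 16

/* Packed convention: bit i describes attribute slot i. */
static uint32_t packed_instance_rate_mask(const unsigned *slots, unsigned count)
{
   uint32_t mask = 0;

   for (unsigned i = 0; i < count; i++) {
      assert(slots[i] < 16);
      mask |= UINT32_C(1) << slots[i];
   }
   return mask;
}

/* Un-packed convention: bit (VERT_ATTRIB_GENERIC0 + i) describes slot i.
 * Because every slot is below 16, the shifted mask still fits in 32 bits. */
static uint32_t unpacked_instance_rate_mask(uint32_t packed)
{
   assert((packed >> 16) == 0);
   return packed << VERT_ATTRIB_GENERIC0;
}

/* Test whether the input at `location` is instance-rate in `mask`. */
static int is_instance_rate(uint32_t mask, unsigned location)
{
   return (mask >> location) & 1;
}
```

Shared code can then test bit `location` in either driver without
knowing which convention the driver uses.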
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19868>
We need to set CPS_MODE_NONE when there is no per-coarse-pixel dispatch.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 231651fd89 ("anv: implement VK_KHR_fragment_shading_rate")
Reviewed-by: José Roberto de Souza <jose.souza@intel.com>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19867>
Consider the loop:
    float i = 0.0;
    while (true) {
        if (i != 0.0)
            break;
        i = i + 1.0;
    }
This loop clearly executes exactly one time.
Some trickery is necessary to handle cases where the initial loop value
is very large and the increment is, by comparison, very small. From the
fneu_once test case,
    float i = -604462909807314587353088.0;
    while (true) {
        if (i != -604462909807314587353088.0)
            break;
        i = i + 36028797018963968.0;
    }
This loop should also execute exactly once, but this is much more
challenging to calculate due to precision issues.
Going towards smaller magnitude (i.e., adding a small positive value to
a large negative value) requires a smaller delta to make a difference
than going towards a larger magnitude. For this reason,
-604462909807314587353088.0 + 36028797018963968.0 !=
-604462909807314587353088.0, but -604462909807314587353088.0 +
-36028797018963968.0 == -604462909807314587353088.0. Math class is
tough.
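The asymmetry can be checked directly. The sketch below assumes IEEE 754
single precision with round-to-nearest-even; the constant and function
names are made up for illustration. The two constants are -2^79 and
2^55. Just below 2^79 in magnitude the float spacing is 2^55, so the
step towards zero is exactly representable. Just above 2^79 the spacing
is 2^56, so the step away from zero is a tie that rounds back to -2^79.

```c
#include <assert.h>
#include <stdbool.h>

/* -2^79 and 2^55, the constants quoted above. */
static const float BIG_NEGATIVE = -604462909807314587353088.0f;
static const float SMALL_STEP = 36028797018963968.0f;

/* Returns true when adding `step` to `start` produces a different float.
 * The volatile store forces the sum to be rounded to single precision. */
static bool step_changes_value(float start, float step)
{
   volatile float sum = start + step;

   return sum != start;
}
```

Here step_changes_value(BIG_NEGATIVE, SMALL_STEP) is true, while
step_changes_value(BIG_NEGATIVE, -SMALL_STEP) is false.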
No changes in shader-db or fossil-db.
v2: Fix major bug in checking result of the eval_const_binop(nir_op_feq,
...) discovered while developing fneu_once_easy unit test. Fix a typo in
the comment just above that. Add fneu_once_easy test.
v3: Skip the iteration count adjustment tests for nir_op_fneu and
nir_op_ine. Since the iteration count is either 1 or unknown, all this
function can do is add numerical error. Add fneu_once tests.
v4: Change the initial value in the fneu_once test from large positive
to large negative. Change check in get_iteration from nir_op_fsub to
nir_op_fadd. Both changes from discussion with M Henning. Also add some
more explanation in fneu_once.
v5: Rename test cases.
Fixes: 6772a17acc ("nir: Add a loop analysis pass")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19732>
I discovered this problem because adding an algebraic transformation to
convert some uge and ult to ieq or ine caused a couple loops to stop
unrolling. Consider the loop:
    uint i = 0;
    while (true) {
        if (i >= 1)
            break;
        i++;
    }
This loop clearly executes exactly one time. Note that uge(x, 1) is
equivalent to ine(x, 0). Changing the condition to 'if (i != 0)' will
also execute exactly one time.
In the added test cases, uge_once correctly gets an exact loop trip count
of 1. Without the changes to nir_loop_analyze.c, the ine_once case
detects a maximum loop trip count of zero and does not get an exact loop
trip count.
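As a sanity check of the reasoning above, both forms of the loop can be
run on the CPU. This is a plain C sketch, not the NIR loop analysis
itself. The helper names are hypothetical, and the 1000-trip guard
exists only to keep the sketch from looping forever.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

static bool uge(uint32_t a, uint32_t b) { return a >= b; }
static bool ine(uint32_t a, uint32_t b) { return a != b; }

/* For unsigned x, uge(x, 1) and ine(x, 0) must give the same answer. */
static bool conditions_agree(uint32_t x)
{
   return uge(x, 1) == ine(x, 0);
}

/* Count how many times the increment runs before the break is taken. */
static unsigned count_trips(bool (*exit_cond)(uint32_t, uint32_t), uint32_t rhs)
{
   unsigned trips = 0;
   uint32_t i = 0;

   while (trips < 1000) {
      if (exit_cond(i, rhs))
         break;
      i++;
      trips++;
   }
   return trips;
}
```

Both count_trips(uge, 1) and count_trips(ine, 0) return 1, which is the
exact trip count the analysis is expected to find for either condition.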
No changes in shader-db or fossil-db.
v2: Move nir_op_fneu changes to a separate commit.
v3: Rename test cases.
Fixes: 6772a17acc ("nir: Add a loop analysis pass")
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19732>
This test comes from a comment in the loop analysis code.
The ine_zero test checks that zero iteration loops involving ine are
correctly identified.
v2: Add ine_zero test. Suggested by Tim.
v3: Rename test cases.
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19732>
Fedora's gcc 12.2.1 says:
../src/egl/main/eglapi.c: In function ‘eglDupNativeFenceFDANDROID’:
../src/egl/main/eglapi.c:2268:11: warning: ‘ret’ may be used uninitialized [-Wmaybe-uninitialized]
2268 | EGLint ret;
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19913>
Regalloc can sort it out later. No significant change in shader-db; the
one-instruction reduction is likely because some optimization pass can
actually work better when we are closer to SSA-like form.
RV530:
total instructions in shared programs: 133718 -> 133717 (<.01%)
instructions in affected programs: 47 -> 46 (-2.13%)
helped: 1
HURT: 0
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19853>
Back when we had a stupid register allocator we did a lot of tricks to
optimize the register usage. The old version of rc_find_free_temporary
did a full program search each time it was called to find out what
registers and channels are actually used and then used that info to give
us the first free register to use.
Now that we have a proper register allocator both for vertex and
fragment shaders, this is no longer needed. Just scan the program when
called for the first time to find the first unused temporary index and
then increment by one every time. Regalloc can sort it out later.
No change in shader-db confirms this assumption is sound.
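The simplified scheme can be sketched as follows. The struct and the
toy_* names are hypothetical stand-ins, not the r300 compiler
structures. The sketch also assumes that "first unused temporary index"
means one past the highest index the program references.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the compiler state. */
struct toy_program {
   const unsigned *used_temps;  /* temporary indices the program references */
   size_t num_used;
   bool scanned;                /* has the one-time scan happened yet? */
   unsigned next_free;
};

static unsigned toy_find_free_temporary(struct toy_program *prog)
{
   if (!prog->scanned) {
      /* One-time scan: find one past the highest used index. */
      unsigned first_unused = 0;

      for (size_t i = 0; i < prog->num_used; i++) {
         if (prog->used_temps[i] >= first_unused)
            first_unused = prog->used_temps[i] + 1;
      }
      prog->next_free = first_unused;
      prog->scanned = true;
   }

   /* Every later call just hands out the next index. */
   return prog->next_free++;
}
```

Holes below the highest used index are never reused here. That is the
point of the change: packing registers tightly is left to the register
allocator.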
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19853>
The trick for emulating MSAA clear by adjusting blit coords tends to
fall over with tiled/ubwc, so just use the fallback path instead.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19884>
We have these native. Passes the relevant piglits. Large reduction in memory
usage on Xonotic on higher settings (8x less memory per texture), which allows
Xonotic to run at high settings without OOMing.
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Tested-by: Asahi Lina <lina@asahilina.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19903>
There seems to be a problem when running Firefox under Xwayland:
shared resources are not always tagged as using staging. As a result,
one process tries to map a resource that was allocated as one that uses
staging without actually using the staging resource. The mapped range
then only covers the small region that we have to allocate because a
zero-sized allocation doesn't work, but the application mapping the
resource assumes that a properly sized range is mapped, and this
results in invalid memory access.
To work around this issue, disable creating staging for resources that
are created using shared binding. It is not clear to me whether this
is the best fix, but it seems to quell the issue.
Fixes: c9d99b7eec ("virgl: Fix texture transfers by using a staging resource")
Related: https://gitlab.freedesktop.org/virgl/virglrenderer/-/issues/291
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19655>
Currently the maximum number of slice (tile) parameters is copied for
AV1. Copy only the actual number of slice parameters.
Signed-off-by: Sajeesh Sidharthan <sajeesh.sidharthan@amd.com>
Reviewed-by: Boyuan Zhang <Boyuan.Zhang@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19878>
Fixes crucible test func.shader.dualsrc_mrt0_undef on polaris10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 22.3 mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19806>
Fixes crucible tests func.shader.dualsrc_mrt0_undef on navi21 and
func.shader.dualsrc_mrt1_undef on polaris10.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Cc: 22.3 mesa-stable
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19806>
binder_realloc() -> iris_bo_alloc() is setting 4096 as the flags
parameter. Up to now this is harmless as there is no BO_ALLOC flag that
uses bit 12, but it is better to avoid any future issues.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19898>
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19766>
We now depend on NIR doing the right thing. It was not able to
handle the few cases where NIR failed anyway (and even if it did,
such complex cases would hit the instruction limit later).
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19766>
R300/R400 GPUs can't do it in hardware and all the lowering should have
happened in NIR already, so there is no point in wasting CPU time just
to abort later when emitting.
Reduces CPU time for a dEQP run by ~25% for RV370. The wallclock time is
now just slightly above 1 minute at 10 threads, mostly determined by the
long-running dEQP-GLES2.functional.flush_finish.* tests.
Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com>
Reviewed-by: Filip Gawin <filip@gawin.net>
Tested-by: Filip Gawin <filip@gawin.net>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19766>