according to spec, this is supposed to handle fragment shader fetch
from previous draw output, not color output readback from previous
color output write
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10706>
this is much easier to read and is going to greatly simplify the eventual
multidraw implementation which will be dropped in
also it allows moving conditionals outside of loops to very slightly improve
drawoverhead performance (with multidraw)
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10701>
instead of manually creating the ivci, there's already a util function for
that which will handle everything
also only set layer info if there are multiple layers
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10703>
The current policy is to always favor accumulators if possible, however,
this is not always optimal.
Particularly, accumulators play a crucial role in enabling QPU instruction
merges, since these are limited to both the ADD and the ALU instructions
addressing at most 2 physical registers. For 2-src instructions, this means
that to be able to merge we need them to address at least 2 accumulators.
While favoring accumulators does help the case for instruction merges in
general, it is risky to assign accumulators to variables that have
long life spans. Doing so will make the accumulator unavailable for
any other instructions during that life span, and since we only have a few
accumulators, we can quickly run out and losing our capacity to merge
instructions for large parts of the qpu program.
On the other hand, we also want to avoid the extreme case were we keep
allocating physical registers to the point we run out, even if we have
accumulators available, since accumulators have additional restrictions
and may not be suitable for everything.
This change continues the policy of favoring accumulators, but it only
does so if the life span of the temps is short, to ensure that we can
recycle accumulators often across instructions and avoid running out
for sections of the QPU code, unless we are already running out of
physical registers.
total instructions in shared programs: 13654647 -> 13336921 (-2.33%)
instructions in affected programs: 11015919 -> 10698193 (-2.88%)
helped: 39758
HURT: 17325
Instructions are helped.
total threads in shared programs: 412046 -> 412038 (<.01%)
threads in affected programs: 16 -> 8 (-50.00%)
helped: 0
HURT: 4
Threads are HURT.
total uniforms in shared programs: 3745726 -> 3746003 (<.01%)
uniforms in affected programs: 17296 -> 17573 (1.60%)
helped: 76
HURT: 99
Uniforms are HURT.
total max-temps in shared programs: 2364430 -> 2359942 (-0.19%)
max-temps in affected programs: 109117 -> 104629 (-4.11%)
helped: 2893
HURT: 772
Max-temps are helped.
total spills in shared programs: 5727 -> 5746 (0.33%)
spills in affected programs: 221 -> 240 (8.60%)
helped: 1
HURT: 2
total fills in shared programs: 13121 -> 13139 (0.14%)
fills in affected programs: 466 -> 484 (3.86%)
helped: 1
HURT: 2
total sfu-stalls in shared programs: 33432 -> 34491 (3.17%)
sfu-stalls in affected programs: 18219 -> 19278 (5.81%)
helped: 4459
HURT: 5087
Inconclusive result
total inst-and-stalls in shared programs: 13688079 -> 13371412 (-2.31%)
inst-and-stalls in affected programs: 11030017 -> 10713350 (-2.87%)
helped: 39630
HURT: 17429
Inst-and-stalls are helped.
total nops in shared programs: 335753 -> 333708 (-0.61%)
nops in affected programs: 112659 -> 110614 (-1.82%)
helped: 8726
HURT: 7383
Inconclusive result
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10686>
Make progType a constructor argument.
Fix defect reported by Coverity Scan.
Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member progType is not initialized
in this constructor nor in any functions that it calls.
Suggested-by: Ilia Mirkin <imirkin@alum.mit.edu>
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10562>
Fix defect reported by Coverity Scan.
Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member tail is not initialized in
this constructor nor in any functions that it calls.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10561>
Fix defect reported by Coverity Scan.
Dereference before null check (REVERSE_INULL)
check_after_deref: Null-checking bsp_bo suggests that it may be
null, but it has already been dereferenced on all paths leading
to the check.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10564>
Fix defect reported by Coverity Scan.
Uninitialized scalar field (UNINIT_CTOR)
uninit_member: Non-static class member tag is not initialized in
this constructor nor in any functions that it calls.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10563>
Fixes the following error on Ubuntu 18.04 with GCC 7.
src/gallium/drivers/nouveau/codegen/nv50_ir_from_tgsi.cpp:1448:72:
sorry, unimplemented: non-trivial designated initializers not supported
info_out->prop.cp.gmem[gmemSlot++] = {.valid = 1, .slot = i};
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Fixes: f451854f3 ("nv50: add remapping of buffers/images into unified space")
Signed-off-by: Ernst Sjöstrand <ernstp@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10668>
Blending in SP_BLEND_CNTL is not a binary flag but the same
mask as in RB_BLEND_CNTL. It is a per-mrt enable bit for blending.
Copied form a6xx, on a5xx it should be have the same since it seems
to have the same structure layout.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10682>
Blending in SP_BLEND_CNTL is not a binary flag but the same
mask as in RB_BLEND_CNTL. It is a per-mrt enable bit for blending.
Example SP_BLEND_CNTL produced by blob on a630 and different MRT
blendings:
SP_BLEND_CNTL: { UNK8 | 0x6 }
SP_BLEND_CNTL: { ENABLED | UNK8 | 0xe }
(Decoded before this commit)
Fixes mis-rendering with D3D11 game "Spelunky 2".
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10682>
We can do stuff like this:
vec1 32 ssa_207 = fsum3 ssa_209.xxx
In this case, we'd end up not swizzling in get_alu_src, and reading
components out-of-bounds, which LLVM isn't very happy about, and thus
takes punitive actions, in the form of a segfault.
We don't want that, and we already know from the opcode what the
component counts should be here.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10690>
This is undefined behaviour in the hardware but perfectly legal NIR, so
lower to an if ladder predicated on the lane ID (the blob's preferred
strategy).
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10022>
New dynamic states added for VK_EXT_extended_dynamic_state2 causes
GPU hangs with vkd3d-proton.
Fixes: 7bdd569d7e ("radv: extend the dirty bits to 64-bit")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10693>
This pass was originally developed for Panfrost, where it passes the
relevant dEQP tests. Upstreaming so it can be extended and then shared
with:
* Asahi, for blending
* Zink, for logic ops
* Lavapipe, for advanced blending
Note that using this with MRT in a fragment shader (as non-panfrost
drivers will) has not yet been tested. Logic ops with integer
framebuffers are probably todo. It's been enough for Panfrost, will
suffice for ES2 on Asahi, and provides an upstream base for kusma's work
on advanced blending, so overall the merge is a net benefit.
v2: Remove bogus assert that the format layout is PLAIN. We need to
render R11G11B10, which Mesa reports as layout OTHER. The code is still
correct.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> [v1]
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10601>