Frequently, we want to iterate over the SSA variables read by an instruction,
while skipping over constants and uniforms. Add and use a macro for this
pattern. A few redundant asserts are removed.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Now that variables have flat naming (no more SSA/reg distinction), we can just
use .value directly. This cleans up the RA a bit.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Now that non-SSA nodes are an internal implementation detail of RA, the
entrypoints should be made private to make sure no other passes use the
RA-specific liveness analysis.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
We've already broken SSA irreparably at this point. We can just change what
"normal" means and avoid the disjointed indexing.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
nir_invalidate_divergence was introduced in aa765c54bd ("pan/bi: Add divergent
intrinsic lowering pass"), which needed divergence information for a late NIR
pass but not otherwise in the backend. Unfortunately, nir_convert_from_ssa is
less aggressive about coalescing when divergence information makes it look like
the coalescing would make the code worse -- even though that's not actually an
issue on Mali.
Now that we don't use nir_convert_from_ssa, we don't need the hack.
Likewise for the vec.reg hack, which was introduced because the split/collect
cache relies on SSA form.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Don't call nir_convert_from_ssa, preserve the SSA from NIR. This gets us real
SSA for all frontend passes. The RA becomes a black box that takes SSA in and
outputs register allocated code out. The "broken" SSA form never escapes RA,
which means we can clean up special cases in the rest of the compiler. It also
gets us ready for exploiting SSA form in the RA itself.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Adapted from NIR's liveness analysis. This is different from our non-SSA
liveness pass for a few reasons:
1. It must handle phi nodes. This implies significant changes to the worklist
algorithm.
2. It only handles SSA. It doesn't need funny labelling schemes for
handling nir_registers in parallel with SSA defs.
3. It is scalar-only. The vector liveness information isn't interesting when
vectors are handled via COLLECT and SPLIT. This means it uses a bitset (uses
8x less memory to store livenss information, should be easier on the caches
too).
Eventually, this will become our only pre-RA liveness pass. For now, both passes
are maintained in parallel: the SSA pass used before out-of-SSA, the non-SSA
pass used after out-of-SSA and before RA, and the post-RA pass used after RA.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Pre-RA scheduling can move instructions around in nontrivial ways, yet it must
maintain the IR's ordering invariants (for preloads, phis, etc.) Running
validation before and after makes scheduler bugs more obvious.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
We can get BRANCHZ.i16 since we lower JUMP early. This seems to have worked
before mainly by chance. With the change to how we optimize, we can get code
sequences like:
block2 {
BRANCHZ.i16.eq u256, u256 -> block5
BRANCHZ.i16.eq u256, u256 -> block4
} -> block5 from block1
which would choke the assert.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
The new SSA-based dead code elimination won't know how to deal with code of that
funny form. In actuality it doesn't have to, so we just make sure the optimizer
tests produce valid IR so this works as expected.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Now that we have accurate source/dest counts, we don't need to worst case. Which
is good, because the worst case will become infinite once we allow phis.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
This is yet another case where we add a source, which will require reallocation.
It's easy enough to rebuild the instruction here.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
..where necessary to avoid appending sources. This removes a bunch more cases
where we would need to reallocate sources. The lifetime management is simpler if
we just reallocate the whole instruction. In theory this is a slight overhead
but it's probably worth it to avoid the complexity otherwise.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
When we add a source, we need to build a new instruction (or at the very least
reallocate sources). This is less of a hack anyway.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
Changing I->nr_srcs or I->nr_dests directly is generally unsafe, but the special
case of removing sources/destinations at the end is safe. Add and use helpers to
wrap this operation simplifying the remaining code audit before we can
dynamically allocate sources/destinations.
At this point in the series, nothing modifies I->nr_dests after allocation
except these helpers, so destinations should be safe to make dynamic. There's a
bit more work needed for sources.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
This avoids the nr_srcs/nr_dests setting dance, and will allow the builder to
handle the required memory allocation when we switch to dynamic src/dest
allocation.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
This will allow pseudo instructions to be allocated in a cleaner way when we
dynamically allocate their sources/destinationa.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
We guarantee this now, no need to check it in every pass. There's an exception
on Bifrost after clause scheduling, but few passes run after clause scheduling
so this doesn't affect much.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>
There are two "shapes" of TEXC in the IR:
* Regular texturing. This TEXC writes a single set of staging registers.
* Dual texturing. This TEXC writes two sets of staging registers.
Currently we model both with a 2-destination TEXC, with a null second
destination for the usual case where dual texturing isn't used. This is awkward.
To make the "shapes" of instructions more predictable, make TEXC only write a
single set of staging registers (like the hardware instruction) and split off a
TEXC_DUAL pseudoinstruction for the second case, lowered late.
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17794>