Commit Graph

443 Commits

Author SHA1 Message Date
Alyssa Rosenzweig bb6c43027e agx: Reserve live-in regs at the start of block
...Rather than reserving the union of the registers live-out of the
predecessors. This avoids reserving registers that are killed along a
control flow edge (where the predecessor has another successor that does
use the register).

glmark2 subset of shaderdb:

total instructions in shared programs: 6442 -> 6440 (-0.03%)
instructions in affected programs: 42 -> 40 (-4.76%)
helped: 1
HURT: 0

total bytes in shared programs: 42186 -> 42174 (-0.03%)
bytes in affected programs: 270 -> 258 (-4.44%)
helped: 1
HURT: 0

total halfregs in shared programs: 1769 -> 1757 (-0.68%)
halfregs in affected programs: 75 -> 63 (-16.00%)
helped: 3
HURT: 0
helped stats (abs) min: 4.0 max: 4.0 x̄: 4.00 x̃: 4
helped stats (rel) min: 16.00% max: 16.00% x̄: 16.00% x̃: 16.00%

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig de6e11b848 agx: Pass in max regs as a paramter to RA
This will allow us to restrict max regs later.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig 68f89d4cc5 agx: Introduce ra_ctx data structure
We have more parameters to pass, this will get unwieldly otherwise.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig bcb2cf9688 agx: Write to r0l with a "nesting" instruction
This avoids modeling the r0l register explicitly in the IR, which would
complicate RA for little benefit at this stage. Do the simplest thing
that could possibly work in SSA.

glmark2 subset.

total instructions in shared programs: 6442 -> 6442 (0.00%)
instructions in affected programs: 701 -> 701 (0.00%)
helped: 4
HURT: 5
helped stats (abs) min: 1.0 max: 3.0 x̄: 2.00 x̃: 2
helped stats (rel) min: 1.46% max: 7.69% x̄: 4.03% x̃: 3.48%
HURT stats (abs)   min: 1.0 max: 3.0 x̄: 1.60 x̃: 1
HURT stats (rel)   min: 0.81% max: 7.41% x̄: 2.67% x̃: 1.14%
95% mean confidence interval for instructions value: -1.58 1.58
95% mean confidence interval for instructions %-change: -3.70% 3.08%
Inconclusive result (value mean confidence interval includes 0).

total bytes in shared programs: 42196 -> 42186 (-0.02%)
bytes in affected programs: 7768 -> 7758 (-0.13%)
helped: 8
HURT: 5
helped stats (abs) min: 2.0 max: 18.0 x̄: 7.25 x̃: 4
helped stats (rel) min: 0.13% max: 7.26% x̄: 2.02% x̃: 0.97%
HURT stats (abs)   min: 6.0 max: 18.0 x̄: 9.60 x̃: 6
HURT stats (rel)   min: 0.82% max: 6.32% x̄: 2.37% x̃: 1.02%
95% mean confidence interval for bytes value: -7.02 5.48
95% mean confidence interval for bytes %-change: -2.30% 1.63%
Inconclusive result (value mean confidence interval includes 0).

total halfregs in shared programs: 1926 -> 1769 (-8.15%)
halfregs in affected programs: 1395 -> 1238 (-11.25%)
helped: 71
HURT: 0
helped stats (abs) min: 1.0 max: 10.0 x̄: 2.21 x̃: 2
helped stats (rel) min: 1.92% max: 52.63% x̄: 15.33% x̃: 11.76%
95% mean confidence interval for halfregs value: -2.69 -1.73
95% mean confidence interval for halfregs %-change: -17.98% -12.68%
Halfregs are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig c9a96d4615 agx: Preload vertex/instance ID only at start
This means we don't reserve the registers, which improves RA
considerably. Using a special preload psuedo-op instead of a regular
move allows us to constrain semantics and gaurantee coalescing.

shader-db on glmark2 subset:

total instructions in shared programs: 6448 -> 6442 (-0.09%)
instructions in affected programs: 230 -> 224 (-2.61%)
helped: 4
HURT: 0

total bytes in shared programs: 42232 -> 42196 (-0.09%)
bytes in affected programs: 1530 -> 1494 (-2.35%)
helped: 4
HURT: 0

total halfregs in shared programs: 2291 -> 1926 (-15.93%)
halfregs in affected programs: 2185 -> 1820 (-16.70%)
helped: 75
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig f665229d77 agx: Print agx_dim appropriately
Easier to read, and gets us closer to proper disasm in Mesa.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig 6c95572ef0 agx: Print instructions as "dest = src"
This makes the dataflow easier to read, especially with splits and
collects (which take variable numbers of sources/destinations).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig 72a1e1f33f agx: Emit trap at pack-time, not during isel
This makes the shaderdb stats make more sense.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig 1dcaade3e2 agx: Rename "combine" to "collect"
For consistency with ir3 and bifrost.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig 82e8e709cb agx: Dynamically size split instruction
This is more flexible.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig 7c9fba34bc agx: Switch to dynamic allocation of srcs/dests
So we can handle parallel copies later.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig 544c60a132 agx: Improve printing of immediate sources
For floats, decode the float. Regardless, the size speciifer is
redundant.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig c2bc8c1384 agx: Don't prefix pseudo-ops
It's not really buying us anything and it clutters the IR.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig 40f0ac2082 agx: Emit smaller combines for nir_op_vec2/3
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig 6a183a9ffd agx: Add iterators for phi/non-phi instructions
We know that phi nodes are always at the start (this is asserted in
agx_validate and a fundamental invariant of SSA form). That means we can
cheaply iterate all n phi nodes forward (or n non-phi nodes backwards)
in O(n) time. We already open code this idiom in a few places, use
common iterators instead so we don't need to justify in random places.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18804>
2022-10-14 01:37:39 +00:00
Alyssa Rosenzweig 6689d67603 asahi: Remove no-direct-packing
It's weird.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>
2022-10-13 18:06:52 -04:00
Alyssa Rosenzweig ea58edaafb asahi: Use a header more like Intel's GenXML
We're trying to converge on a common schema.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>
2022-10-13 18:06:52 -04:00
Alyssa Rosenzweig ab2d5deec2 asahi,panfrost: Remove exact attribute
Not used, although in the future it might be...

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>
2022-10-13 18:06:52 -04:00
Alyssa Rosenzweig a64e38b0aa panfrost,asahi: Remove unused function
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>
2022-10-13 18:06:51 -04:00
Alyssa Rosenzweig 0f24c8ef5f panfrost,asahi: Remove unused prepare macro
Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>
2022-10-13 18:06:51 -04:00
Alyssa Rosenzweig 0302519f1c asahi/genxml: Defeature uint/float
Unused, relic from panfrost and not in upstream genxml.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>
2022-10-13 18:06:51 -04:00
Alyssa Rosenzweig 8eefda4ea9 asahi: Eliminate "Pixel Format" type from GenXML
This is leaky and hurts compatibility with upstream GenXML. Just use the
actual hardware fields.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18922>
2022-10-13 18:06:51 -04:00
Alyssa Rosenzweig deb3810f1e agx: Remove load_kernel_input path
Unused and now won't be used.

Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18658>
2022-10-05 16:09:21 +00:00
Alyssa Rosenzweig c17fcbaa2f agx: Account for mask when writing registers
To use fewer registers.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>
2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig 5cd2371318 agx: Pass mask into ld/st_tile instructions
Properly handle render target formats with <4 components.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>
2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig 640fd089a2 agx: Ensure that the optimizer sees legitimate SSA
Expecting it to keep around unused definitions around is wishful. Add an
"anchoring" unit_test instruction to consume the results so they don't
have to be precoloured registers.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>
2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig 52467c2d1e agx: Test fsat+f2f16 together
Something I hit when mucking with this pass.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>
2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig 3e86522cf2 agx: Validate immediates
In particular the new sizing rules.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>
2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig 14f2be1f33 agx: Use 16-bit immediates
This is slightly more accurate in the IR, and means we instruction
select the current 16-bit size floating point instructions when all
non-immediate operands are 16-bit.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>
2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig e302e5d527 agx: Emit fewer combines for intrinsics
A bunch of the emitted combines were unnecessary, or unnecessarily
large. Fix the accounting now that combines are variable size.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>
2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig e887a11b06 agx: Fix bfi_mask packing
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18687>
2022-09-22 03:23:36 +00:00
Alyssa Rosenzweig a1faab0b90 agx: Convert and clamp array indices in NIR
..Rather than at backend IR translation time. This is considerably
simpler because we can use the txs lowering instead of special casing
array sizes. Unfortunately it generates worse code, but that gap should
close once nir_opt_preamble is wired in.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18652>
2022-09-19 16:14:24 +00:00
Alyssa Rosenzweig bcd75a13e0 asahi: Identify shared memory layouts
Somehow maps to the tile size. Not sure about the details yet.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig b8b3c9fa2a asahi: Identify pixel stride
Number of bytes in a pixel in the tilebuffer, does not depend on the
tile size.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig 933a9e350e asahi: Overhaul USC control packing
Break up the monolithic SET_SHADER_EXTENDED packet into the separate
underlying commands (some only 2-byte sized and aligned), and add a
builder for USC control streams like we did for PPP updates to make that
change manageable.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig 35d5558fa5 asahi/genxml: Overflow up to words when packing
So we can pack things that aren't 4-byte sized. Note this doesn't help
with alignment.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig 22d3756207 asahi: Consolidate magic numbers for USC controls
Aka "pipeline" states. It's another command/control stream.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig 09cc736c42 asahi: Identify shared memory fields
For compute kernels, this encodes how much workgroup-local memory is
used ("shared memory" or "threadgroup memory" or "local memory"). This
memory is partitioned by the hardware.

For fragment shaders, this... encodes exactly the same thing. There is
no traditional tilebuffer in AGX, instead local memory is interpreted as
an imageblock, where each workgroup is a tile. This is a nifty design.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig 2fbe1ae09c asahi: Identify spill buffer histogram
Histogram of sizes of the spill buffer, with logarithmic bucket sizes
(relative to the amount spilled from the perspective of a single thread).
Pretty funny.

Also mark a few unknowns that are nonzero when spilling is used.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:37 -04:00
Alyssa Rosenzweig adfd213241 asahi: Decode IOGPU compute header
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:25 -04:00
Alyssa Rosenzweig a9c26df462 asahi: Identify IOGPU compute header
Much simpler than the graphics one.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:25 -04:00
Alyssa Rosenzweig 58d138334d asahi: Shuffle IOGPU structs
We need the header to be common between gfx and compute, but everything
else seems to be different. Shuffle so we can decode compute without any
terrible hacks.

I don't know the exact layout and don't care: the layout of the fields
here is all software defined in macOS, even though the *values* are
defined by hardware (or firmware in a few cases).

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:25 -04:00
Alyssa Rosenzweig 287a0d4f40 asahi: Decode CDM commands separate from VDM
This gets correct handling of CDM stream link/terminate, which are
encoded in a slightly different way from VDM.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:25 -04:00
Alyssa Rosenzweig 4e8a586fd3 asahi: Identify CDM block types
Same enum as PowerVR CDM, annoyingly different from the VDM block types.
Split out the stream link / terminate structs (both observed with Metal
for copious amounts of compute), in preparation for decoding "properly".

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:25 -04:00
Alyssa Rosenzweig 1400733320 asahi: Identify ZLS Control word from PowerVR
We're into the cr.xml file now, which is the blob that gets passed
through the kernel.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:25 -04:00
Alyssa Rosenzweig b0f8639382 asahi: Assert cache line alignment on Z/S buffers
Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18623>
2022-09-18 10:34:25 -04:00
Alyssa Rosenzweig 89e0f54422 agx: Don't use nir_find_variable_with_driver_location
io_semantics is the preferred alternative.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
2022-09-13 16:04:29 +00:00
Alyssa Rosenzweig 0883f0b302 agx: Lower txs to a descriptor crawl
There's no native txs instruction... but we can emulate one :-) This is
heavy on shader ALU, but in the production driver, it'll all be hoisted
up to the preamble shader and so it shouldn't matter much. This
keeps the driver itself simple and low overhead, with a completely
obvious generalization to bindless.

Passes dEQP-GLES3.functional.shaders.texture_functions.texturesize.*

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
2022-09-13 16:04:29 +00:00
Alyssa Rosenzweig bc4f418cb4 agx: Implement load_global(_constant)
Found in compute shaders, maps to a subset of device_load, and will be
used for some lowerings soon.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
2022-09-13 16:04:29 +00:00
Alyssa Rosenzweig 965cc62bdd agx: Implement txd
Handles all cases except for cube maps, which don't seem to work
properly, so those are lowered.

Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18525>
2022-09-13 16:04:29 +00:00