Commit Graph

904 Commits

Author SHA1 Message Date
Alyssa Rosenzweig f29d03a1f9 pan/midgard: Fix corner case in RA
It doesn't really matter but... meh.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig d722b60191 pan/midgard: Add OP_IS_CSEL_V helper
..to distinguish from scalar csel.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig 01316719cf pan/midgard: Expose mir_get/set_swizzle
The scheduler would like to use these.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:24 -07:00
Alyssa Rosenzweig 3f757425a4 pan/midgard: Extract instruction sizing helper
The scheduler shouldn't need to worry about this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:23 -07:00
Alyssa Rosenzweig bbe2914967 pan/midgard: Factor out mir_is_scalar
This helper doesn't need to be in the giant loop.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:23 -07:00
Alyssa Rosenzweig 67909c8ff2 pan/midgard: Count shader-db stats by bundled instructions
This does not affect shaders in any way. Rather, it makes the shader-db
instruction count recorded in the compiler accurate with the in-order
scheduler, matching up with what we calculate from pandecode.

Though shaders are the same, instruction counts cannot be compared
across this commit for this reason.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-30 15:50:22 -07:00
Boris Brezillon 938c5b0148 panfrost: Use ralloc() to allocate instructions to avoid leaking those objs
Instructions attached to blocks are never explicitly freed. Let's
use ralloc() to attach those objects to the compiler context so that
they are automatically freed when the ctx object is freed.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-28 17:50:01 +02:00
Boris Brezillon 2734a4951e Revert "panfrost: Free all block/instruction objects before leaving midgard_compile_shader_nir()"
This reverts commit 5882e0def9.

This commit causes a segfault.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
2019-08-27 20:07:28 +02:00
Boris Brezillon 0142dcb990 panfrost: Make sure bundle.instructions[] contains valid instructions
Add an assert() in schedule_bundle() to make sure all instruction
pointers in bundle.instructions[] are valid.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-27 16:50:52 +02:00
Boris Brezillon 5882e0def9 panfrost: Free all block/instruction objects before leaving midgard_compile_shader_nir()
Right now we're leaking all block and instruction objects allocated by
the compiler. Let's clean things up before leaving
midgard_compile_shader_nir().

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-27 16:50:52 +02:00
Boris Brezillon 3ac49f135a panfrost: Free the instruction object in mir_remove_instruction()
To avoid memory leaks.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-27 16:50:52 +02:00
Alyssa Rosenzweig c30116a2fa pan/midgard: Fix invert fusing with r26
The invert wasn't applying (correctly) due to the issues addressed here.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-26 13:43:04 -07:00
Alyssa Rosenzweig 75b6be2435 pan/midgard: Fold ssa_args into midgard_instruction
This is just a bit of refactoring to simplify MIR.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-26 13:43:04 -07:00
Alyssa Rosenzweig 9c328ea66e pan/midgard: Add imov->fmov optimization
When moving constants, if switching to a floating-point representation
doesn't break anything, we'd rather have an fmov than an imov,
permitting inlining the constant in many circumstances.

total quadwords in shared programs: 3408 -> 3366 (-1.23%)
quadwords in affected programs: 1188 -> 1146 (-3.54%)
helped: 41
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.02 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 9.65% x̃: 11.11%
95% mean confidence interval for quadwords value: -1.07 -0.98
95% mean confidence interval for quadwords %-change: -11.38% -7.93%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-26 11:42:33 -07:00
Alyssa Rosenzweig 0acb5c1774 pan/midgard: Switch constants to uint32
Storing constants as float doesn't make sense when we have integer
instructions; better to switch to be integer natively and coerce to/from
float rather than the opposite.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-26 11:42:32 -07:00
Alyssa Rosenzweig 85cc78a624 pan/midgard, bifrost: Set lower_fdph = true
fdph instructions show up in some desktop GL shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-26 07:47:01 -07:00
Alyssa Rosenzweig 20ac0b8e4e pan/midgard: Analyze helper invocations
We check for texture ops which calculate derivatives (either explicitly
via dFd* or implicitly) and mark the shader as requiring helper
invocations.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-23 15:51:25 -07:00
Alyssa Rosenzweig 272ce6f5a7 pan/midgard: Fix writeout combining
shader-db regression in the scheduler.

Fixes: dff4986b1a ("pan/midgard: Emit store_output branch just-in-time")

total bundles in shared programs: 2055 -> 2019 (-1.75%)
bundles in affected programs: 1055 -> 1019 (-3.41%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.35% max: 20.00% x̄: 6.71% x̃: 5.16%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -8.45% -4.97%
Bundles are helped.

total quadwords in shared programs: 3444 -> 3408 (-1.05%)
quadwords in affected programs: 1897 -> 1861 (-1.90%)
helped: 36
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 14.29% x̄: 3.97% x̃: 2.99%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -5.08% -2.86%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 14:03:23 -07:00
Alyssa Rosenzweig f162adc32b pan/midgard: Disassemble integer constants in hex
It's usually easier to parse mentally.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:51:55 -07:00
Alyssa Rosenzweig b89cb0dba6 pan/midgard: Explain ffma
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:51:39 -07:00
Alyssa Rosenzweig 19d58a299b pan/midgard: Analyze simple loads/store
For shaders using exclusively direct attribute/varyings, we can work
this out statically. For shaders with indirect access, we just set an
upper bound of 16 (the max attributes/varyings we support) and the
actual count will be reported regardless.

We proceed similarly for textures/samplers, as well as for UBOs. While
UBOs can be *indexed* indirectly, the *UBO itself* -- which is what we
count in the shader descriptor (rather than the UBO descriptors) -- is
statically determinable.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:51:21 -07:00
Alyssa Rosenzweig a89e368c7f pan/midgard: Compute work_count via writes
This is exact.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:50:57 -07:00
Alyssa Rosenzweig b9fb63859e pan/midgard: Sketch static analysis to uniform count
This one is a little tricky, but the idea is that:

   r16-r23 are always uniforms

   r8-r15 are sometimes work, sometimes uniforms...
      ...but as work, they are always written before use
      ...and as uniforms, they are never written before use

So we use that heuristic to determine the count to feed the machine.
We'll record work register use in the next commit.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:50:40 -07:00
Alyssa Rosenzweig 58fc260312 pan/decode: Hoist shader-db stats to shared decode
We'll want all this information to validate the shader descriptor.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-22 12:50:14 -07:00
Alyssa Rosenzweig 3c01a6928a pan/midgard,bifrost: Expand nir_const_load_to_arr
Panfrost is the only user of the macro; we are better off expanding than
having random stuff in nir.h.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
2019-08-22 12:24:13 -07:00
Alyssa Rosenzweig 7f14916372 pan/midgard: Identify and disassemble indirect texture/sampler
A pair of special flags can turn the texture/sampler handle fields into
register selects. This means code like:

   texture(uTextures[hr28.w], ...)

can be compiled to something like:

   texture ..., fsampler[hr28.w], texture[hr28.w]

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:41:15 -07:00
Alyssa Rosenzweig 8c1bc3c000 pan/midgard: Breakout texture reg select printer
This data structure is shared in other parts of the texture word, so
let's streamline printing.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:41:15 -07:00
Alyssa Rosenzweig 14a2032f0f pan/midgard: Mark fallthrough explicitly
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig d0b9f094fd pan/midgard: Simplify contradictory check.
Coverity.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig 91a5b2657d pan/midgard: Reorder bits check to fix 8-bit masks
Coverity.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig 6189274f57 pan/midgard: Represent unused nodes by ~0
This allows nodes to be unsigned and prevents a class of weird
signedness bugs identified by Coverity.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:31 -07:00
Alyssa Rosenzweig 52ac7dc5d0 pan/midgard: Allocate `dependencies` on stack
It's small; this way we don't leak memory.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:30 -07:00
Alyssa Rosenzweig bf036e127f pan/midgard: Free liveness info
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-21 10:38:30 -07:00
Alyssa Rosenzweig 2bb4dc4054 pan/midgard: Compute liveness per-block
Rather than using a regalloc based on live internals, computed hastily
with repeated invocations of a forward-analysis pass, we switch to
compute liveness information on a per-block basis.

Within a given basic block, we compute liveness backwards with a
linear-time algorithm; for common shaders, this may help RA terminate
quicker.

Across blocks, we use a work list (really a work set) and check if we're
making progress. This isn't terribly efficient, but it gets the job
done. Point is, we get the live_in/live_out for each block.

From there, it's simple to rerun the linear-time update algorithm to
compute the interference graph.

The benefit of this technique is the ability to ignore "gaps" in
liveness across intermediate blocks that are never executed. On simple
shaders like the loops in glmark, this results in a minor reduction in
register pressure. The motivation was a complex shader in Krita that
failed register allocation due to an unfortunate interaction between
texture pipeline registers and control flow. This shader now compiles
successfully.

total instructions in shared programs: 3439 -> 3438 (-0.03%)
instructions in affected programs: 22 -> 21 (-4.55%)
helped: 1
HURT: 0

total bundles in shared programs: 2077 -> 2076 (-0.05%)
bundles in affected programs: 12 -> 11 (-8.33%)
helped: 1
HURT: 0

total quadwords in shared programs: 3457 -> 3456 (-0.03%)
quadwords in affected programs: 20 -> 19 (-5.00%)
helped: 1
HURT: 0

total registers in shared programs: 341 -> 338 (-0.88%)
registers in affected programs: 9 -> 6 (-33.33%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 33.33% max: 33.33% x̄: 33.33% x̃: 33.33%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 24c91bb54b pan/midgard: Analyze load/store for swizzle propagation
If there's a nontrivial swizzle fed into an extra (shortened) argument,
we bail on copyprop. No glmark changes (since it doesn't use fancy
texturing/loads).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 9ae4d3653e pan/midgard: Treat cubemaps "stores" as loads
It's always been ambiguous which they are, but their primary register is
their output, not their input; therefore, they are loads.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 20dd482668 pan/midgard: Clamp cubemap swizzle to XYXX
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 2788721cc4 pan/midgard: Clamp st_vary swizzle by number of components
Same issue with liveness analysis. If we store out a vec3, we should not
reference the .w component.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig edc8e41566 pan/midgard: Use type-appropriate swizzle for texture coordinate
The texture coordinate for a 2D texture could be a vec2 or a vec3,
depending if it's an array texture or not. If it's vec2 (non-array
texture), we should not reference the z component; otherwise, liveness
analysis will get very confused when z is never written.

v2: Fix typo (Ilia).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 2bcb3d9226 pan/midgard: Set mask for lowered read-hazard moves
If we need to lower a move for a read from a vec2 texture coordinate, we
shouldn't write zw, even incidentally.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 739e09c297 pan/midgard: Fix texw lowering with complex control flow
Fixes shaders with control flow like:

   out = 0;

   if (A) {
      if (B)
         out = texture(A, ...)
   } else {
      out = texture(B, ...)
   }

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 6f1c8c148d pan/midgard: Add mir_rewrite_index_dst_single helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig d68019ad1f pan/midgard: Print predecessors in MIR
Just as a sanity check.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig e3a418fe86 pan/midgard: Index blocks for printing
Better than having pointers flying about.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 2f92479ffc pan/midgard: Add mir_foreach_src
This is repeated often enough.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 84580c6dbc pan/midgard: Add mir_foreach_instr_in_block_rev
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig c8c4471a92 pan/midgard: Add mir_foreach_successor helper
Now we should be able to walk the control-flow graph naturally.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig b8e526c520 pan/midgard: Add mir_foreach_predecessor utility
It's ugly, but c'est la vie.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig b4b2e111f8 pan/midgard: Link exit block
The exit block has been 'dangling' in the successors graph, so let's
ensure it's linked in.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 07c960cac0 pan/midgard: Add mir_exit_block helper
The exit block is gauranteed to be empty, signaling the end of the
program.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig aeeeef1242 pan/midgard: Maintain block predecessor set
While we already compute the successors array, for backwards data flow
analysis, it is useful to walk the control flow graph backwards based on
predecessors, so let's compute that information as well.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 4fa09329c1 pan/midgard: Use ralloc on ctx/blocks
This will allow us to get some level of automatic memory management.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig b59b1793b8 pan/midgard: Shrink successors[] to 2 length
A block can't have more.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-19 08:32:17 -07:00
Alyssa Rosenzweig 44a6c38bd6 panfrost: Implement native RECT textures
We started honouring the normalized_coords flag in the texture
descriptor, but a bisection revealed that broke RECT textures -- since
we were *also* lowering them in the shader. So just remove the
shader-based lowering, use native RECT textures, and enjoy the nominal
reduction in complexity and performance boost.

Fixes: 3e47a1181b ("panfrost: Add MALI_SAMP_NORM_COORDS flag")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:42 -07:00
Alyssa Rosenzweig e823a47f02 pan/midgard: Disassemble UBO index explicitly
It's a bit of a special case but that's fine.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig 3d54ed2488 pan/midgard: Account for unaligned UBOs when promoting uniforms
We only know how to promote aligned accesses, although theoretically we
should be able to promote unaligned to swizzles in the future. Check
this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig 03350eb8b8 pan/midgard: Add mir_ubo_shift helper
Different UBO reads have different shift requirements.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig cf3bb10f51 pan/midgard: Address emit_ubo_read offset in bytes
We'll want to be smarter about unaligned reads, so let's get this code
all in one place.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig 65e6cb4eb0 pan/midgard: Wire writemask into UBO reads
Helps the disassembly be clearer and maybe regalloc be smarter.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig ec2f0b580f pan/midgard: Identify UBO/SSBO op symmetry
It's the same thing, just shifted.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:57:24 -07:00
Alyssa Rosenzweig dff4986b1a pan/midgard: Emit store_output branch just-in-time
We'll need multiple branches for MRT, so we can't defer. Also, we need
to track dependencies to ensure r0 is set to the correct value for each
store_output.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig 2fc44c4dc8 pan/midgard: Add dont_eliminate flag
We need to treat fragment writes specially.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 16:42:40 -07:00
Alyssa Rosenzweig 6f4d796911 pan/midgard: Fix disassembly termination condition
Fixes: 863bdd1f8d ("pan/midgard: Break, not return, in disassembler")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig 1ab6290746 pan/midgard: Improve disassembler robustness
Some memory corruption / etc issues let to an accidental "fuzzing" of
the disassembler ;) This uncovered some issues leading to a disassembler
hang, so let's fix that.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 15:09:17 -07:00
Alyssa Rosenzweig 6c84a2665c pan/midgard: Allocate spill_slot once
Multiple spill moves share a single spill slot. Issue found in Krita.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 14:58:34 -07:00
Alyssa Rosenzweig 2a9031ea44 pan/midgard: Use hint on midgard_instruction for spill_move
This allows us to have multiple spill moves, whereas otherwise for N
spill moves, the first N-1 would be clobbered. Issue found in Krita.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 14:58:34 -07:00
Alyssa Rosenzweig c4a4f3db5a pan/midgard: Prefix blobber-db output for grepping
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 10:31:09 -07:00
Alyssa Rosenzweig 5f0f9e1333 pan/midgard: Implement blobber-db
We wire through some shader-db-style stats on the current shader in the
disassemble so we can get a quick estimate of shader complexity from a
trace.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Suggested-by: Rob Clark <robdclark@chromium.org>
2019-08-14 10:31:09 -07:00
Alyssa Rosenzweig 863bdd1f8d pan/midgard: Break, not return, in disassembler
We'll want to dump some stats after the shader, and I refuse to use one
teensy little goto.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-14 10:31:09 -07:00
Alyssa Rosenzweig b1965831e4 pan/midgard: Handle 64-bit address in mir_mask_of_read_components
This is a bit of a hack, but it'll hold us over until we have 64-bit
support wired through.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:03 -07:00
Alyssa Rosenzweig 41e68094f8 pan/midgard: Allocate separate spill indices for lowered moves
This helps RA be slightly more reasonable.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:03 -07:00
Alyssa Rosenzweig 14b5b9ac38 pan/midgard: Extend liveness analysis to trinary ops
Fixes RA fails with multiple indirect SSBO writes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:03 -07:00
Alyssa Rosenzweig c690b37d76 pan/midgard: Fix load/store pairing
This used a delicate hack to try to find indirect inputs and skip them
as candidates for pairing. Let's use a better criterion -- no sources --
and pair based on that.

We could do better, but that would require more complex data flow
analysis than we're interested in doing here.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig 15954ab6ca pan/midgard: Implement nir_intrinsic_load_num_work_groups
Just a sysval to route through.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig 7229af794b pan/midgard: Implement some compute builtins
We implement gl_WorkGroupID and gl_LocalInvocationID, which map to
ld_compute_id with special sources.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig 2b4e579585 pan/midgard: Rename ld_global_id -> ld_compute_id
It's used for more general loads within a compute shader.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:02 -07:00
Alyssa Rosenzweig a5059f2cba pan/midgard: Handle partial writes in liveness analysis
This allows liveness analysis within a loop to be more fine grained,
fixing RA failures with partial spilled movs within a loop, as well as
enabling a slight reduction of register pressure more generally:

total registers in shared programs: 350 -> 347 (-0.86%)
registers in affected programs: 12 -> 9 (-25.00%)
helped: 3
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig e333bf606f pan/midgard: Dump "no spill"?
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig cc3df917d3 pan/midgard: Absorb nonexistance sources
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig 0a7cc239bd pan/midgard: Pretty-print destinations
They're not "sources" but they follow the same conventions.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig ba8ec19a64 pan/midgard: Pretty-print units
Since we are seeing some use of MIR post-scheduling, let's get this
printed right.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig 73f54f286a pan/midgard: Print mask in dumped MIR
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig 2ec4f9a74b pan/midgard: Add no_spill flag
Hint for the RA to avoid infinite spilling loops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig 7090971f2f pan/midgard: Generalize mir_mask_of_read_components
This now works for load/store and texture instructions as well as ALU.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig 419ddd63b0 pan/midgard: Implement SSBO access
Just laying the groundwork. Reads and writes should be supported (both
direct and indirect, either int or float, vec1/2/3/4), but no bounds
checking is done at the moment.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:01 -07:00
Alyssa Rosenzweig a8639b91b5 pan/midgard: Pipe uniform mask through when spilling
This is a corner case that happens a lot with SSBOs. Basically, if we
only read a few components of a uniform, we need to only spill a few
components or otherwise we try to spill what we spilled and RA hangs.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:43:00 -07:00
Alyssa Rosenzweig 63e240dd05 pan/midgard: Clamp sysval component count
We don't want to load a 128-bit sysval when 64-bits will do. Fixes RA
failures with SSBO indirect writes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig e7ac46be7a pan/midgard: Pass uploaded midgard_instruction through
We want to edit it after emission in some cases.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig fa68740187 pan/midgard: Allow sysval destination override
Sometimes a sysval is used to facilitate an instruction but is not the
instruction itself.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig 2efa025b05 panfrost: Add SSBO system value
For each SSBO index we get from Gallium/NIR, we need two pieces of
information in the shader:

1. The address of the SSBO in GPU memory. Within the shader, we'll be
accessing it with raw memory load/store, so we need the actual address,
not just an index.

2. The size of the SSBO. This is not strictly necessary, but at some
point, we may like to do bounds checking on SSBO accesses.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-12 12:42:59 -07:00
Alyssa Rosenzweig c82672c9c1 pan/midgard: Account for swizzle/mask in st_vary
Register allocation for varying stores is a bit different, since the
instructions ignore the writemask (varyings are normalized
packed/vectorized..)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-09 11:50:45 -07:00
Alyssa Rosenzweig 5a898e2a65 pan/midgard: Disassemble load/store barrel shift
Arm assembly intensifies.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-08 15:49:12 -07:00
Alyssa Rosenzweig 3db4949197 pan/midgard: Extend SSA concurrency checks to other args
No glmark changes, but this seems like a good idea.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-05 11:22:49 -07:00
Alyssa Rosenzweig 2869758355 pan/midgard: Rewrite bidirectionally when eliminating moves
Symptom: the sky is black in SuperTuxKart (flashbacks to SMB/NES
emulation intensify).

Essentially, what happened is a fixed (special) move to r0 was
eliminated but scheduling did not factor this in, so
can_run_concurrent_ssa returned true even when there was a logical data
dependency that needed to be resolved.

Fixes: 20771ede1c ("pan/midgard: Add post-RA move elimination")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-05 10:58:39 -07:00
Alyssa Rosenzweig 8ddb38209d pan/midgard: Print texture outmod
I have no idea who thought this was a good idea.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 16:54:53 -07:00
Alyssa Rosenzweig ad864a0bbb pan/midgard: Promote all 16 uniforms
Now that register spilling is in place, this is reasonable. It turns out
for some shaders, it's actually better to cap at 8 work registers and
extra >8 uniform reigsters and tolerate the spilling, since the extra
resulting threads make up for the spillage. So incidentally, the shader
that spills here is in -bterrain, which jumps from 19fps to 21fps as a
result of this change.

total instructions in shared programs: 3513 -> 3448 (-1.85%)
instructions in affected programs: 776 -> 711 (-8.38%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 8 x̄: 3.25 x̃: 2
helped stats (rel) min: 3.57% max: 16.00% x̄: 8.37% x̃: 7.19%
95% mean confidence interval for instructions value: -4.28 -2.22
95% mean confidence interval for instructions %-change: -10.02% -6.73%
Instructions are helped.

total bundles in shared programs: 2067 -> 2024 (-2.08%)
bundles in affected programs: 515 -> 472 (-8.35%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 6 x̄: 2.37 x̃: 2
helped stats (rel) min: 2.13% max: 17.86% x̄: 10.19% x̃: 11.11%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 3.23% max: 3.23% x̄: 3.23% x̃: 3.23%
95% mean confidence interval for bundles value: -3.01 -1.29
95% mean confidence interval for bundles %-change: -12.13% -6.91%
Bundles are helped.

total quadwords in shared programs: 3468 -> 3426 (-1.21%)
quadwords in affected programs: 764 -> 722 (-5.50%)
helped: 19
HURT: 1
helped stats (abs) min: 1 max: 5 x̄: 2.26 x̃: 2
helped stats (rel) min: 1.41% max: 12.50% x̄: 6.76% x̃: 7.14%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.08% max: 1.08% x̄: 1.08% x̃: 1.08%
95% mean confidence interval for quadwords value: -2.83 -1.37
95% mean confidence interval for quadwords %-change: -8.08% -4.65%
Quadwords are helped.

total registers in shared programs: 383 -> 360 (-6.01%)
registers in affected programs: 112 -> 89 (-20.54%)
helped: 19
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 1.21 x̃: 1
helped stats (rel) min: 12.50% max: 27.27% x̄: 20.63% x̃: 20.00%
95% mean confidence interval for registers value: -1.47 -0.95
95% mean confidence interval for registers %-change: -22.39% -18.87%
Registers are helped.

total threads in shared programs: 432 -> 451 (4.40%)
threads in affected programs: 19 -> 38 (100.00%)
helped: 11
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.73 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%
95% mean confidence interval for threads value: 1.41 2.04
95% mean confidence interval for threads %-change: 100.00% 100.00%
Threads are [helped].

total loops in shared programs: 4 -> 4 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0

total spills in shared programs: 0 -> 4
spills in affected programs: 0 -> 4
helped: 0
HURT: 2

total fills in shared programs: 0 -> 7
fills in affected programs: 0 -> 7
helped: 0
HURT: 2

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 16:52:21 -07:00
Alyssa Rosenzweig e94239b9a4 pan/midgard: Break mir_spill_register into its function
No functional changes, just breaks out a megamonster function and fixes
the indentation.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 16:52:21 -07:00
Alyssa Rosenzweig d4bcca19da pan/midgard: Switch sources to an array for trinary sources
We need three independent sources to support indirect SSBO writes (as
well as textures with both LOD/bias and offsets). Now is a good time to
make sources just an array so we don't have to rewrite a ton of code if
we ever needed a fourth source for some reason.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 16:48:54 -07:00
Alyssa Rosenzweig 513d02cfeb pan/midgard: Remove "r27-only" register class
As far as I know, there's no such thing as a load/store op that only
takes its argument in r27. We just need to set the appropriate arg_1
field in the RA to specify other registers if we want them.

To facilitate this, various RA-related changes are needed across the
compiler ; this should also fix indirect offsets which were implicitly
interpreted as "r27-only" despite not even passing through RA yet. One
ripple effect change is switching the move insertion point and adjusting
the liveness analysis accordingly, so while this was intended as a
purely functional change, there are some shader-db changes:

total instructions in shared programs: 3511 -> 3498 (-0.37%)
instructions in affected programs: 563 -> 550 (-2.31%)
helped: 12
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.08 x̃: 1
helped stats (rel) min: 0.93% max: 5.00% x̄: 2.58% x̃: 2.33%
95% mean confidence interval for instructions value: -1.27 -0.90
95% mean confidence interval for instructions %-change: -3.23% -1.93%
Instructions are helped.

total bundles in shared programs: 2067 -> 2067 (0.00%)
bundles in affected programs: 398 -> 398 (0.00%)
helped: 7
HURT: 4
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.54% max: 10.00% x̄: 5.04% x̃: 5.56%
HURT stats (abs)   min: 1 max: 2 x̄: 1.75 x̃: 2
HURT stats (rel)   min: 2.13% max: 4.26% x̄: 3.72% x̃: 4.26%
95% mean confidence interval for bundles value: -0.95 0.95
95% mean confidence interval for bundles %-change: -5.21% 1.50%
Inconclusive result (value mean confidence interval includes 0).

total quadwords in shared programs: 3464 -> 3454 (-0.29%)
quadwords in affected programs: 1199 -> 1189 (-0.83%)
helped: 18
HURT: 4
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.03% max: 5.26% x̄: 2.44% x̃: 1.79%
HURT stats (abs)   min: 2 max: 2 x̄: 2.00 x̃: 2
HURT stats (rel)   min: 2.56% max: 2.82% x̄: 2.63% x̃: 2.56%
95% mean confidence interval for quadwords value: -0.98 0.07
Inconclusive result (value mean confidence interval includes 0).

total registers in shared programs: 383 -> 373 (-2.61%)
registers in affected programs: 56 -> 46 (-17.86%)
helped: 12
HURT: 2
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 9.09% max: 33.33% x̄: 29.58% x̃: 33.33%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 20.00% max: 50.00% x̄: 35.00% x̃: 35.00%
95% mean confidence interval for registers value: -1.13 -0.29
95% mean confidence interval for registers %-change: -35.07% -5.63%
Registers are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig 5d9b7a8ddb pan/midgard: Handle get/set_swizzle for load/store arguments
Load/store's  main "argument 0" already has its swizzle handled
correctly (for stores, that is). But the tinier arguments, the compact
ones with a component select but not a full swizzle, those are not yet
handled. Let's do something about that!
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig 9aeb726045 pan/midgard: Fix block successors
Rather than an ersatz thing that sort of looks like successors but is in
fact just the source order traversal with some backward jumps hacked in
for loops... construct an actual flow graph so we can do analysis
sanely.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig 1a116037d8 pan/midgard: Add helper to pack load/store registers
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig e112d9d333 pan/midgard: Decode register/component in load/store argument
3-bits out of 8 down!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig 5a572f4b55 pan/midgard: Fix REGISTER_OFFSET
r27 isn't the special one, usually.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:03 -07:00
Alyssa Rosenzweig c908772ee4 pan/midgard: Split ld/st unknown to arg_1/arg_2 fields
The 16-bit field can be decomposed to two independent 8-bit fields, each
representing a single (additional) argument to the load/store op,
generally used for encoding registers. Addressable registers here are
substantially limited compared to the main register in a load/store op.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 14:20:02 -07:00
Alyssa Rosenzweig 1637a53890 pan/midgard: Print invert modifier
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig 62a5ee3bb4 pan/midgard: Flip conditionals
We would like to flip ops to have a constant in the second place to
enable inlining of the constant.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig d066ca3575 pan/midgard: Add bitwise src/invert fusing
De Morgan's Laws and some special ops basically.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig 620c2717cf pan/midgard: Add .not propagation pass
Essentially .pos propagation but for bitwise.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig b821e1b85e pan/midgard: Fuse invert into bitwise ops
We use the new invert flag to produce ops like inand.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-02 09:57:15 -07:00
Alyssa Rosenzweig 73c40d6bbb pan/midgard: Use standard list traversal to find initial tag
Fixes a hang (and abort) on empty shaders, which you shouldn't have
anyway but better safe than sorry. DCE going on the fritz is no reason
to freeze the system.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-08-01 16:23:03 -07:00
Alyssa Rosenzweig 91c4acedaf pan/midgard: Don't special case inline_constant
Another constant source of bugs. Ain't that special.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-31 10:59:19 -07:00
Alyssa Rosenzweig 29416a8599 pan/midgard: De-special-case branching
It's not that special.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-31 10:59:18 -07:00
Alyssa Rosenzweig 194b49ee28 panfrost: Flip texture/sampler fields
We had them backwards in both the command stream and the Midgard stack.
In OpenGL ES 2.0, they're always the same, but in Vulkan/later-GL/CL
they diverge so we can fix this.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-31 10:56:11 -07:00
Alyssa Rosenzweig 7f75b2b5af pan/midgard: Simplify discard logic
The "branch offset" is, in fact, ignored.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-31 09:39:16 -07:00
Alyssa Rosenzweig 27524d1462 pan/midgard: Add units for more instructions
For everything but freduce, we have some sense of what units the
instruction takes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-31 09:39:16 -07:00
Alyssa Rosenzweig 64235b1ecc pan/midgard: Fix ball/bany opcode table
This were seriously messed up beyond all recognition. How we're passing
shaders.random.* is a mystery.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-31 09:39:16 -07:00
Alyssa Rosenzweig 13ee87c8b9 pan/midgard: Document branch combination LUT
This took way longer to figure out than it should have..

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-31 09:39:16 -07:00
Alyssa Rosenzweig a3c59f9f00 pan/midgard: Nothing to see here, move along folks
Fixes: dee1e18fe4 ("pan/midgard: Cleanup ops table")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:49:13 -07:00
Alyssa Rosenzweig dee1e18fe4 pan/midgard: Cleanup ops table
Hopefully this should make a few ops make more sense. No functional
changes.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:35:22 -07:00
Alyssa Rosenzweig 834aeb1e52 pan/midgard: Extend copy-propagation to swizzles
We can compose them when we rewrite, which is.. more code.. but helps.

total instructions in shared programs: 3611 -> 3513 (-2.71%)
instructions in affected programs: 672 -> 574 (-14.58%)
helped: 11
HURT: 2
helped stats (abs) min: 2 max: 14 x̄: 9.09 x̃: 10
helped stats (rel) min: 5.71% max: 24.56% x̄: 17.99% x̃: 18.87%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 1.19% max: 2.08% x̄: 1.64% x̃: 1.64%
95% mean confidence interval for instructions value: -10.45 -4.62
95% mean confidence interval for instructions %-change: -20.07% -9.87%
Instructions are helped.

total bundles in shared programs: 2117 -> 2067 (-2.36%)
bundles in affected programs: 356 -> 306 (-14.04%)
helped: 11
HURT: 0
helped stats (abs) min: 1 max: 7 x̄: 4.55 x̃: 5
helped stats (rel) min: 4.55% max: 15.22% x̄: 13.63% x̃: 14.71%
95% mean confidence interval for bundles value: -5.64 -3.45
95% mean confidence interval for bundles %-change: -15.71% -11.55%
Bundles are helped.

total quadwords in shared programs: 3567 -> 3468 (-2.78%)
quadwords in affected programs: 695 -> 596 (-14.24%)
helped: 11
HURT: 1
helped stats (abs) min: 2 max: 14 x̄: 9.09 x̃: 10
helped stats (rel) min: 5.56% max: 21.88% x̄: 14.97% x̃: 15.15%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 2.38% max: 2.38% x̄: 2.38% x̃: 2.38%
95% mean confidence interval for quadwords value: -10.96 -5.54
95% mean confidence interval for quadwords %-change: -17.42% -9.63%
Quadwords are helped.

total registers in shared programs: 391 -> 383 (-2.05%)
registers in affected programs: 46 -> 38 (-17.39%)
helped: 9
HURT: 1
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 25.00% max: 25.00% x̄: 25.00% x̃: 25.00%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 10.00% max: 10.00% x̄: 10.00% x̃: 10.00%
95% mean confidence interval for registers value: -1.25 -0.35
95% mean confidence interval for registers %-change: -29.42% -13.58%
Registers are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:35:10 -07:00
Alyssa Rosenzweig c45487b770 pan/midgard: Extract simple source mod check
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:35:09 -07:00
Alyssa Rosenzweig 2d2abb08d0 pan/midgard: Lower texr/texw mixed registers
Conceptually, r28-r29 (as used for reading) and r28-r29 (as used for
writing) aren't registers at all, merely push/pull arrangements. So you
can't feed a texture result back into itself without explicitly moving
in the middle.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:01:20 -07:00
Alyssa Rosenzweig 2b248af43e pan/midgard: Always set .cont for derivatives in loops
We need to keep the helper invocations alive.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig 8f887329c0 pan/midgard: Implement derivatives
Implement the fdd* and fdd* opcodes in the Midgard compiler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig 982134d22e pan/midgard: Compose original texture swizzle in RA
Used for lowering derivatives.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig 79875a9a64 pan/midgard: Add new swizzles
Used for derivatives.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig 81e7782e30 pan/midgard: Add OP_IS_DERIVATIVE helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig ae6aea0d98 pan/midgard: Add make_compiler_temp_reg helper
Corrollary to make_compiler_temp (for SSA).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig 30b15a830a pan/midgard: Move nir_*_src_index to compiler.h
These helpers are useful for code emission everywhere. Share the love!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig c9498b3c5e pan/midgard: Disassemble unknown texture ops as hex
I'm not sure why I ever thought decimal was a good idea.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig 0714481894 pan/midgard: Add support for disassembling derivatives
They're just texture ops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-30 10:01:19 -07:00
Alyssa Rosenzweig 463164b325 pan/midgard: Fix alpha test w.r.t new indexing
Fixes: 9beb3391b5 ("pan/midgard: Tag SSA/reg")

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-29 08:31:03 -07:00
Alyssa Rosenzweig 159abd527e pan/midgard: Introduce invert field
This will enable us to fuse inverts in various ways. Marginal hurt:

total instructions in shared programs: 3610 -> 3611 (0.03%)
instructions in affected programs: 67 -> 68 (1.49%)
helped: 0
HURT: 1

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 13:38:41 -07:00
Alyssa Rosenzweig 9beb3391b5 pan/midgard: Tag SSA/reg
Rather than putting registers after SSA in the MIR indexing, put them
side-by-side, shifted 1, using the bottom bit as the SSA/reg select.
This will allow us to generate SSA temps in the compiler.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 13:38:41 -07:00
Alyssa Rosenzweig f8c71d7632 pan/midgard: Improve scheduling
Make scalar scheduling onto vector units more aggressive (it can only
help while we schedule strictly in order). Also, allow imov on VLUT.

total bundles in shared programs: 2176 -> 2117 (-2.71%)
bundles in affected programs: 901 -> 842 (-6.55%)
helped: 24
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 2.46 x̃: 2
helped stats (rel) min: 2.08% max: 20.00% x̄: 8.68% x̃: 5.94%
95% mean confidence interval for bundles value: -3.93 -0.99
95% mean confidence interval for bundles %-change: -10.92% -6.45%
Bundles are helped.

total quadwords in shared programs: 3605 -> 3566 (-1.08%)
quadwords in affected programs: 1984 -> 1945 (-1.97%)
helped: 28
HURT: 5
helped stats (abs) min: 1 max: 3 x̄: 1.68 x̃: 2
helped stats (rel) min: 1.02% max: 14.29% x̄: 5.12% x̃: 2.94%
HURT stats (abs)   min: 1 max: 3 x̄: 1.60 x̃: 1
HURT stats (rel)   min: 0.57% max: 9.09% x̄: 6.40% x̃: 9.09%
95% mean confidence interval for quadwords value: -1.67 -0.69
95% mean confidence interval for quadwords %-change: -5.37% -1.37%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 10:28:46 -07:00
Alyssa Rosenzweig 94e281b9e0 pan/midgard: Specialize mod checking by type when checking constants
Fixes inlining of integer constants.

total quadwords in shared programs: 3585 -> 3568 (-0.47%)
quadwords in affected programs: 625 -> 608 (-2.72%)
helped: 13
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.31 x̃: 1
helped stats (rel) min: 1.27% max: 9.52% x̄: 3.84% x̃: 2.94%
95% mean confidence interval for quadwords value: -1.60 -1.02
95% mean confidence interval for quadwords %-change: -5.60% -2.07%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 09:47:40 -07:00
Alyssa Rosenzweig e823d33e77 pan/midgard: Use more aggressive writeout criteria
We loosen the requirement of "no dependencies" to simply be "no
non-pipelined dependencies", so we check for what could be pipelined.

total bundles in shared programs: 2176 -> 2156 (-0.92%)
bundles in affected programs: 779 -> 759 (-2.57%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.33% max: 20.00% x̄: 6.47% x̃: 2.78%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -9.44% -3.50%
Bundles are helped.

total quadwords in shared programs: 3605 -> 3585 (-0.55%)
quadwords in affected programs: 1391 -> 1371 (-1.44%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 14.29% x̄: 3.84% x̃: 1.64%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -5.73% -1.94%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 09:47:40 -07:00
Alyssa Rosenzweig c7fc5f3567 pan/midgard: Pipeline non-SSA registers
Rather than bailing if we see something that's not SSA, do out the
analysis to check if we can pipeline and do so if we can.

total registers in shared programs: 392 -> 391 (-0.26%)
registers in affected programs: 3 -> 2 (-33.33%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 09:40:10 -07:00
Alyssa Rosenzweig 79f0896491 pan/midgard: Add mir_mask_of_read_components helper
This facilitates analysis of vec4 registers (after going out-of-SSA).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 09:37:28 -07:00
Alyssa Rosenzweig 481447cb00 pan/midgard: Add mir_is_written_before helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 09:20:52 -07:00
Alyssa Rosenzweig 95732cc9ef pan/midgard: Obey fragment writeout criteria
Rather than always emitting an extra move for fragments, check the
actual criteria and emit accordingly. (This was lost during the RA
improvements at the end of May).

total bundles in shared programs: 2210 -> 2176 (-1.54%)
bundles in affected programs: 501 -> 467 (-6.79%)
helped: 34
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 1.59% max: 33.33% x̄: 13.13% x̃: 12.50%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -16.06% -10.21%
Bundles are helped.

total quadwords in shared programs: 3639 -> 3605 (-0.93%)
quadwords in affected programs: 795 -> 761 (-4.28%)
helped: 34
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.96% max: 33.33% x̄: 11.22% x̃: 8.33%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -14.31% -8.13%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:09 -07:00
Alyssa Rosenzweig 20771ede1c pan/midgard: Add post-RA move elimination
Think of this pass as register coalescing part 2. After RA runs, but
before scheduling, we scan for code of the form:

   mov rN, rN

and delete the move, since it's totally redundant. This pass helps
already, but it'd of course be much more effective paired with
register coalescing to encourage moves in general to end up in this
form. Nevertheless, even by itself:

total instructions in shared programs: 3665 -> 3613 (-1.42%)
instructions in affected programs: 2046 -> 1994 (-2.54%)
helped: 52
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 8.02% x̃: 4.00%
95% mean confidence interval for instructions value: -1.00 -1.00
95% mean confidence interval for instructions %-change: -10.26% -5.79%
Instructions are helped.

total bundles in shared programs: 2256 -> 2213 (-1.91%)
bundles in affected programs: 1154 -> 1111 (-3.73%)
helped: 43
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.33% max: 25.00% x̄: 9.10% x̃: 5.56%
95% mean confidence interval for bundles value: -1.00 -1.00
95% mean confidence interval for bundles %-change: -11.60% -6.60%
Bundles are helped.

total quadwords in shared programs: 3689 -> 3642 (-1.27%)
quadwords in affected programs: 2025 -> 1978 (-2.32%)
helped: 47
HURT: 0
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 0.19% max: 25.00% x̄: 7.86% x̃: 3.85%
95% mean confidence interval for quadwords value: -1.00 -1.00
95% mean confidence interval for quadwords %-change: -10.30% -5.42%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:09 -07:00
Alyssa Rosenzweig cb6dea6b4d pan/midgard: Share mir_nontrivial_outmod
To be used with redundant move elimination.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig b6946d35c8 pan/midgard: Implement texture RA
total instructions in shared programs: 3916 -> 3665 (-6.41%)
instructions in affected programs: 1405 -> 1154 (-17.86%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 21 x̄: 7.17 x̃: 3
helped stats (rel) min: 3.00% max: 28.57% x̄: 20.11% x̃: 21.74%
95% mean confidence interval for instructions value: -9.35 -4.99
95% mean confidence interval for instructions %-change: -22.75% -17.46%
Instructions are helped.

total bundles in shared programs: 2472 -> 2256 (-8.74%)
bundles in affected programs: 906 -> 690 (-23.84%)
helped: 32
HURT: 0
helped stats (abs) min: 1 max: 18 x̄: 6.75 x̃: 3
helped stats (rel) min: 5.56% max: 32.26% x̄: 20.83% x̃: 16.67%
95% mean confidence interval for bundles value: -9.09 -4.41
95% mean confidence interval for bundles %-change: -23.77% -17.89%
Bundles are helped.

total quadwords in shared programs: 3965 -> 3689 (-6.96%)
quadwords in affected programs: 1568 -> 1292 (-17.60%)
helped: 35
HURT: 0
helped stats (abs) min: 1 max: 21 x̄: 7.89 x̃: 3
helped stats (rel) min: 2.08% max: 28.57% x̄: 19.87% x̃: 20.00%
95% mean confidence interval for quadwords value: -10.38 -5.39
95% mean confidence interval for quadwords %-change: -22.57% -17.17%
Quadwords are helped.

total registers in shared programs: 411 -> 392 (-4.62%)
registers in affected programs: 76 -> 57 (-25.00%)
helped: 15
HURT: 0
helped stats (abs) min: 1 max: 2 x̄: 1.27 x̃: 1
helped stats (rel) min: 9.09% max: 50.00% x̄: 30.97% x̃: 33.33%
95% mean confidence interval for registers value: -1.52 -1.01
95% mean confidence interval for registers %-change: -39.12% -22.82%
Registers are helped.

total threads in shared programs: 426 -> 432 (1.41%)
threads in affected programs: 6 -> 12 (100.00%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig 13f61f24ea pan/midgard: Fix backwards blend color load
The source and destination were incorrectly flipped in the move, but
some details of our internal regalloc made this function anyway. Now
that we're changing the regalloc, we need to fix this to avoid
regressing blend shaders.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig a99ecc2b2b pan/midgard: Fix scheduling mishap
We shouldn't try to schedule onto a vmul if the last unit was a smul;
that would force a break ("traveling back in time").

total bundles in shared programs: 2519 -> 2472 (-1.87%)
bundles in affected programs: 791 -> 744 (-5.94%)
helped: 20
HURT: 0
helped stats (abs) min: 1 max: 9 x̄: 2.35 x̃: 1
helped stats (rel) min: 1.52% max: 11.76% x̄: 7.94% x̃: 7.69%
95% mean confidence interval for bundles value: -3.47 -1.23
95% mean confidence interval for bundles %-change: -9.36% -6.51%
Bundles are helped.

total quadwords in shared programs: 4028 -> 3965 (-1.56%)
quadwords in affected programs: 1223 -> 1160 (-5.15%)
helped: 17
HURT: 0
helped stats (abs) min: 1 max: 17 x̄: 3.71 x̃: 2
helped stats (rel) min: 2.97% max: 10.64% x̄: 6.97% x̃: 7.14%
95% mean confidence interval for quadwords value: -5.71 -1.70
95% mean confidence interval for quadwords %-change: -8.03% -5.91%
Quadwords are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig e4038f9445 pan/midgard: Fix vector->scalar swizzles
The swizzle should be taken on the masked component, rather than
unconditionally X.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig 10324095d2 pan/midgard: Add dead move elimination pass
This is a special case of DCE designed to run after the out-of-ssa pass
to cleanup special register lowering.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig 082485d663 pan/midgard: Move DCE into its own file
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig f9e619fa82 pan/midgard: Add mir_rewrite_dst_tag helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig b3cab85606 pan/midgard: Fix flipped register bias fields
We mixed up component_lo and full, which made it appear that we had
less freedom in RA than we actually do. Fix this to fix some
disassemblies as well as prepare for RA with the bias field.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig be56840d5a pan/midgard: Update RA for cubemap coords
Following the RA work, we apply the same technique to eliminate the move
to r27 when loading cubemaps.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-26 08:37:08 -07:00
Alyssa Rosenzweig 2f9236096a Revert "panfrost: Don't DIY point size/coord fields"
This reverts commit 4508f43eed, which
broke a bunch of dEQP tests (e.g. in
dEQP-GLES2.functional.draw.draw_arrays.*)

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 13:17:22 -07:00
Alyssa Rosenzweig 9ce75826cb pan/midgard: Optimize varying projection
We add a new opt pass fusing perspective projection with varyings. Minor
win..? We don't combine non-varying projections, since if we're too
agressive, the extra load/store traffic will hurt us so it's not really
a win in practice.

total instructions in shared programs: 3915 -> 3913 (-0.05%)
instructions in affected programs: 76 -> 74 (-2.63%)
helped: 1
HURT: 0

total bundles in shared programs: 2520 -> 2519 (-0.04%)
bundles in affected programs: 46 -> 45 (-2.17%)
helped: 1
HURT: 0

total quadwords in shared programs: 4027 -> 4025 (-0.05%)
quadwords in affected programs: 80 -> 78 (-2.50%)
helped: 1
HURT: 0

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig f6438d1e15 pan/midgard: Add perspective projection recombine pass
We don't use it yet, since it's actually a shader-db regression. This is
primarily helpful as an intermediate step for attaching projection to
varyings.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig 8ddb0eda42 pan/midgard: Force perspective ops to use vec4
It doesn't make sense to use them with anything less.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig b06951d343 pan/midgard: Add R27-only op handling
We use a special conflicting register class.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig f55a760d0c pan/midgard: Add OP_R27_ONLY helper
While load/store ops like st_vary can take an argument in either
r26/r27, ops like those for perspective projection must specifically
take their argument in r27.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig 233c0faadd pan/midgard: Enable RA for st_vary
Now that all the piping is in place to do so without regressions, we
flip on automatic register allocation for varyings. Hooray!

total instructions in shared programs: 4025 -> 3915 (-2.73%)
instructions in affected programs: 1667 -> 1557 (-6.60%)
helped: 62
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 1.77 x̃: 2
helped stats (rel) min: 0.93% max: 20.00% x̄: 10.80% x̃: 10.64%
95% mean confidence interval for instructions value: -1.89 -1.66
95% mean confidence interval for instructions %-change: -12.50% -9.11%
Instructions are helped.

total bundles in shared programs: 2683 -> 2520 (-6.08%)
bundles in affected programs: 1066 -> 903 (-15.29%)
helped: 62
HURT: 0
helped stats (abs) min: 1 max: 3 x̄: 2.63 x̃: 3
helped stats (rel) min: 2.94% max: 42.86% x̄: 23.85% x̃: 22.50%
95% mean confidence interval for bundles value: -2.83 -2.43
95% mean confidence interval for bundles %-change: -27.73% -19.97%
Bundles are helped.

total quadwords in shared programs: 4192 -> 4027 (-3.94%)
quadwords in affected programs: 1584 -> 1419 (-10.42%)
helped: 62
HURT: 0
helped stats (abs) min: 1 max: 4 x̄: 2.66 x̃: 3
helped stats (rel) min: 1.85% max: 30.00% x̄: 16.49% x̃: 16.52%
95% mean confidence interval for quadwords value: -2.87 -2.46
95% mean confidence interval for quadwords %-change: -19.14% -13.84%
Quadwords are helped.

total registers in shared programs: 433 -> 411 (-5.08%)
registers in affected programs: 67 -> 45 (-32.84%)
helped: 23
HURT: 1
helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1
helped stats (rel) min: 25.00% max: 50.00% x̄: 41.30% x̃: 50.00%
HURT stats (abs)   min: 1 max: 1 x̄: 1.00 x̃: 1
HURT stats (rel)   min: 14.29% max: 14.29% x̄: 14.29% x̃: 14.29%
95% mean confidence interval for registers value: -1.09 -0.74
95% mean confidence interval for registers %-change: -45.45% -32.52%
Registers are helped.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig 210dbe3fc1 pan/midgard: Remove check for `class`
Fixes classes defaulting to vec4 in some cases.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig 8842db3a7d pan/midgard: Move uniforms to special registers
The load/store pipes can't take a uniform register in, so an explicit
move is necessary here.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig ae7acde91f pan/midgard: Emit st_vary registers in install_registers
Now that we have its registers handled normally like the rest of the IR.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig c3ad7500d2 pan/midgard: Add mir_lower_special_reads helper
Given the constraints on special registers, we add a helper for lowering
these by inserting moves (copies) where needed to satsify the ISA
constraints.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig e169301bd8 pan/midgard: Add emit_explicit_constant helper
We generalize the constant emission helper used in fragment writeout as
we'll also need it for vertex outputs.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig eedd6c1dd0 pan/midgard: Add mir_rewrite_index_src_tag
Specialized version of a rewrite that only rewrites a certain type of
instruction.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig 5d5caf10af pan/midgard: Add class check
This ensures the rules for accessing special register classes are
satisfied. This is asserted as a prepass should have lowered offending
uses to something satisfying these rules. Special register classes are
*not* work registers and cannot be used for RMW operations; they are
essentially 1-way pipes straight into/from fixed-function logic in the
shader cores.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig 91195bdff1 pan/midgard: Implement class spilling
We reuse the same register spilling mechanism as for work->memory to
spill special->work registers, e.g. to allow writing out more than 2
vec4 varyings (without better scheduling anyway).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig 0f38f6466e pan/midgard: Extend liveness analysis to st_vary
These can consume sources now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig dca0166ce1 pan/midgard: Implement load/store register classing
This does not yet support special->work spilling, nor does it support
multiclass breakup. These corner cases will be handled in succeeding
commits.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig 839b80aa89 pan/midgard: Allocate special register classes
We'll want to also handle load/store and texture registers in our RA
loop.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig 480b502443 pan/midgard: Move copy propagation into its own file
We also expose some utilities it uses as general MIR helpers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:22 -07:00
Alyssa Rosenzweig b8caaa3000 pan/midgard: Add mir_simple_swizzle helper
Checks for x/xy/xyz/xyzw style swizzles (slightly more general but you
get the idea).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:21 -07:00
Alyssa Rosenzweig 63385a3fdb pan/midgard: Add mir_single_use helper
Helps as an optimization heuristic.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:37:21 -07:00
Alyssa Rosenzweig 5534fdb7bf panfrost: Compute I/O counts from shader_info
...rather than exposing it in the vendored compiler region.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig 4508f43eed panfrost: Don't DIY point size/coord fields
Again, it's in shader_info for us!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig bab4f6c724 panfrost: Use nir_gather_info information about discards
No need to track this ourselves!

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-25 06:34:21 -07:00
Alyssa Rosenzweig 840b806d64 panfrost/midgard: Allocate registers once (per-screen)
This should save a lot of per-compile time by using the RA the way it's
actually supposed to be used.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-23 09:06:21 -07:00
Alyssa Rosenzweig e8dca7e1e1 pan/midgard: Report spills:fills to shader-db
Route this info through so we can track how we're doing on register
spilling.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 055aa9b1f4 panfrost/midgard: Reenable pipeline register creation
This was disabled to permit regression-free RA work. Now that the spill
code is in place, we can reenable, with some caveats about efficacy.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig f0d0061b18 panfrost/midgard: Report tls_size
Pipe through the number of bytes of spilled memory used from the
compiler into the main driver, where it will be used to allocate the
Thread Local Storage buffer.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig bc741599f2 panfrost/midgard: Promote to *move*, not rewrite for non-SSA
Fixes promoted uniform loads to registers.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 40abf11708 panfrost/midgard: Dump MIR of RA failure
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig a08e9511e3 pan/midgard; Dump successor graph when printing MIR
We just use the pointers of the midgard_block*, which is crude, but it
gets the point across and will help debug successor related issues.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 1aa556de2e pan/midgard: Remove debug statement
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 21510c253c panfrost/midgard: Implement register spilling
Now that we run RA in a loop, before each iteration after a failed
allocation we choose a spill node and spill it to Thread Local Storage
using st_int4/ld_int4 instructions (for spills and fills respectively).

This allows us to compile complex shaders that normally would not fit
within the 16 work register limits, although it comes at a fairly steep
performance penalty.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 533d65786f panfrost/midgard: Add mir_has_arg helper
Helps scan the MIR for uses of an index.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 076838ef0c panfrost/midgard: Check write-before-read in liveness analysis
If we write to an index before reading it, the old copy we're checking
liveness for isn't live in this block, even if it does get read later.
Fixes abnormally high register pressure in shaders with loops.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 997f85c136 panfrost/midgard/disasm: Check for certain tag errors
Midgard bundles contain a tag, as well as a copy of the tag of the next
bundle to facilitate prefetch. Do some simple static analysis to detect
certain tag errors (particularly on shaders without branching).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig d168b08d62 pan/midgard: Add OP_IS_CSEL helper
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 1f297471a0 pan/midgard: Add mir_rewrite_index_src_single helper
Rather than rewriting an index away across the whole block, we expose
finer (per-instruction) granularity for rewrites.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 16c8c354d0 pan/midgard: Ignore inline_constant in liveness
It doesn't make any sense to look at it.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig d155168e6c panfrost/midgard: Implement load/store scratch opcodes
These are used to load/store from Thread Local Storage, which is memory
allocated per-thread (corresponding to ctx->scratchpad in the command
stream) and used for register spilling.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 3bb780ecb9 pan/midg/disasm: Check for int varying ops
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 7e052d9332 pan/midgard: Remove "aliasing"
It was a crazy idea that didn't pan out. We're better served by a good
copyprop pass. It's also unused now.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig 3174bc9972 panfrost: Promote uniform registers late
Rather than creating either a load or a uniform register read with a
fixed beginning offset, we always create a load and then promote to a
uniform register later. This will allow us to promote in a register
pressure aware manner.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:34 -07:00
Alyssa Rosenzweig aa03159120 pan/midgard: Call scheduler/RA in a loop
This will allow us to insert instructions as a result of register
allocation, permitting spilling to be implemented. As a side effect,
with the assert commented out this would fix a bunch of glamor crashes
(due to RA failures) so MATE becomes useable.

Ideally we'll have scheduling or RA actually sorted out before the
branch point but if not this gives us a one-line out to get X working...

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:33 -07:00
Alyssa Rosenzweig 1cabb8a706 pan/midgard: Remove custom register selection callback
What we have is equivalent to the default callback; let's use that.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 08:20:33 -07:00
Alyssa Rosenzweig 9eea8423a0 panfrost/midgard: Use generic outmod type
It could be midgard_outmod_float or midgard_outmod_int; don't assume
it's one or the other. Fixes -Wenum-conversion warnings.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-12 16:23:52 -07:00
Alyssa Rosenzweig 6d8490f900 panfrost: Fix build warnings
A bunch of these are from asserts not being compiled in 32-bit mode
(once Erik's ASSERTABLE stuff is merged, we'll want to switch).

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-12 07:38:37 -07:00
Tomeu Vizoso 838374b6dd Revert "panfrost/midgard: Use _safe iterator"
This reverts commit 812ce2ce9e.

We massively regress with the reverted patch. So in the meantime, take
it out.

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
2019-07-11 16:53:42 +02:00
Tomeu Vizoso 812ce2ce9e panfrost/midgard: Use _safe iterator
Fixes this assertion:

../mesa/src/panfrost/midgard/midgard_schedule.c:507:schedule_block: Assertion `ins == __next && "use _safe iterator"' failed.
Trace/breakpoint trap

Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-11 15:06:51 +02:00
Alyssa Rosenzweig bb483a9166 panfrost: Clamp point size
It's not clear the hardware really has a maximum which confuses dEQP;
clamp to whatever we report as our maximum.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-10 11:30:00 -07:00
Alyssa Rosenzweig ec2a59cd7a panfrost: Move non-Gallium files outside of Gallium
In preparation for a Panfrost-based non-Gallium driver (maybe
Vulkan...?), hoist everything except for the Gallium driver into a
shared src/panfrost. Practically, that means the compilers, the headers,
and pandecode.

Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-10 10:43:23 -07:00