KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Rob Clark	9bbfae6265	freedreno/a3xx/compiler: collapse nop's with repeat Easier than making more extensive use of rpt, and the more compact shaders seem to bring some bit of performance boost. (Perhaps repeat flag benefits are more than just instruction cache, possibly it saves on instruction decode as well?) Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-23 14:58:23 -05:00
Rob Clark	5993723471	freedreno/a3xx/compiler: scheduling/legalize fixes It seems the write-after-read hazard that applies to texture fetch instructions, also applies to sfu instructions. Also, cat5/cat6 instructions do not have a (ss) bit, so in these cases we need to insert a dummy nop instruction with (ss) bit set. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-19 12:01:26 -05:00
Rob Clark	d73b2c0517	freedreno/a3xx/compiler: use (ss) for WAR hazards Seems texture sample instructions don't immediately consume there src(s). In fact, some shaders from blob compiler seem to indiciate that it does not even count the texture sample instructions when calculating number of delay slots to fill for non-sample instructions. (Although so far it seems inconclusive as to whether this is required.) In particular, when a src register of a previous texture sample instruction is clobbered, the (ss) bit is needed to synchronize with the tex pipeline to ensure it has picked up the previous values before they are overwritten. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	e8cca57a3f	freedreno/a3xx/compiler: fix RA typo Was supposed to be a '+', otherwise we end up with a negative offset and choosing registers below the assigned range. This seems to fix the scheduling mystery "solved" by adding in extra delay slots. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	579473f8f8	freedreno/a3xx/compiler: handle kill properly (new compiler) Since 'kill' does not produce a result, the new compiler was happily optimizing them out. We need to instead track 'kill's similar to outputs. But since there is no non-predicated kill instruction, (and for flattend if/else we do want them to be predicated), we need to track the topmost branch condition on the stack and use that as src arg to the kill. For a kill at the topmost level, we have to generate an immediate 1.0 to feed into the cmps.f for setting the predicate register. Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-16 08:17:23 -05:00
Rob Clark	554f1ac00c	freedreno/a3xx/compiler: new compiler The new compiler generates a dependency graph of instructions, including a few meta-instructions to handle PHI and preserve some extra information needed for register assignment, etc. The depth pass assigned a weight/depth to each node (based on sum of instruction cycles of a given node and all it's dependent nodes), which is used to schedule instructions. The scheduling takes into account the minimum number of cycles/slots between dependent instructions, etc. Which was something that could not be handled properly with the original compiler (which was more of a naive TGSI translator than an actual compiler). The register assignment is currently split out as a standalone pass. I expect that it will be replaced at some point, once I figure out what to do about relative addressing (which is currently the only thing that should cause fallback to old compiler). There are a couple new debug options for FD_MESA_DEBUG env var: optmsgs - enable debug prints in optimizer optdump - dump instruction graph in .dot format, for example: http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot.png http://people.freedesktop.org/~robclark/a3xx/frag-0000.dot At this point, thanks to proper handling of instruction scheduling, the new compiler fixes a lot of things that were broken before, and does not appear to break anything that was working before[1]. So even though it is not finished, it seems useful to merge it in it's current state. [1] Not merged in this commit, because I'm not sure if it really belongs in mesa tree, but the following commit implements a simple shader emulator, which I've used to compare the output of the new compiler to the original compiler (ie. run it on all the TGSI shaders dumped out via ST_DEBUG=tgsi with various games/apps): `163b6306b1` Signed-off-by: Rob Clark <robclark@freedesktop.org>	2014-02-03 18:26:53 -05:00

6 Commits