KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Eric Engestrom	2c67457e5e	util/list: rename LIST_ENTRY() to list_entry() This follows the Linux kernel convention, and avoids collision with macOS header macro. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6751 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6840 Cc: mesa-stable Signed-off-by: Eric Engestrom <eric@igalia.com> Acked-by: David Heidelberg <david.heidelberg@collabora.com> Reviewed-by: Yonggang Luo <luoyonggang@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17772>	2022-07-28 10:10:44 +00:00
Marek Olšák	c9ca8abe4f	Change all debug_assert calls to assert Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Acked-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17403>	2022-07-10 00:50:35 +00:00
Connor Abbott	c601ba332b	ir3/sched: Fix could_sched() determination This needs to be accurate so that when we split and then schedule a new a0.x/a1.x/p0.x write we will eventually make progress. It wasn't taking the kill_path into account which could create an infinite loop as we keep scheduling writes whose uses are blocked because they are memory instructions not on the kill_path. Closes: #6413 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16635>	2022-06-22 10:09:13 +00:00
Emma Anholt	d60282f5d2	freedreno/ir3: Make sched nodes before adding deps. The mark_kill_path() during dep setup follows SSA srcs, which when a phi is involved may include a def from later in the same block, that we hadn't created yet. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15982>	2022-04-19 18:45:29 +00:00
Connor Abbott	9cc42242d5	ir3/sched: Support multiple destinations Note: this is a behavior change for arrays, because it will count the entire array instead of just the components written in the register pressure calculation. However this is more accurate since this matches how RA works. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14107>	2022-03-10 17:15:29 +00:00
Connor Abbott	e6b35d606d	ir3/sched: Rename tex/sfu to sy/ss This now covers e.g. cat6 instructions as well, and ss will cover instructions writing shared regs as well. This is split out from the previous change to avoid too much churn and shouldn't cause any functional changes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>	2022-01-07 14:26:08 +00:00
Connor Abbott	0cc4aca345	ir3: Use new (sy)/(ss) stall helpers in the compiler This fixes a few bad assumptions in the pre-RA and post-RA scheduler, for example that (sy) is only for texture instructions and (ss) is only for SFU instructions and (sy) and (ss) producers will always take the same number of cycles. This means we now start doing latency hiding for cat6 instructions like ldib and ldc. It also should make us hide latency more aggressively, since the number used for (sy) stall cycles was way lower than the real numbers for everything except ldc. Finally it unifies the various places (ss) soft nops were calculated. selected shader-db results: total nops in shared programs: 345278 -> 358959 (3.96%) nops in affected programs: 215622 -> 229303 (6.34%) helped: 690 HURT: 2430 helped stats (abs) min: 1 max: 125 x̄: 11.40 x̃: 5 helped stats (rel) min: 0.53% max: 100.00% x̄: 24.19% x̃: 18.52% HURT stats (abs) min: 1 max: 501 x̄: 8.87 x̃: 5 HURT stats (rel) min: 0.00% max: 9900.00% x̄: 52.36% x̃: 14.29% 95% mean confidence interval for nops value: 3.78 4.99 95% mean confidence interval for nops %-change: 28.21% 42.66% Nops are HURT. total mov in shared programs: 75049 -> 74110 (-1.25%) mov in affected programs: 15754 -> 14815 (-5.96%) helped: 566 HURT: 455 helped stats (abs) min: 1 max: 36 x̄: 4.52 x̃: 3 helped stats (rel) min: 0.83% max: 100.00% x̄: 35.85% x̃: 30.00% HURT stats (abs) min: 1 max: 35 x̄: 3.55 x̃: 3 HURT stats (rel) min: 0.00% max: 1100.00% x̄: 63.60% x̃: 25.00% 95% mean confidence interval for mov value: -1.25 -0.58 95% mean confidence interval for mov %-change: 2.92% 14.02% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). 
total last-baryf in shared programs: 80468 -> 67670 (-15.90%) last-baryf in affected programs: 63676 -> 50878 (-20.10%) helped: 309 HURT: 147 helped stats (abs) min: 1 max: 260 x̄: 49.20 x̃: 24 helped stats (rel) min: 0.60% max: 98.81% x̄: 37.92% x̃: 40.91% HURT stats (abs) min: 1 max: 115 x̄: 16.35 x̃: 12 HURT stats (rel) min: 0.96% max: 1933.33% x̄: 45.55% x̃: 7.89% 95% mean confidence interval for last-baryf value: -33.03 -23.10 95% mean confidence interval for last-baryf %-change: -21.52% -0.50% Last-baryf are helped. total sstall in shared programs: 133997 -> 126398 (-5.67%) sstall in affected programs: 86866 -> 79267 (-8.75%) helped: 1893 HURT: 598 helped stats (abs) min: 1 max: 77 x̄: 6.06 x̃: 4 helped stats (rel) min: 0.71% max: 100.00% x̄: 32.82% x̃: 16.67% HURT stats (abs) min: 1 max: 65 x̄: 6.47 x̃: 6 HURT stats (rel) min: 0.00% max: 900.00% x̄: 65.51% x̃: 25.00% 95% mean confidence interval for sstall value: -3.39 -2.71 95% mean confidence interval for sstall %-change: -12.19% -6.24% Sstall are helped. total systall in shared programs: 350304 -> 288234 (-17.72%) systall in affected programs: 234855 -> 172785 (-26.43%) helped: 1456 HURT: 260 helped stats (abs) min: 1 max: 574 x̄: 46.42 x̃: 27 helped stats (rel) min: 0.19% max: 100.00% x̄: 39.43% x̃: 36.06% HURT stats (abs) min: 1 max: 757 x̄: 21.20 x̃: 8 HURT stats (rel) min: 0.00% max: 180.95% x̄: 24.82% x̃: 12.50% 95% mean confidence interval for systall value: -39.31 -33.03 95% mean confidence interval for systall %-change: -31.49% -27.90% Systall are helped. 
total waves in shared programs: 236732 -> 235142 (-0.67%) waves in affected programs: 6142 -> 4552 (-25.89%) helped: 535 HURT: 17 helped stats (abs) min: 2 max: 8 x̄: 3.08 x̃: 2 helped stats (rel) min: 12.50% max: 75.00% x̄: 28.78% x̃: 25.00% HURT stats (abs) min: 2 max: 6 x̄: 3.53 x̃: 4 HURT stats (rel) min: 16.67% max: 75.00% x̄: 37.35% x̃: 33.33% 95% mean confidence interval for waves value: -3.04 -2.72 95% mean confidence interval for waves %-change: -28.10% -25.39% Waves are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>	2022-01-07 14:26:08 +00:00
Connor Abbott	23a5f1a5ac	ir3: Stop inserting nops during scheduling Not necessary since nothing uses it anymore. This might have a slight effect on spilling with multiple blocks, but no shader-db difference because nothing spills. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>	2021-11-17 13:41:47 +00:00
Connor Abbott	d9a91318b1	ir3/sched: Rewrite delay handling The old code walked the instructions between each ready instruction and each of its parents for every instruction, which can quickly become accidentally quadratic. Instead we keep track of the current "instruction pointer" of the to-be-scheduled instruction, and for each ready instruction calculate an "earliest possible IP" which is the IP that needs to be reached before we can schedule it. Because this stays constant as soon as an instruction becomes ready, we never have to recompute it and each call to ir3_delay_calc_prera() becomes a simple comparison and subtract. We only need to iterate over the children and update their earliest_ip when scheduling an instruction, and we already do that in util_day_prune_head() so it should be cheap. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>	2021-11-17 13:41:47 +00:00
Connor Abbott	508f917d8c	util/dag: Make edge data a uintptr_t Nobody was actually using it as a pointer, and I'm going to introduce a shared function which relies on it not being a pointer so let's fix this once and for all. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13722>	2021-11-17 13:41:47 +00:00
Rob Clark	344683c932	freedreno/ir3: Fix sched debug msgs Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12923>	2021-09-18 20:24:49 +00:00
Emma Anholt	bda26dfcfc	freedreno/ir3: Reduce choose_instr_dec() and _inc() overhead. If you didn't have a freed+ready instruction, you'd redo the live_effect and check_instr() logic multiple times per instr. Replace the multiple loops in each function with a ranking that I think is more readable, reducing the overhead in the process. debugoptimized dEQP-GLES31.functional.ubo.random.all_per_block_buffers.20 runtime goes from ~3.5s -> ~3.0s on my lazor. No shader-db change. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11855>	2021-07-19 23:16:54 +00:00
Connor Abbott	177138d8cb	ir3: Reformat source with clang-format Generated using: cd src/freedreno/ir3 && clang-format -i {**,.}/*.c {**,.}/*.h -style=file Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11801>	2021-07-12 20:57:21 +00:00
Connor Abbott	17f7453d45	ir3: Add subgroup pseudoinstructions Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Connor Abbott	43e926a3af	ir3/sched: Handle branch condition in split_pred() Before this, if there was a block with multiple things writing p0.x, it was a tossup whether the right one would be used as the branch condition. Found by inspection. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Connor Abbott	bb3212dd4d	ir3: Fix infinite loop in scheduler when splitting When we go to split e.g. a p0.x producer, the only other instructions ready to schedule are often only p0.x producers. It could happen that they all have a lower priority than the split instruction. Then we would immediately schedule the split instruction again, then again try to schedule one of the other producers, be blocked, and split it, around and around again, leading to an infinite loop. The following commit triggered this with dEQP-GLES3.functional.shaders.discard.dynamic_loop_always on a3xx. Fixes: `d2f4d33` ("freedreno/ir3: new pre-RA scheduler") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6752>	2021-07-08 16:02:41 +00:00
Connor Abbott	ea325226d6	ir3: Add foreach_dst/foreach_dst_n And cleanup a few places I know of that are open-coding it Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11565>	2021-06-29 08:08:12 +00:00
Connor Abbott	9133999430	ir3/sched: Speed up live_effect If we've identified another use that isn't scheduled yet, we can break right away rather than iterating through all the other uses. While this could be optimized further, this simple change makes dEQP-VK.subgroups.ballot_broadcast.compute.subgroupbroadcast_ivec4 go from 40 seconds to 1.9 seconds on a release build according to my unscientific testing. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11613>	2021-06-28 16:26:24 +00:00
Connor Abbott	50994eeabf	ir3/sched: Convert to srcs/dsts arrays Also change the indexing in ir3_delayslots, so it's finally sane! To do this we also have to change foreach_ssa_src_n to index srcs instead of regs, so that the indexing stays in sync. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11469>	2021-06-23 17:20:29 +00:00
Connor Abbott	9af795d9b9	ir3: Make ir3_instruction::address a normal register This fixes an annoying mismatch in the indices between foreach_ssa_src_n and ir3_delayslots(), and lets us remove a bunch of other special cases. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11469>	2021-06-23 17:20:29 +00:00
Connor Abbott	b1a1de76e8	ir3/sched: Consider unused destinations when computing live effect If an instruction's destination is unused, then we shouldn't penalize it. For example, this helps us schedule atomic operations whose results aren't read. This works around RA failures when CSE is enabled in some robustness2 tests. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	ba8efeb7fa	ir3/sched: Make collects count against tex/sfu limits In a scenario where there are a lot of texture fetches with constant coordinates, this prevents the scheduler from scheduling all the setup instructions after the first group of textures has been scheduled because they are the only non-syncing thing and scheduling them didn't decrease tex_delay. Collects with immed/const sources will turn into moves of those sources, so we should treat them the same. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	8b15c2f30c	ir3/sched: Don't schedule collect early I don't think there was ever a good reason to do this, but when we start folding constants/immediates into collect, this can become actively harmful. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:24:06 -07:00
Connor Abbott	58d82add87	ir3: Rewrite delay calculation The old delay calculation relied on the SSA information staying around, and wouldn't work once we start introducing phi nodes and making "normal" values defined in multiple blocks not array regs anymore. What's worse is that properly inserting phi nodes when splitting live ranges would make that code even more complicated, and this was the last place post-RA that actually needed that information. The new version only compares the physical registers of sources and destinations. It works by going backwards up to a maximum number of cycles, so it might be slightly slower when the definition is closer but should be faster when it is farther away. To avoid complicating the new method, the old method is kept around, but only for pre-RA scheduling and it can therefore be drastically simplified as the array case can be dropped. ir3_delay_calc() is split into a few variants to avoid an explosion of boolean arguments in users, especially now that merged_regs now has to be passed to it. The new method is a little more complicated when it comes to handling (rptN), because both the assigner and consumer may be (rptN). This adds some unit tests for those cases, in addition to dropping the to-SSA code in the test harness since it's no longer needed. Finally, ir3_legalize has to be switched to using physical registers for the branch condition. This was the one place where IR3_REG_SSA remained after RA. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Connor Abbott	e1d7240576	ir3: Readd support for translating NIR phi nodes This is roughly based on the support removed a while ago, but it handles sources better by associating each source with a predecessor block. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9842>	2021-06-10 12:20:38 -07:00
Rob Clark	09f64f74db	freedreno/ir3: Fix use after free If the tex/sfu ssa src is from a different block than the one currently being scheduled, we do not have a valid sched-node. So fallback to previous behavior rather than dereference an invalid ptr. Fixes: `7821e5a3f8` ("ir3/sched: Don't penalize uses of already-waited tex/SFU") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10306>	2021-06-09 00:37:15 +00:00
Connor Abbott	3c8a5d7e17	ir3: Rework outputs Instead of using a separate outputs array, make the "end" instruction (or chmask) take the outputs as sources. This works better for the new RA, because it better models the fact that outputs are consumed all at the same time. With the old model, each output collect would be assumed dead after it was processed and subsequent collects could use it when inserting shuffle code, which wouldn't work, and the new RA also deletes collect instructions after lowering them to moves so the information would be gone after RA. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Connor Abbott	af7f29a78e	ir3/sched: Use correct src index Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10591>	2021-05-03 19:52:31 +00:00
Danylo Piliaiev	9dd9424a85	turnip: implement VK_EXT_shader_demote_to_helper_invocation The "demote" intrinsic has the semantics of D3D discard, which means it doesn't change the control flow, allowing derivatives to work. On A6xx there is no known way to check whether invocation was demoted, thus we use nir_lower_is_helper_invocation. Add "logical" OPC_DEMOTE which is later translated to "kill". Such separation is necessary to run "kill" specific optimizations which are invalid for "demote". Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>	2021-04-19 17:11:36 +00:00
Connor Abbott	2deead184c	ir3/sched: Don't schedule too many tex/SFU instructions Consider a simple loop that does a series of texture instructions and then reduces the results: vec4 sum = vec4(0); for (int i = 0; i < N; i++) { sum += texture(...); } Assume that the loop is unrolled and we schedule the resulting basic block. Right now, after we schedule the first texture instruction, the only instructions available to schedule that don't incur a sync are the instructions to setup the second texture instruction. So we keep picking the texture instructions, no matter how large N is, resulting in a pathological schedule for register pressure when N is very large: sum1 = texture(...); sum2 = texture(...); sum3 = texture(...); ... sum = sum1 + sum2 + sum3 + ...; In particular this happens with some CTS tests for VK_EXT_robustness2, where a loop like that with many iterations is marked as [[unroll]], forcing NIR to unroll it. This solution is a balance between the current approach and always scheduling for register pressure (and ignoring sync's). We only allow a certain number of texture fetches to be in flight before considering textures to "sync", even though they don't really, both because they likely will sync in reality (overflowing the internal queue of waiting texture instructions) and because at some point we need the normal algorithm to kick in and start lowering register pressure. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>	2021-04-14 17:33:58 +00:00
Connor Abbott	7821e5a3f8	ir3/sched: Don't penalize uses of already-waited tex/SFU Once we insert a use of a given tex or SFU instruction, then we must wait for that tex/SFU instruction (as well as all earlier ones) to complete, so we shouldn't penalize further uses, even if a subsequent tex/SFU instruction gets scheduled after the first use. This especially matters after the next commit when we start forcibly breaking up long sequences of texture instructions, since if we schedule a group of 8 texture instructions then we want to schedule the uses of those instructions in parallel with the next 8 texture instructions to reduce register pressure. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7571>	2021-04-14 17:33:58 +00:00
Danylo Piliaiev	72a9f315db	ir3: make mark_kill_path exit early if instr is already seen Would bring down its complexity in pathological cases. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9386>	2021-03-04 10:52:06 +00:00
Rob Clark	f598786775	freedreno/sched: reset delay counters at start of block Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	c1d33eed41	freedreno/ir3: make foreach_ssa_src declar cursor ptr Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	721147a05d	freedreno/ir3/deps: report progress Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	d6706fdc46	freedreno/ir3/sched: try to avoid syncs Similar to what we do in postsched. It is useful for pre-RA sched to be a bit aware of things that would cause syncs. In particular for the tex fetches, since the vecN src/dst tends to limit postsched's ability to re-order them. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	d95a6e3a0c	freedreno/ir3/sched: avoid scheduling outputs If an instruction's only use is as an output, and it increases register pressure, then try to avoid scheduling it until there are no other options. A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these immed loads to the end of the shader. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	9701008d64	freedreno/ir3/sched: awareness of partial liveness Realize that certain instructions make a vecN live, and account for this, in hopes of scheduling the remaining components of the vecN sooner. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Rob Clark	d2f4d332db	freedreno/ir3: new pre-RA scheduler This replaces the depth-first search scheduler with a more traditional ready-list scheduler. It primarily tries to reduce register pressure (number of live values), with the exception of trying to schedule kills as early as possible. (Earlier iterations of this scheduler had a tendency to push kills later, and in particular moving texture fetches which may not be necessary ahead of kills.) Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4440>	2020-04-13 20:47:28 +00:00
Connor Abbott	de7d90ef53	ir3: Plumb through support for a1.x This will need to be used in some cases for the upcoming bindless support, plus ldc.k instructions which push data from a UBO to const registers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>	2020-04-09 15:56:55 +00:00
Rob Clark	a0de0db0e4	freedreno/ir3: small cleanup and comments Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4272>	2020-03-27 22:41:36 +00:00
Rob Clark	b6eb11295a	freedreno/ir3: split out has_latency_to_hide() Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>	2020-03-10 16:01:39 +00:00
Rob Clark	64ae2ef8bb	freedreno/ir3: remove extra nops inserted in scheduler They were inserting a nop between back to back SFU instrucions. But that doesn't actually appear to be required. And they get stripped out later anyways before legalize. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4071>	2020-03-10 16:01:39 +00:00
Rob Clark	2cf4b5f29e	freedreno/ir3: track half-precision live values In schedule live value tracking, differentiate between half vs full precision. Half-precision live values are less costly than full precision. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3989>	2020-02-28 16:53:41 +00:00
Rob Clark	4353b3c1c5	freedreno/ir3: don't hide latency when there is none to hide Current scheduler thresholds try to ensure there are warps available to switch to when hiding texture fetch latency. But if there is none to hide, we should allow scheduler to use more registers to reduce nops. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3989>	2020-02-28 16:53:41 +00:00
Rob Clark	304b50c9f8	freedreno/ir3: move block-scheduling into legalize We want to do this only once. If we have post-RA sched pass, then we don't want to do it pre-RA. Since legalize is where we resolve the branch/jumps, we might as well move this into legalize. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	093c94456b	freedreno/ir3: move nop padding to legalize This way we can deal with it in one place, after all the blocks have been scheduled. Which will simplify life for a post-RA sched pass. This has the benefit of already taking into account nop's that legalize has to insert for non-delay related reasons. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	c803c662f9	freedreno/ir3: split out delay helpers We're going to want these also for a post-RA sched pass. And also to split nop stuffing out into it's own pass. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3569>	2020-02-01 02:40:22 +00:00
Rob Clark	3b8feefd9c	freedreno/ir3: add iterator macros So many open coded list iterators were getting annoying. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
Rob Clark	ad92aa36ac	freedreno/ir3: add scheduler traces Add some infrastructure to trace scheduler decisions. The next patch will add some more traces, just splitting this out to reduce clutter. Signed-off-by: Rob Clark <robdclark@chromium.org>	2019-12-13 09:25:40 -08:00
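
Commit d9a91318b1 ("ir3/sched: Rewrite delay handling") describes replacing a repeated walk over already-scheduled instructions with a per-instruction "earliest possible IP". The idea can be sketched in C roughly as below. This is a minimal illustration under stated assumptions, not the ir3 code: `struct sketch_node`, `sketch_delay()`, and `sketch_schedule()` are hypothetical names, and the rule that a child may issue at `ip + 1 + delay` is a simplifying assumption about how edge delays are counted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified scheduler node (not the ir3 data structure). */
struct sketch_node {
   unsigned earliest_ip;          /* IP that must be reached before scheduling */
   size_t num_children;           /* number of dependent instructions */
   struct sketch_node **children; /* dependents of this instruction */
   const unsigned *child_delay;   /* assumed delay cycles required per edge */
};

/* Remaining delay before n could issue at the current IP:
 * a single comparison and subtract, with no walk over earlier instructions. */
static unsigned
sketch_delay(const struct sketch_node *n, unsigned ip)
{
   assert(n != NULL);
   return n->earliest_ip > ip ? n->earliest_ip - ip : 0;
}

/* Schedule n at the current IP and push each child's earliest_ip forward.
 * Returns the IP of the next instruction slot. */
static unsigned
sketch_schedule(struct sketch_node *n, unsigned ip)
{
   assert(n != NULL);
   for (size_t i = 0; i < n->num_children; i++) {
      /* Assumption: the child may issue once child_delay[i] slots have
       * passed after the parent's own slot. */
      unsigned ready_at = ip + 1 + n->child_delay[i];
      if (ready_at > n->children[i]->earliest_ip)
         n->children[i]->earliest_ip = ready_at;
   }
   return ip + 1;
}
```

In this sketch the only per-instruction work at scheduling time is the loop over children, which matches the commit message's point that the children are already being visited when an instruction is scheduled.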

65 Commits