KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Connor Abbott	168c42290f	ir3: Don't calculate num_samp ourselves In addition to duplicating what core NIR does better, this was wrong for Vulkan, where it should be 0 as there are no non-bindless samplers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5519>	2020-06-17 14:36:50 +00:00
Connor Abbott	6f2981176d	ir3: Pass reserved_user_consts to ir3_shader_from_nir() ir3_shader_from_nir() calls ir3_optimize_nir(), which currently sets up the const state. However, we need to know the number of user consts reserved by the driver before setting up the const state, which means that this information needs to be passed into ir3_shader_from_nir() somehow rather than being set in the shader. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5500>	2020-06-17 13:13:05 +00:00
Rob Clark	680ca5b393	freedreno/ir3: add post-scheduler cp pass A pass to eliminate extra mov's from an array. We need to do this after scheduling so we know that there are not any potentially conflicting array writes between the original `mov` and its use(s). Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/2124 Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	a60d48a863	freedreno/ir3/cp: extract valid_flags We'll also need this in the postsched-cp pass. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	5f1f8f7b17	freedreno/ir3: delay test support for vectorish instructions Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	92d6eb4dd5	freedreno/ir3: add helpers to move instructions A bit cleaner than open coding the list manipulation. Plus I want to use it in the next patch, rather than adding more open coded list futzing. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	9eed0c6011	freedreno/ir3/delay: calculate delay properly for (rptN)'d instructions When a sequence of same instruction is encoded with repeat flag, destination registers are written on successive cycles. Teach the delay calculation about this. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	c3b30963dd	freedreno/ir3: add test for delay slot calculation Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	a69d28769a	freedreno/ir3/print: print (r) flag Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	cd376a1434	freedreno/ir3/legalize: don't allow (nopN) if (rptN) These two encodings are mutually exclusive. If the instruction is a vector(ish) `(rptN)` instruction, then we can't fold a `(nopN)` post- delay into it. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	f35f711c71	freedreno/ir3/cp: properly handle already-folded RELATIV In the `try_swap_mad_two_srcs()` case, valid_flags() gets called both for the src that we want to try to fold, and for the other src that we are trying to swap to make that possible. It can happen in the 2nd case that a RELATIV src has already been folded. Since `ssa()` returns non-null in both the `IR3_REG_SSA` and `IR3_REG_ARRAY` cases (in the latter case, it is the dependent array access that the current instruction cannot be moved ahead of), we need to explicitly check that the src reg we are looking at is still an SSA src. Reported-by: Jonathan Marek <jonathan@marek.ca> Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	1bee79996b	freedreno/ir3/validate: also check instr->address Verify that instructions which have a relative src and/or dest, have `instr->address`. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	f598786775	freedreno/sched: reset delay counters at start of block Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5280>	2020-06-16 20:56:15 +00:00
Rob Clark	28a14787c0	freedreno/ir3: don't rely on intr->num_components It is better to use `nir_intrinsic_dest_components()` which also handles the case of intrinsics with a fixed number of dest components. Somehow this starts showing up with a nir_serialize round-trip with shader-cache. But we really shouldn't have been relying on `intr->num_components` directly. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5371>	2020-06-16 02:48:18 +00:00
Rob Clark	1a33faea8c	freedreno/ir3: move the libdrm dependency out of shared code The only reason for this dependency was the fd_bo used for the uploaded shader. But this isn't used by turnip. Now that we've unified the cleanup path from gallium, it isn't hard to pull the fd_bo upload/free parts into ir3_gallium. This cleanup has the added benefit that the shader disk-cache will not have to deal with it. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5476>	2020-06-15 15:46:37 +00:00
Jonathan Marek	1d9e6e456a	freedreno/ir3: fix ir3_nir_move_varying_inputs ir3_nir_move_varying_inputs is broken when there is a load input outside of the first block which depends on the result of a previous load input. This simplification/rework avoids the problem, and should also be faster. Fixes this dEQP-VK test: dEQP-VK.pipeline.multisample_interpolation.offset_interpolate_at_pixel_center.128_128_1.samples_2 Signed-off-by: Jonathan Marek <jonathan@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5465>	2020-06-14 17:53:47 +00:00
Rob Clark	ee29c682fe	freedreno/ir3: limit pre-fetched tex dest Teach RA to setup additional interference to prevent textures fetched before the FS starts from ending up in a register that is too high to encode. Fixes mis-rendering in multiple playcanv.as webgl apps. Note that the regression was not actually 733bee57eb8's fault, but that was the commit that exposed the problem. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3108 Fixes: `733bee57eb` ("glsl: lower samplers with highp coordinates correctly") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>	2020-06-11 21:59:54 +00:00
Rob Clark	f80092dad2	freedreno/ir3: remove RA "q-values" optimization This is mainly the "piglit optimization" (ie, since piglit launches a separate process for each test). It was never wired up for a6xx, and makes register class setup unnecessarily complicated. Remove it to simplify the next patch. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>	2020-06-11 21:59:54 +00:00
Rob Clark	562aaea07c	freedreno/ir3: respect tex prefetch limits Refactor a bit the limit checking in the bindless case, and add tex/samp limit checking for the non-bindless case, to ensure we do not try to prefetch textures which cannot be encoded in the # of bits available. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>	2020-06-11 21:59:54 +00:00
Rob Clark	4cabc25fa4	freedreno/ir3: add debug code to print conflicting half-regs I keep re-typing this from time to time when debugging various things. Which is dumb. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5431>	2020-06-11 21:59:54 +00:00
Eric Anholt	0bacb280a8	freedreno/ir3: Handle cases where we decide not to lower UBO 0 loads. We advertize 4096 vec4s of GL uniform storage, but the HW can only store 512 vec4s in the const buffer. Closes: #3049 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>	2020-06-05 13:43:30 -07:00
Eric Anholt	e349f50279	freedreno/ir3: Drop the max_const on a6xx to 512. The GLES blob on the p3a limits constlen to 512 between VS and FS across a6xx gpu ids (615, 630, 640, and 650). Experimentally, exceeding that limit in any one stage results in rendering corruption or GPU hangs (though my most detailed testing had a loop limit in a uniform, so that may be the cause of the hang). Clamp the limit we use inside of a shader so we don't exceed it within a stage. This commit doesn't resolve limiting inter-stage. Experimentally, I've found that I can push up to a total of ~768 vec4s between VS and FS on a630, with or without uniform updates between each draw. We'll need to do some shader key-based limiting of constlen at draw time to respect that limit, but that's left for future work, and this commit is enough for the google earth case that initiated this work. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>	2020-06-05 13:36:29 -07:00
Eric Anholt	486b894307	freedreno/ir3: Account for driver params in UBO max const upload. The const state setup needs to be able to push its driver params, so account for them in the analyze_ubo_ranges. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>	2020-06-05 13:36:29 -07:00
Eric Anholt	a25347ab92	freedreno/ir3: Stop shifting UBO 1 down to be UBO 0. It turns out the GL uniforms file is larger than the hardware constant file, so we need to limit how many UBOs we lower to constbuf loads. To do actual UBO loads, we'll need to be able to upload UBO 0's pointer or descriptor. No difference on nohw 1 UBO update drawoverhead case (n=35). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>	2020-06-05 13:36:29 -07:00
Eric Anholt	9e58ab09ff	freedreno/ir3: Drop unnecessary alignment of pushed UBO size. The analysis pass gives us vec4-aligned size, and all of our other constbuf allocations here are in vec4 units, so we can just divide by 16. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>	2020-06-05 13:36:29 -07:00
Eric Anholt	07ec745014	freedreno/ir3: Stop pushing immediates once we've filled the constbuf. If we filled the constbuf up with UBOs, we may need to avoid generating more immediate push constants. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>	2020-06-05 13:36:29 -07:00
Eric Anholt	ab29f2da42	freedreno/ir3: Refactor ir3_cp's lower_immed(). There was duplicated handling in the callers that we can just move inside. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5273>	2020-06-05 13:36:29 -07:00
Rob Clark	ebcf3545db	freedreno/ir3: split kill from no_earlyz Unlike other conditions which prevent early-discard of fragments, kill does not prevent early LRZ test. Split `has_kill` from `no_earlyz` so we can take advantage of this. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5298>	2020-06-04 02:34:54 +00:00
Timothy Arceri	04dbf709ed	nir: add callback to nir_remove_dead_variables() This allows us to do API specific checks before removing variable without filling nir_remove_dead_variables() with API specific code. In the following patches we will use this to support the removal of dead uniforms in GLSL. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4797>	2020-06-03 02:22:23 +00:00
Dylan Baker	a8e2d79e02	meson: use gnu_symbol_visibility argument This uses a meson builtin to handle -fvisibility=hidden. This is nice because we don't need to track which languages are used, if C++ is suddenly added meson just does the right thing. Acked-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4740>	2020-06-01 18:59:18 +00:00
Kristian H. Kristensen	f4e64e9f53	freedreno/ir3: Avoid {0} initializer for struct reginfo First element is not a scalar. Just initialize the struct like we do elsewhere. src/freedreno/ir3/disasm-a3xx.c:958:33: warning: suggest braces around initialization of subobject [-Wmissing-braces] Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5174>	2020-05-26 12:46:18 -07:00
Eric Anholt	5ec3747fbe	freedreno/ir3: Use RESINFO for a6xx image size queries. The closed GL driver uses resinfo on images with the writeonly flag (using the texture-path's getsize only for readonly images). The closed vulkan driver seems to use resinfo regardless. Using resinfo doesn't need any fixups after the instruction. It also avoids one of the needs for the TEX_CONST state for the image, which is awkward to set up in the GL driver. The new handler goes into ir3_a6xx to be next to the other current image code, but the a4xx version is left in place because it wants a bunch of sampler helpers. Fixes assertion failure in dEQP-VK.image.image_size.buffer.readonly_32. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>	2020-05-26 18:17:46 +00:00
Eric Anholt	2ec4c53ef9	freedreno/ir3: Move handle_bindless_cat6 to compiler_nir and reuse. There was an open coded version for ldc, and now we can drop that. I needed to do it for resinfo as well. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>	2020-05-26 18:17:46 +00:00
Eric Anholt	2068b01430	freedreno/ir3: Refactor out IBO source references. All the users of the unsigned result just wanted an ir3_instruction to reference. Move a6xx's helpers to ir3_image.c and inline the old unsigned results version. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>	2020-05-26 18:17:46 +00:00
Eric Anholt	00b9099dd5	freedreno: Set the immediate flag in a4/a5xx resinfos. Noticed comparing our RESINFO asm to qcom's for the same test, and if I drop this bit their disasm switches from immediate to reg. ldgb seems to have the same behavior. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>	2020-05-26 18:17:46 +00:00
Eric Anholt	ae00da5ddb	freedreno: Fix resinfo asm, which doesn't have srcs besides IBO number. In the process, clarify what's going on with the LDC/LDIB case. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>	2020-05-26 18:17:46 +00:00
Eric Anholt	c1cb75678d	freedreno: Add more resinfo/ldgb testcases. Since I'm going to start using the resinfo opcode, make sure we can disasm the blob's instances of it that I've found. And, since resinfo disasm will impact ldgb on pre-a6xx, include some of those too. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>	2020-05-26 18:17:46 +00:00
Eric Anholt	5d4a911d8c	freedreno: Fix printing of unused src in disasm of cat6 RESINFO. Compare to QC's disasm right next to ours, and we clearly had an extra src that wouldn't make sense. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3501>	2020-05-26 18:17:46 +00:00
Rob Clark	3c355f1ae8	freedreno/ir3/validate: add checking for types and opcodes For cases where instructions have a src and/or dst type, validate that it matches the src/dst register types. And for cases where there are different opcodes for half vs full, validate that the opcode matches. Now that we maintain this properly throughout the stages of the ir, we can drop the fixups from the RA pass. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	f484d63617	freedreno/ir3: add helpers to deal with src/dst types Add some helpers to properly maintain src/dst types, and in the cases where opcode depends on src or dst type, maintain that as well. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	3561d34fff	freedreno/ir3: add simple validate pass We can add to this as we notice other things that are worth validating between ir3 passes. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	554f3d54ca	freedreno/ir3: fix mismatched wrmask for overlapping VS inputs Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	16cd232dbc	freedreno/ir3/cp: fix cmps folding When we start doing cp iteratively, we hit the case that we've already `cmps.s.*` into a `cmps.s.ne p0.x, ...`.. when we try to do that again we can invert the logic condition. So check specifically the condition to prevent this. TODO we could maybe be more clever about this to combine conditions. But why isn't that happening in nir? For example, see dEQP-GLES31.functional.ssbo.layout.single_basic_array.packed.bool Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	39de27d3b9	freedreno/ir3/print: print cat2 condition Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	7b86b5ed7d	freedreno/ir3: fix immed type in create_addr0() We can also remove a bunch of manual src/dst flag munging, since the instruction builders handle this automatically now. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	3474ba53b5	freedreno/ir3/cf: handle multiple cov's properly There can be multiple (for ex.) f32f16's from a single source, in particular appearing in different blocks. We need to update all uses of the src which had conversion folded in, not all the uses of the individual cov. Also, to avoid invalidating the ssa use info that was gathered at the beginning of the pass, don't actually eliminate the cov, but instead change it to a simple mov that the cp pass can gobble up. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	3db5d146e9	freedreno/ir3: fix mismatched flags on split We have to fixup the meta:split half flag, because `ir3_split_dest()` is called before we fixup the dest type. But we should fixup both the split src and dest, as well as the thing it is splitting. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	b24b6a8365	freedreno/ir3/group: fix for half-regs If we're inserting a mov to resolve a conflict between meta:collect's (ie. for .zyx type swizzles, etc), we should use the correct precision. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	fcfe5eff63	freedreno/ir3: make input/output iterators declare cursor ptr Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	c1d33eed41	freedreno/ir3: make foreach_ssa_src declare cursor ptr Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	65f604e3b3	freedreno/ir3: make foreach_src declare cursor ptr To match how the newer iterators work. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	599fd861d4	freedreno/ir3: be iterative It does pick up a few more cf/cp opportunities, according to shader-db. But don't think it will be measurable. But this will allow some future simplification to cp by pulling out its internal iteration. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	b828929ac9	freedreno/ir3: move where we preserve binning pass inputs For a6xx, since we use same VBO state for binning and VS, we need to preserve potentially unused inputs. This needs to be done before DCE. So move it before we add earlier DCE passes. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	d0cfc06a2c	freedreno/ir3: add IR3_PASS() macro Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	c9e5605720	freedreno/ir3/postsched: report progress Or do the easy thing and claim we always changed something. It is kinda hard and not worth the effort to determine for real. Also rip out unused error handling. This pass should never fail. And we weren't even actually checking the return. And while we're at it, switch over to taking the `struct ir3 *ir` instead of ctx, to standardize with the other passes. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	c953794cd6	freedreno/ir3/legalize: report progress It always does something. Just return true for IR3_PASS() Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	c3630c9d29	freedreno/ir3/group: report progress Not iterative, but this will let IR3_PASS() macro know if there are any changes to print. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	721147a05d	freedreno/ir3/deps: report progress Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	e4ecfde2dd	freedreno/ir3/cp: report progress Later when we do this pass iteratively, we can drop some of the internal iteration and just rely on this pass getting run until there is no more progress. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	372e466301	freedreno/cf: report progress Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	b6d121502d	freedreno/ir3/dce: report progress Eventually we'll pull the iteration out of the pass itself, but the first step is to just report progress. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	9beb2baaff	freedreno/ir3: juggle around ir3_debug_print() In a later patch, this will get folded into an IR3_PASS() macro, at least for most passes. But to do that, it is better to standardize on printing the ir3 after the pass. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Rob Clark	947aa23eff	freedreno/ir3: remove Sethi-Ullman numbering pass We haven't used this for a while. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5048>	2020-05-19 16:06:17 +00:00
Ilia Mirkin	b5accb3ff9	freedreno/a3xx: parameterize ubo optimization A3xx apparently has higher alignment requirements than later gens for indirect const uploads. It also has fewer of them. Add compiler parameters for both settings, and set accordingly for a3xx and a4xx+. This fixes all the ubo test failures caused by this optimization. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5077>	2020-05-17 19:51:40 -04:00
Ilia Mirkin	9048adbd24	freedreno/ir3: avoid applying (sat) on bary.f This causes failures on a3xx resulting in the non-sensical dEQP failures on packUnorm2x16. The same test uses ldlv on a4xx+, so just disallow (sat) on bary.f on all generations. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5074>	2020-05-17 21:17:57 +00:00
Connor Abbott	2a9d12d513	ir3: Fixup dual-source blending slot The hardware expects that where MRT0 and MRT1 would normally go are the dual sources for MRT0, whereas GLSL has an extra "index" parameter that indicates which source it is. Remap it when handling FS outputs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5039>	2020-05-14 18:15:31 +00:00
Rob Clark	cf21b76383	freedreno/ir3: use lower_wrmasks pass Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2020-05-13 20:24:53 -07:00
Rob Clark	a506d49fae	nir: add helper to copy const_index[] It seems less brittle to not assume they are in the same order for src and dst instructions. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2020-05-13 20:24:45 -07:00
Rob Clark	ea6b404294	freedreno/ir3: use const_index accessors Cleans up a couple spots that were still open-coding this. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Eric Anholt <eric@anholt.net>	2020-05-13 20:24:38 -07:00
Kristian H. Kristensen	14969aab11	freedreno/ir3: Drop wrmask for ir3 local and global store intrinsics These intrinsics are supposed to map to the underlying hardware instructions, which don't have wrmask. We use them when we lower store_output in the geometry pipeline and since store_output gets lowered to temps, we always see full wrmasks there.	2020-05-13 20:24:33 -07:00
Eric Anholt	112c65825f	freedreno/a6xx: Use LDC for UBO loads. It saves addressing math, but may cause multiple loads to be done and bcseled due to NIR not giving us good address alignment information currently. I don't have any workloads I know of using non-const-uploaded UBOs, so I don't have perf numbers for it. This makes us match the GLES blob's behavior, and turnip (other than being bindful). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	ab93a631b4	freedreno: Trim num_ubos to just the ones we haven't lowered to constbuf. With the upcoming LDC usage in the GL driver, we don't want to be uploading descriptors for every UBO when they aren't actually in use. Trimming NIR's num_ubos will avoid that, and cleans up num_ubo handling elsewhere right now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	d5176c453e	freedreno/ir3: Move i/o offset lowering after analyze_ubo_ranges. I found that when moving more UBOs to load_ubo_ir3, analyze_ubo_ranges would move things back in a broken way. We can just run this pass later and drop the _ir3 path. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	5387c27140	freedreno/ir3: Leave the cursor alone during ir3_nir_try_propagate_bit_shift. Otherwise, we might end up inserting the nir_intrinsic_load_ubo_ir3() after the non-offset src's definition, leading to nir_validate() failures. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Eric Anholt	e0a4d1c4e5	freedreno/ir3: Clean up a silly nir_src_for_ssa(src.ssa). Just copy the src through. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4858>	2020-05-14 00:10:43 +00:00
Rob Clark	d6706fdc46	freedreno/ir3/sched: try to avoid syncs Similar to what we do in postsched. It is useful for pre-RA sched to be a bit aware of things that would cause syncs. In particular for the tex fetches, since the vecN src/dst tends to limit postsched's ability to re-order them. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	d95a6e3a0c	freedreno/ir3/sched: avoid scheduling outputs If an instruction's only use is as an output, and it increases register pressure, then try to avoid scheduling it until there are no other options. A semi-common pattern is `fragcolN.a = 1.0`, this pushes all these immed loads to the end of the shader. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	488cf208d5	freedreno/ir3/postsched: try to avoid (sy) syncs Similar to avoidance of `(ss)` syncs, it turns out to be helpful to avoid `(sy)` syncs as well. This helps us turn a tex, (sy)alu, tex, (sy)alu sequence into tex, tex, (sy)alu, alu, which is a big win in gfxbench gl_fill2. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	25f4fb346e	freedreno/ir3/postsched: reset sfu_delay on sync Once we schedule an instruction that will require an `(ss)` sync flag, there is no need to delay any further instructions that consume an SFU result (until the next SFU instruction is scheduled). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	f351e1d137	freedreno/ir3: limit # of tex prefetch by shader size It seems for short frag shaders, too much prefetch can be detrimental. I think what we really want to do is decide after pre-RA sched, when we also know about nop's and what the actual ir3 instruction count is. But that will require re-working how prefetch lowering works. For now this is a super crude heuristic to attempt to approximate a good solution. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4923>	2020-05-13 03:28:40 +00:00
Rob Clark	d69f6fd852	freedreno/ir3: fix indirect cb0 load_ubo lowering We can no longer assume that `state->ranges[0]` is block 0. It often is, but when we encounter a "real" ubo that we lower to `load_uniform` before a block 0 `load_ubo`, it could end up another entry in the table. Resulting in the second pass after gathering ubo ranges, not finding a valid range. Which results in a `load_ubo` for a thing that is not actually a ubo making its way into ir3 frontend. Resulting in grabbing what we think is a ubo address out of some unrelated const register, and trying to dereference that. Which as you can imagine, fails in amusing ways. Fixes: `fc850080ee` ("ir3: Rewrite UBO push analysis to support bindless") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>	2020-05-12 23:51:46 +00:00
Rob Clark	c4dc877cb5	freedreno/ir3: don't allow negative const_offset Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4954>	2020-05-12 23:51:46 +00:00
Eric Anholt	f789c5975c	freedreno: Fix non-constbuf-upload UBO block indices and count. The nir_analyze_ubo_ranges pass removes all UBO block 0 loads to reverse what nir_lower_uniforms_to_ubo() had done, and we only upload UBO pointers to the HW for UBO block 1-N, so let's just fix up the shader state. Fixes an off by one in const state layout setup, and some really dodgy register addressing trying to deal with dynamic UBO indices when the UBO pointers happen to be at the start of the constbuf. There's no fixes tag, though this fixes a bug from September, because it would require the num_ubos fix in nir_lower_uniforms_to_ubo. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4992>	2020-05-12 17:01:55 +00:00
Eric Anholt	554b959df0	freedreno: Replace OUT_RELOCD with permanently flagging shader BOs for it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4967>	2020-05-12 16:30:57 +00:00
Hyunjun Ko	094c7646a3	freedreno,tu: Don't request fragcoord components not being read. v1. Replace the existing bool type with a new bitfield and edit register files to take a mask instead of duplicating code to do masking. v2. Use fragcoord_compmask != 0 instead of fragcoord_compmask > 0 since it represents a bitfield. Tested with dEQP-VK.glsl.builtin_var.simple.fragcoord_xyz/w dEQP-GLES2.functional.shaders.builtin_variable.fragcoord_xyz/w Closes: #2680 Signed-off-by: Hyunjun Ko <zzoon@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4723>	2020-05-08 17:45:03 +00:00
Eric Anholt	9a6bbf4c80	freedreno/ir3: Disable sin/cos range reduction for mediump. robclark noted that the blob wasn't doing range reduction in the mediump case, and I confirmed it on dEQP-GLES3.functional.shaders.operator.angle_and_trigonometry.sin.mediump_float_fragment vs dEQP-GLES3.functional.shaders.operator.angle_and_trigonometry.sin.highp_float_fragment. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4893>	2020-05-05 17:23:34 +00:00
Eric Anholt	5c81f51c3c	freedreno/ir3: Define the bindful uniform/nonuniform desc modes for cat6 a6xx. These come from the disasm tests, and fix our disasm of blob's uniform/nonuniform cat6 operands. We also now include human-readable names for all the modes we know about (though bindless gets distinguished by its .baseN, like Connor's original disasm). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>	2020-05-04 11:15:50 -07:00
Eric Anholt	97b21110b8	freedreno/ir3: Sync some new changes from envytools. With this I also brought in a few new control flow instruction disasm tests that I'd made back when I wrote the disasm test, but which were too far from correct to include until now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>	2020-05-04 11:14:46 -07:00
Eric Anholt	1e5b0c92c5	freedreno/ir3: Add some more tests of cat6 disasm. I put these together from traces I had while trying to do LDC for GL. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4857>	2020-05-04 11:14:46 -07:00
Eric Anholt	29f58cfbd0	freedreno/ir3: Set up outputs for multi-slot varyings. Necessary to avoid compiler assertion failures in: dEQP-GLES31.functional.program_interface_query.program_output.type.interface_blocks.out.named_block_explicit_location.struct.mat3x2 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	88dcfaf0ee	freedreno/ir3: Stop initializing regid of so->outputs during setup. It's unused and overwritten by ir3_compile_shader_nir(). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	8c1c218909	freedreno/ir3: Improve shader key normalization. We can remove a bunch of conditional code at key comparison time by computing a bitmask of used key bits at ir3_shader creation time. This also gives us a nice place to put additional key simplification to reduce how many variants we create (like skipping rastflat if we don't read colors in the FS, or skipping vclamp_color if we don't write colors). It does mean walking the whole key to AND it, but the key is just 28 bytes so far so that seems pretty fine. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	6f1e3235f2	freedreno: Emit debug messages when doing draw-time recompiles of shaders. Right now that's "always" unless you have shaderdb set. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	a361567c46	freedreno/ir3: Remove unused half precision shader key flag. The code using it was removed in `4af86bd0b9` ("freedreno/ir3: remove half-precision output") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	05be0659fe	freedreno: Fix assertion failures on GS/tess shaders with shader-db enabled. We weren't filling in the tess mode of the key, or setting has_gs on GS shaders, resulting in assertion failures when NIR intrinsics didn't get lowered. We have to make a guess at prim mode for TCS, but it should be better to have some shader-db coverage than none, and it will avoid these failures happening when we start precompiling shaders. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	f91e49ee29	freedreno/ir3: Skip tess epilogue if the program is missing stores. Some of the negative API tests make shaders for tess stages that don't do all the stores they need to. Once we start precompiling (or doing shader-db of tess), we need to at least not segfault when generating them. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Eric Anholt	b420d04e1f	freedreno/ir3: Fix register allocation assertion failures. We were failing to tell the allocator about the restriction that scalar texture instructions (allocated as scalar regs) couldn't be allocated such that the start of the full unwritemasked vector started before r0. There was a patch in select_reg_callback on a6xx that tried to work around that, but you could still end up backed into a corner you shouldn't be because we didn't tell the RA what it needed. Fixes compiler assertion failures on a300-a400's blit_z shader, used for Z32F gmem blits. Looks like as a result we get tighter register allocation but more nops: instructions in affected programs: 757945 -> 760356 (0.32%) nops in affected programs: 317983 -> 320468 (0.78%) non-nops in affected programs: 27525 -> 27451 (-0.27%) mov in affected programs: 3098 -> 3023 (-2.42%) dwords in affected programs: 109664 -> 110656 (0.90%) last-baryf in affected programs: 112701 -> 112847 (0.13%) full in affected programs: 4326 -> 4011 (-7.28%) sstall in affected programs: 120550 -> 120836 (0.24%) (ss) in affected programs: 13939 -> 13918 (-0.15%) (sy) in affected programs: 3006 -> 2786 (-7.32%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:32 +00:00
Kristian H. Kristensen	73f34e0d46	freedreno/ir3: Drop hack to clean up split vars When the GS lowering was working on store_output intrinsics, we had to clean up the split vars to avoid getting confused. Now that we shadow the output vars instead, there's no confusion and we can drop this hack. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:31 +00:00
Kristian H. Kristensen	dd8d257a30	freedreno/ir3: Lower GS builtins before lowering IO We mostly got away with replacing a store_output with a store_var, but for complex types like structs, that doesn't work. Once the IO has been lowered from vars to intrinsics, we've lost the deref chains and can't properly shadow the outputs. This commit moves the GS lowering up so we do it before the output variables get lowered to store_output. This way the pass works much like nir_lower_io_to_temporaries() and cleanly shadows the outputs. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:31 +00:00
Kristian H. Kristensen	79355fd901	freedreno/ir3: Add ir3_nir_lower_to_explicit_input() pass This pass lowers per-vertex input intrinsics to load_shared_ir3. This was open coded in the TCS and GS lowering passes before - this way we can share it. Furthermore, we'll need to run the rest of the GS lowering earlier (before lowering IO) so we need to split off this part that operates on the IO intrinsics first. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4562>	2020-05-01 16:26:31 +00:00
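
The key normalization described in `8c1c218909` (compute a mask of used key bits once at shader creation, then AND the whole key with it before comparing) can be sketched roughly as below. This is only an illustration under stated assumptions, not the Mesa code: the struct layout, the field names (`rasterflat`, `vclamp_color`, `ucp_enables`) and the helpers (`shader_init_key_mask`, `key_normalize`, `key_equal`) are hypothetical stand-ins, and the real key is larger.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified key: all-byte fields so there is no padding
 * and memcmp() is a valid equality test. Not the real ir3 layout. */
struct shader_key {
    uint8_t rasterflat;    /* only matters if the FS reads colors */
    uint8_t vclamp_color;  /* only matters if the shader writes colors */
    uint8_t ucp_enables;   /* assumed to always matter in this sketch */
    uint8_t reserved;      /* never used, always masked off */
};

/* Hypothetical shader object holding the precomputed mask. */
struct shader {
    bool reads_colors;
    bool writes_colors;
    struct shader_key key_mask;
};

/* Done once at shader creation: mark which key bytes can affect codegen. */
static void shader_init_key_mask(struct shader *s)
{
    memset(&s->key_mask, 0, sizeof(s->key_mask));
    if (s->reads_colors)
        s->key_mask.rasterflat = 0xff;
    if (s->writes_colors)
        s->key_mask.vclamp_color = 0xff;
    s->key_mask.ucp_enables = 0xff;
}

/* Walk the whole key and AND it with the mask, byte by byte. */
static void key_normalize(struct shader_key *out,
                          const struct shader_key *in,
                          const struct shader_key *mask)
{
    const uint8_t *src = (const uint8_t *)in;
    const uint8_t *msk = (const uint8_t *)mask;
    uint8_t *dst = (uint8_t *)out;

    for (size_t i = 0; i < sizeof(*out); i++)
        dst[i] = src[i] & msk[i];
}

/* With normalized keys, comparison needs no per-field conditionals. */
static bool key_equal(const struct shader_key *a, const struct shader_key *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

In this sketch, two draw-time keys that differ only in a field the shader never looks at normalize to the same bytes, so they would map to the same variant instead of triggering another compile.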

609 Commits