mirrors/mesa - Frog Git

Commit Graph

Author	SHA1	Message	Date
Alyssa Rosenzweig	7f6491b76d	nir: Combine if_uses with instruction uses Every nir_ssa_def is part of a chain of uses, implemented with doubly linked lists. That means each requires 2 * 64-bit = 16 bytes per def, which is memory intensive. Together they require 32 bytes per def. Not cool. To cut that memory use in half, we can combine the two linked lists into a single use list that contains both regular instruction uses and if-uses. To do this, we augment the nir_src with a boolean "is_if", and reimplement the abstract if-uses operations on top of that list. That boolean should fit into the padding already in nir_src so should not actually affect memory use, and in the future we sneak it into the bottom bit of a pointer. However, this creates a new inefficiency: now iterating over regular uses separate from if-uses is (nominally) more expensive. It turns out virtually every caller of nir_foreach_if_use(_safe) also calls nir_foreach_use(_safe) immediately before, so we rewrite most of the callers to instead call a new single `nir_foreach_use_including_if(_safe)` which predicates the logic based on `src->is_if`. This should mitigate the performance difference. There's a bit of churn, but this is largely a mechanical set of changes. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22343>	2023-04-07 23:48:03 +00:00
Alyssa Rosenzweig	b9cc2b2a98	pan/{mdg,bi}: Always use sampler 0 for txf Now that we upload workaround samplers for txf, sampler 0 is guaranteed to be valid but other samplers are not. So ignore whatever the current sampler_index value is (it's formally undefined in NIR) and use 0, which we know is valid. We already do this on Valhall for OpenCL, just need to generalize for Midgard and Bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Italo Nicola <italonicola@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22223>	2023-04-07 01:15:41 +00:00
Emma Anholt	3f2328c629	panfrost/midgard: Enable nir_lower_frexp. Needed for dropping the GLSL frontend lowering. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22083>	2023-04-06 02:32:01 +00:00
Emma Anholt	2a33ea95d6	glsl: Retire ldexp lowering in favor of the nir lowering flag. Compilers need to set the nir flag anyway for vulkan, so just pass ldexp through to NIR and let that handle it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/22083>	2023-04-06 02:32:00 +00:00
Alyssa Rosenzweig	ffb9919c2f	panfrost: Lower sysvals in GL Drop the backend compiler sysval handling in favour of the pass in the GL driver, bringing us into compliance with Ekstrand's rule. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>	2023-03-23 23:53:46 +00:00
Alyssa Rosenzweig	c65a9be421	panfrost: Preprocess shaders at CSO create time Now the only passes that depend on the shader key can run late, so we can preprocess ahead-of-time once and throw away the original shader. This reduces the cost of shader variants, as well as deduplicates some lowering for transform feedback shaders. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>	2023-03-23 23:53:46 +00:00
Alyssa Rosenzweig	2745daa05a	pan/lower_framebuffer: Lower MSAA blend shaders Do it explicitly in NIR rather than implicitly in the Midgard compiler. This avoids a nasty sideband input for the render target formats and sample count, for blend shaders on midgard only. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>	2023-03-23 23:53:46 +00:00
Alyssa Rosenzweig	ca2042f359	panfrost: Preprocess shaders in the driver This is a flag-day change to how we compile. We split preprocessing NIR into a separate step from compiling, giving the driver a chance to apply its own lowerings on the preprocessed NIR before the final optimization loop. During that time, the different producers of NIR (panfrost, panvk, blend shaders, blit shaders...) will be able to (differently) lower system values. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>	2023-03-23 23:53:46 +00:00
Alyssa Rosenzweig	8059eb1577	pan/lower_framebuffer: Only call for FS It doesn't make sense for shader stages other than fragment (and blend which is fragment-like), assert this. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>	2023-03-23 23:53:46 +00:00
Alyssa Rosenzweig	ee2a5d6bc6	pan/mdg: Split out early preprocessing from late To prepare for the new compile flow, where this will be called by the driver instead of internally in the compiler. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>	2023-03-23 23:53:46 +00:00
Alyssa Rosenzweig	924f68fe4b	pan/mdg: Only lower once Nothing in the optimization loop should remat the lowered instructions, so there's no need to do it inside the loop. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>	2023-03-23 23:53:46 +00:00
Alyssa Rosenzweig	edf24f1887	pan/mdg: Use I/O semantics for MRT blend stores This avoids the silly reliance on the sideband. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20906>	2023-03-23 23:53:45 +00:00
Alyssa Rosenzweig	b190d08a8a	pan/mdg: Remove reference to removed macro This will soon be more confusing than helpful. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20445>	2023-03-11 06:30:02 +00:00
Alyssa Rosenzweig	fc93e8e537	pan/mdg: Drop control_barrier handling Now unreachable. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21634>	2023-03-07 00:41:13 +00:00
Alyssa Rosenzweig	1d2c1b8bd6	pan/mdg: Use nir_lower_helper_writes It's now in common code, drop our (buggier) copy. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21413>	2023-03-04 13:31:05 -05:00
Alyssa Rosenzweig	586da7b329	nir: Add nir_lower_helper_writes pass This NIR pass lowers stores in fragment shaders to: if (!gl_HelperInvocation) { store(); } This implements the API requirement that helper invocations do not have visible side effects, and the lowering is required on any hardware that cannot directly mask helper invocation's side effects. The pass was originally written for Midgard (which has this issue) but is also needed for Asahi. Let's share the code, and fix it while we're at it. Changes from the Midgard pass: 1. Add an option to only lower atomics. AGX hardware can mask helper invocations for "plain" stores but not for atomics. Accordingly, the AGX compiler wants this lowering for atomics but not store_global. By contrast, Midgard cannot mask any stores and needs the lowering for all store intrinsics. Add an option to the common pass to accommodate both cases. This is an optimization for AGX. It is not required for correctness, this lowering is always legal. 2. Fix dominance issues. It's invalid to have NIR like if ... { ssa_1 = ... } foo ssa_1 Instead we need to rewrite as if ... { ssa_1 = ... } else { ssa_2 = undef } ssa_3 = phi ssa_1, ssa_2 foo ssa_3 By default, neither nir_validate nor the backends check this, so this doesn't currently fix a (known) real bug. But it's still invalid and fails validation with NIR_DEBUG=validate_ssa_dominance. Fix this in lower_helper_writes for intrinsics that return data (atomics). 3. Assert that the pass is run only for fragment shaders. This encourages backends to be judicious about which passes they call instead of just throwing everything in a giant lower everything spaghetti. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Italo Nicola <italonicola@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21413>	2023-03-04 13:31:05 -05:00
Emma Anholt	f16a23aa9d	panfrost/midgard: Drop redundant arg to emit_explicit_constant. Every caller passed the same value twice. Just reuse it? Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21476>	2023-02-28 06:13:05 +00:00
Emma Anholt	63aa5909b4	panfrost/midgard: Fix handling of csel with a vector constant condition. If it's not all true or all false, then you'll have a csel with a vector constant, and the backend failed to translate appropriately. Expand the constant to fix it. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21476>	2023-02-28 06:13:05 +00:00
Caio Oliveira	91fa939763	panfrost: Use NIR scoped barriers instead of memory barriers Now both GLSL and SPIR-V will produce the scoped barriers, so no need to handle the old ones. Control barriers are still present in some cases, so keep that for now. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3339>	2023-02-27 20:24:01 +00:00
Caio Oliveira	901bc6d53c	pan/midgard: Handle nir_intrinsic_scoped_barrier in Midgard compiler Behave the same as the existing more specific barrier intrinsics. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3339>	2023-02-27 20:24:01 +00:00
Alyssa Rosenzweig	44bdcb7214	panfrost: Use proper locations in blend shaders Rather than always blending to FRAG_RESULT_DATA0. This removes silly special cases in the compiler. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/21426>	2023-02-26 17:35:07 -05:00
Daniel Schürmann	2bb369dd8d	nir: add assertions that loops don't have a Continue Construct Hoping that I didn't miss any, this should add assertions to all functions and passes which explicitly handle 'nir_loop'. Acked-by: Faith Ekstrand <faith.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13962>	2023-02-21 10:41:11 +00:00
Alyssa Rosenzweig	63f30802eb	pan/lower_framebuffer: Operate on lowered I/O This turns the early pass into a late pass, which is important because it depends on the shader key and therefore should be called by the driver instead of the compiler preprocessing. It's also simpler this way. The shader key work is waiting for review in another merge request. In the mean time, this patch will let us run blend lowering early for blend shaders on Midgard. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20836>	2023-02-17 23:09:19 +00:00
Ian Romanick	ea413e826b	nir: Eliminate nir_op_f2b Builds on the work of !15121. This gets to delete even more code because many drivers shared a lot of code for i2b and f2b. No shader-db or fossil-db changes on any Intel platform. v2: Rebase on `1a35acd8d9`. v3: Update a comment in nir_opcodes_c.py. Suggested by Konstantin. v4: Another rebase. Remove f2b stuff from Midgard. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20509>	2023-02-03 22:39:57 +00:00
Alyssa Rosenzweig	f02354d3e2	pan/mdg: Remove MSGS debug These should all be unreachable and what's left is dead-code. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>	2023-01-16 22:20:43 +00:00
Alyssa Rosenzweig	23968aeeb5	pan/mdg: Scalarize LUT instructions in NIR Simpler. Small shaderdb regressions from using IR registers instead of SSA, but that's probably what we needed for correctness (given that SSA is violated otherwise) hence the Cc. total instructions in shared programs: 1520220 -> 1518127 (-0.14%) instructions in affected programs: 167437 -> 165344 (-1.25%) helped: 662 HURT: 206 helped stats (abs) min: 1.0 max: 46.0 x̄: 3.65 x̃: 2 helped stats (rel) min: 0.18% max: 22.22% x̄: 2.43% x̃: 1.71% HURT stats (abs) min: 1.0 max: 7.0 x̄: 1.56 x̃: 1 HURT stats (rel) min: 0.17% max: 8.33% x̄: 2.66% x̃: 2.33% 95% mean confidence interval for instructions value: -2.65 -2.18 95% mean confidence interval for instructions %-change: -1.45% -0.99% Instructions are helped. total bundles in shared programs: 649844 -> 649345 (-0.08%) bundles in affected programs: 59278 -> 58779 (-0.84%) helped: 577 HURT: 249 helped stats (abs) min: 1.0 max: 39.0 x̄: 1.56 x̃: 1 helped stats (rel) min: 0.26% max: 30.00% x̄: 3.13% x̃: 2.19% HURT stats (abs) min: 1.0 max: 12.0 x̄: 1.61 x̃: 1 HURT stats (rel) min: 0.58% max: 25.00% x̄: 5.25% x̃: 4.00% 95% mean confidence interval for bundles value: -0.78 -0.43 95% mean confidence interval for bundles %-change: -0.98% -0.23% Bundles are helped. total quadwords in shared programs: 1136767 -> 1134956 (-0.16%) quadwords in affected programs: 141780 -> 139969 (-1.28%) helped: 744 HURT: 311 helped stats (abs) min: 1.0 max: 9.0 x̄: 3.13 x̃: 2 helped stats (rel) min: 0.14% max: 26.67% x̄: 2.77% x̃: 2.13% HURT stats (abs) min: 1.0 max: 8.0 x̄: 1.68 x̃: 1 HURT stats (rel) min: 0.35% max: 10.00% x̄: 3.17% x̃: 1.69% 95% mean confidence interval for quadwords value: -1.89 -1.54 95% mean confidence interval for quadwords %-change: -1.27% -0.77% Quadwords are helped. 
total registers in shared programs: 90461 -> 90273 (-0.21%) registers in affected programs: 2833 -> 2645 (-6.64%) helped: 250 HURT: 82 helped stats (abs) min: 1.0 max: 2.0 x̄: 1.08 x̃: 1 helped stats (rel) min: 6.67% max: 33.33% x̄: 14.06% x̃: 12.50% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 6.67% max: 50.00% x̄: 13.90% x̃: 12.50% 95% mean confidence interval for registers value: -0.67 -0.47 95% mean confidence interval for registers %-change: -8.62% -5.69% Registers are helped. total threads in shared programs: 55685 -> 55686 (<.01%) threads in affected programs: 76 -> 77 (1.32%) helped: 20 HURT: 17 helped stats (abs) min: 1.0 max: 2.0 x̄: 1.30 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.47 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: -0.47 0.52 95% mean confidence interval for threads %-change: 5.81% 56.35% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 1387 -> 1379 (-0.58%) spills in affected programs: 283 -> 275 (-2.83%) helped: 5 HURT: 1 total fills in shared programs: 5256 -> 5176 (-1.52%) fills in affected programs: 557 -> 477 (-14.36%) helped: 5 HURT: 1 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>	2023-01-16 22:20:43 +00:00
Alyssa Rosenzweig	10759d1708	pan/mdg: Use special NIR ops for trig scaling Otherwise the lowering is fundamentally unsound due to incorrect constant folding, even though it worked by chance with the old pass ordering. We're about to change slightly the way we handle fsin/fcos, which was enough to trigger this unsoundness. shader-db results are mostly a toss-up. total instructions in shared programs: 1520675 -> 1520220 (-0.03%) instructions in affected programs: 96841 -> 96386 (-0.47%) helped: 397 HURT: 3 helped stats (abs) min: 1.0 max: 4.0 x̄: 1.15 x̃: 1 helped stats (rel) min: 0.22% max: 6.25% x̄: 1.15% x̃: 0.40% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.58% max: 2.08% x̄: 1.08% x̃: 0.58% 95% mean confidence interval for instructions value: -1.19 -1.08 95% mean confidence interval for instructions %-change: -1.26% -1.01% Instructions are helped. total bundles in shared programs: 650088 -> 649844 (-0.04%) bundles in affected programs: 31132 -> 30888 (-0.78%) helped: 229 HURT: 23 helped stats (abs) min: 1.0 max: 4.0 x̄: 1.21 x̃: 1 helped stats (rel) min: 0.49% max: 7.14% x̄: 1.28% x̃: 0.71% HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.48 x̃: 1 HURT stats (rel) min: 0.83% max: 8.33% x̄: 2.38% x̃: 1.85% 95% mean confidence interval for bundles value: -1.08 -0.86 95% mean confidence interval for bundles %-change: -1.15% -0.74% Bundles are helped. total quadwords in shared programs: 1137388 -> 1136767 (-0.05%) quadwords in affected programs: 71826 -> 71205 (-0.86%) helped: 367 HURT: 17 helped stats (abs) min: 1.0 max: 8.0 x̄: 1.80 x̃: 1 helped stats (rel) min: 0.31% max: 17.24% x̄: 2.27% x̃: 0.96% HURT stats (abs) min: 1.0 max: 6.0 x̄: 2.29 x̃: 2 HURT stats (rel) min: 0.44% max: 11.11% x̄: 2.18% x̃: 1.47% 95% mean confidence interval for quadwords value: -1.76 -1.47 95% mean confidence interval for quadwords %-change: -2.36% -1.78% Quadwords are helped. 
total registers in shared programs: 90483 -> 90461 (-0.02%) registers in affected programs: 890 -> 868 (-2.47%) helped: 67 HURT: 44 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 8.33% max: 25.00% x̄: 10.52% x̃: 9.09% HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.02 x̃: 1 HURT stats (rel) min: 9.09% max: 50.00% x̄: 31.15% x̃: 33.33% 95% mean confidence interval for registers value: -0.39 -0.01 95% mean confidence interval for registers %-change: 1.75% 10.25% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total threads in shared programs: 55694 -> 55685 (-0.02%) threads in affected programs: 21 -> 12 (-42.86%) helped: 1 HURT: 5 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: -2.79 -0.21 95% mean confidence interval for threads %-change: -89.26% 39.26% Inconclusive result (%-change mean confidence interval includes 0). Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19350>	2023-01-16 22:20:43 +00:00
Alyssa Rosenzweig	0afd691f29	panfrost: clang-format the tree This switches us over to Mesa's code style [1], normalizing us within the tree. The results aren't perfect, but they bring us a hell of a lot closer to the rest of the tree. Panfrost doesn't feel so foreign relative to Mesa with this, which I think (in retrospect after a bunch of years of being "different") is the right call. I skipped PanVK because that's paused right now. find panfrost/ -type f -name '*.h' | grep -v vulkan | xargs clang-format -i; find panfrost/ -type f -name '*.c' | grep -v vulkan | xargs clang-format -i; clang-format -i gallium/drivers/panfrost/*.c gallium/drivers/panfrost/*.h ; find panfrost/ -type f -name '*.cpp' | grep -v vulkan | xargs clang-format -i [1] https://docs.mesa3d.org/codingstyle.html Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20425>	2022-12-24 02:22:57 +00:00
Alyssa Rosenzweig	a4705afe63	panfrost: Fix up some formatting for clang-format clang-format will make a mess of these otherwise. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20425>	2022-12-24 02:22:57 +00:00
Alyssa Rosenzweig	e35719be6f	panfrost: Add missing #includes Found shuffling headers with clang format. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20425>	2022-12-24 02:22:57 +00:00
Alyssa Rosenzweig	8dd35e0ac7	pan/mdg: Remove unused disassembler functions Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/20420>	2022-12-23 16:27:16 +00:00
Ian Romanick	eb76cee9f8	nir: Eliminate nir_op_i2b There are a lot of optimizations in opt_algebraic that match ('ine', a, 0), but there are almost none that match i2b. Instead of adding a huge pile of additional patterns (including variations that include both ine and i2b), always lower i2b to a != 0. At this point in the series, it should be impossible for anything to generate i2b, so there /should not/ be any changes. The failing test on d3d12 is a pre-existing bug that is triggered by this change. I talked to Jesse about it, and, after some analysis, he suggested just adding it to the list of known failures. v2: Don't rematerialize i2b instructions in dxil_nir_lower_x2b. v3: Don't rematerialize i2b instructions in zink_nir_algebraic.py. v4: Fix zink-on-TGL CI failures by calling nir_opt_algebraic after nir_lower_doubles makes progress. The latter can generate b2i instructions, but nir_lower_int64 can't handle them (anymore). v5: Add back most of the hunk at line 2125 of nir_opt_algebraic.py. I had accidentally removed the f2b(bf2(x)) optimization. v6: Just eliminate the i2b instruction. v7: Remove missed i2b32 in midgard_compile.c. Remove (now unused) emit_alu_i2orf2_b1 function from sfn_instr_alu.cpp. Previously this function was still used. 🤷 No shader-db changes on any Intel platform. All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 141165875 -> 141165873 (-0.0%) Instructions helped: 2 Cycles in all programs: 9098956382 -> 9098956350 (-0.0%) Cycles helped: 2 The two Vulkan shaders are helped because of the "new" (('b2i32', ('ine', ('ubfe', a, b, 1), 0)), ('ubfe', a, b, 1)) algebraic pattern. Acked-by: Jesse Natalie <jenatali@microsoft.com> [earlier version] Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Tested-by: Daniel Schürmann <daniel@schuermann.dev> [earlier version] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15121>	2022-12-14 06:23:21 +00:00
Alyssa Rosenzweig	976405907e	pan/mdg: Emulate 8-bit with the 16-bit pipe We don't care to support i8vec16, we just need a bit of 8-bit support to implement format packing/unpacking in blend shaders. We're already doing this by using the 16-bit pipe, we just need to commit to it all the way -- reporting the correct sizes in max_bitsize_for_alu so the mask packing logic works as intended -- and dropping the imov-specific hack that was introduced to workaround a similar class of bugs. With the previous patch, fixes: dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.1 Fixes: `39e4b7279d` ("pan/midg: Fix swizzling on 8-bit sources") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19763>	2022-12-01 00:52:53 +00:00
Alyssa Rosenzweig	261d48fc9b	pan/mdg: Refuse to schedule CSEL.vector to SMUL Even if we only mask a single component from the result of CSEL.vector, in our IR we treat its semantics as vector which causes trouble with when scheduled to a scalar unit. The problematic bundle looks like this: vmul.MOV.i32 R31, TMP0.xxxx, R0.yzww sadd.MAX.i32 TMP0.y, R0.y, #65408 smul.CSEL.vector.i32 R0.y, TMP0.y, #127 As the comment in midgard.h illuminates, these CSEL instructions are actually operating per-bit, lining up with the all-1's booleans in Midgard. The Bifrost analogue is MUX.i32.bit, not CSEL.i32. We should probably rename the Midgard instruction to make that clear. Anyhoo, on the scalar unit, CSEL/MUX operates on the bottom 32-bits of its source. That's ok for the usual r31.w case, because that's secretly replicating to its nonexistent register, I think? But that doesn't work with the CSEL.vector (MUX.vector) form, because the condition it's actually muxing on is r31.x, which here is R0.y, not the intended R0.x. Rather than adding more special cases to the already overcomplicated scheduler (for the dubious benefit of avoiding a small shaderdb regression), just avoid scheduling CSEL.vector to smul. With the next patch, fixes: dEQP-GLES31.functional.draw_buffers_indexed.random.max_required_draw_buffers.1 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19763>	2022-12-01 00:52:53 +00:00
Alyssa Rosenzweig	044428211c	pan/mdg: Fix out-of-order execution We can go up to 15 instructions out of order (performance fix) but we can't go past a branch (bug fix). Fixes: `30a393f458` ("pan/mdg: Enable out-of-order execution after texture ops") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19762>	2022-11-23 20:23:50 +00:00
Yonggang Luo	40a9fc57aa	tree-wide: Use __func__ instead of __FUNCTION__ in non-gallium code Signed-off-by: Yonggang Luo <luoyonggang@gmail.com> Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Acked-by: David Heidelberg <david.heidelberg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19861>	2022-11-22 06:53:46 +00:00
M Henning	f3ee9be836	glsl: Drop borrow/carry lowerings in favor of nir Unconditionally lowering prevents GL drivers from natively implementing these ops. Drivers that need lowering should set lower_uadd_carry and lower_usub_borrow on nir_shader_compiler_options to get the nir lowerings. Tested with dEQP-GLES31.functional.shaders.builtin_functions.integer.* Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19704>	2022-11-15 21:51:04 +00:00
Alyssa Rosenzweig	2316b80d77	panfrost: Don't use nir_variable to link varyings NIR deemphasizes nir_variable. We want to transition off it. Instead of walking the list of variables and playing games with the GLSL types to collect varying information, walk the list of instructions and use the I/O semantics to collect similar information. In addition to avoiding the reliance on nir_variable, this fixes handling of struct varyings under certain circumstances. Such programs are compiled by the GLES3.1 CTS but not used, so without this fix, the affected tests would regress when precompiling. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>	2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig	d0281fc16a	pan/mdg: Use bifrost_nir_lower_store_component Move the pass from the Bifrost compiler to the Midgard/Bifrost common code directory, and take advantage of it on Midgard, where it fixes the same tests as it fixed originally on Bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>	2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig	17589be72b	pan/mdg: Use .u32 for flat shading This is simple and matches what we do on Bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>	2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig	225a8f6e27	pan/mdg: Don't pair ST_VARY.a32 with other instrs For some reason, LD_ATTR/ST_VARY.a32 bundles raise INSTR_INVALID_ENC, at least on Mali-T860. Don't construct such pairs. This is a blunt hack but I don't know where this curveball requirement is coming from and this unblocks the rest of this series. total instructions in shared programs: 99879 -> 99788 (-0.09%) instructions in affected programs: 3179 -> 3088 (-2.86%) helped: 49 HURT: 9 helped stats (abs) min: 1.0 max: 6.0 x̄: 2.04 x̃: 2 helped stats (rel) min: 0.93% max: 10.53% x̄: 5.46% x̃: 4.88% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.61% max: 2.13% x̄: 1.41% x̃: 1.14% 95% mean confidence interval for instructions value: -1.93 -1.20 95% mean confidence interval for instructions %-change: -5.37% -3.41% Instructions are helped. total bundles in shared programs: 43778 -> 45102 (3.02%) bundles in affected programs: 10737 -> 12061 (12.33%) helped: 10 HURT: 369 helped stats (abs) min: 1.0 max: 3.0 x̄: 1.50 x̃: 1 helped stats (rel) min: 2.90% max: 18.75% x̄: 6.93% x̃: 5.21% HURT stats (abs) min: 1.0 max: 10.0 x̄: 3.63 x̃: 4 HURT stats (rel) min: 0.82% max: 44.44% x̄: 15.27% x̃: 13.33% 95% mean confidence interval for bundles value: 3.29 3.69 95% mean confidence interval for bundles %-change: 13.68% 15.69% Bundles are HURT. total quadwords in shared programs: 76783 -> 77914 (1.47%) quadwords in affected programs: 18633 -> 19764 (6.07%) helped: 9 HURT: 370 helped stats (abs) min: 1.0 max: 2.0 x̄: 1.22 x̃: 1 helped stats (rel) min: 0.87% max: 8.33% x̄: 3.71% x̃: 3.85% HURT stats (abs) min: 1.0 max: 7.0 x̄: 3.09 x̃: 3 HURT stats (rel) min: 0.82% max: 35.00% x̄: 7.82% x̃: 6.11% 95% mean confidence interval for quadwords value: 2.82 3.15 95% mean confidence interval for quadwords %-change: 7.02% 8.06% Quadwords are HURT. 
total registers in shared programs: 7266 -> 7076 (-2.61%) registers in affected programs: 1224 -> 1034 (-15.52%) helped: 171 HURT: 25 helped stats (abs) min: 1.0 max: 3.0 x̄: 1.27 x̃: 1 helped stats (rel) min: 8.33% max: 50.00% x̄: 21.85% x̃: 20.00% HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.12 x̃: 1 HURT stats (rel) min: 10.00% max: 100.00% x̄: 35.73% x̃: 33.33% 95% mean confidence interval for registers value: -1.10 -0.84 95% mean confidence interval for registers %-change: -17.69% -11.32% Registers are helped. total threads in shared programs: 4956 -> 5019 (1.27%) threads in affected programs: 99 -> 162 (63.64%) helped: 43 HURT: 6 helped stats (abs) min: 1.0 max: 2.0 x̄: 1.74 x̃: 2 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 2.0 max: 2.0 x̄: 2.00 x̃: 2 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: 0.91 1.66 95% mean confidence interval for threads %-change: 67.36% 95.90% Threads are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>	2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig	e04156b42a	pan/mdg: Disassemble the .a32 bit Corresponds to .auto32 on Bifrost. This is helpful for a conformant implementation of flat shading. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19363>	2022-11-02 16:52:11 +00:00
Alyssa Rosenzweig	2a6338722e	panfrost: Don't use nir_variable in the compilers More future proof, simpler, and works with early I/O lowering. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19456>	2022-11-02 04:22:06 +00:00
Alyssa Rosenzweig	78785f3b18	pan/mdg: Don't schedule across memory barrier Fixes KHR-GLES31.core.shader_image_load_store.basic-glsl-misc-cs Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19238>	2022-10-27 20:13:11 +00:00
Alyssa Rosenzweig	0955fe8fe2	panfrost: Use compute-based XFB on Midgard Now we're back to a single XFB implementation for all gens. Fixes: KHR-GLES31.core.draw_indirect.advanced-twoPasses-transformFeedback-arrays KHR-GLES31.core.draw_indirect.advanced-twoPasses-transformFeedback-elements Cc: mesa-stable Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19238>	2022-10-27 20:13:11 +00:00
Alyssa Rosenzweig	9e2ce225e6	pan/mdg: Fix 64-bit address arithmetic Cc: mesa-stable Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19238>	2022-10-27 20:13:11 +00:00
Alyssa Rosenzweig	21a4dbb720	panfrost: Don't use lower_wpos_pntc on Midgard gl_PointCoord is implemented via a special attribute descriptor on Midgard. This descriptor has an orientation bit, and the orientation is driver-controlled. That means we can map rast->sprite_coord_mode to this bit, rather than lowering in the shader. This is a bug fix for point sprites, which are implemented natively on Midgard for dubious reasons and need to be flipped this way. It is also an optimization for apps reading gl_PointCoord, removing the extra arithmetic to flip, although the value of this is somewhat dubious. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19237>	2022-10-26 01:56:08 +00:00
Alyssa Rosenzweig	829f769e60	pan/mdg: Fix 16-bit alignment with spiller The loop over sources has to happen for every instruction, regardless of whether we also need to register allocate the destination. The other source loops handle this properly, but this one was missed. Fixes spilling failure in shaders/android/angle/aztec_ruins/16.shader_test when the input NIR is shuffled a bit (from reordering passes). Fixes: `129d390bd8` ("pan/mdg: Fix bound setting in RA for sources") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19093>	2022-10-17 19:11:10 +00:00
Alyssa Rosenzweig	2c446b6636	pan/mdg: Limit work registers for large workgroups When more than 8 registers are used, Midgard can only fit 64 threads in a thread group. For barriers to work properly, a thread group must fit an entire workgroup. The GL driver configures the hardware to have thread groups the size of workgroups. That means if more than 64 threads are used in a workgroup, and more than 8 registers are used, the hardware will fault spawning threads. To work around this hardware limitation, we need to limit the number of work registers used depending on the size of the workgroup. Typically, the workgroup size is known at compile-time so that determination can usually be made without variants. To avoid variants, we make a pessimistic estimate in the case when it's not known at compile-time. shader-db shows 6 shaders affected. I expect that all of these would fault with DATA_INVALID_FAULT if they tried to execute before this patch, due to the oversize local size, and faulting is even slower than spilling ;-) Fixes dEQP-GLES31.functional.synchronization.* on Mali-T860. 
instructions HURT: shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 121 -> 157 (29.75%) instructions HURT: shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 121 -> 157 (29.75%) instructions HURT: shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 141 -> 184 (30.50%) instructions HURT: shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 141 -> 184 (30.50%) instructions HURT: shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 513 -> 933 (81.87%) instructions HURT: shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 505 -> 1002 (98.42%) bundles HURT: shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 73 -> 116 (58.90%) bundles HURT: shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 73 -> 116 (58.90%) bundles HURT: shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 61 -> 97 (59.02%) bundles HURT: shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 61 -> 97 (59.02%) bundles HURT: shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 281 -> 701 (149.47%) bundles HURT: shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 278 -> 775 (178.78%) registers helped: shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 11 -> 8 (-27.27%) registers helped: shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 11 -> 8 (-27.27%) registers helped: shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 14 -> 8 (-42.86%) registers helped: shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 14 -> 8 (-42.86%) registers helped: shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 16 -> 8 (-50.00%) registers helped: shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 16 -> 8 (-50.00%) threads helped: 
shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%) threads helped: shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%) threads helped: shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%) threads helped: shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%) threads helped: shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%) threads helped: shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 1 -> 2 (100.00%) spills HURT: shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 0 -> 5 spills HURT: shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 0 -> 5 spills HURT: shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 0 -> 8 spills HURT: shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 0 -> 8 spills HURT: shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 0 -> 112 spills HURT: shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 0 -> 146 fills HURT: shaders/android/gfxbench/carchase/6.shader_test MESA_SHADER_COMPUTE: 0 -> 26 fills HURT: shaders/android/gfxbench/carchase/386.shader_test MESA_SHADER_COMPUTE: 0 -> 26 fills HURT: shaders/android/gfxbench/carchase/374.shader_test MESA_SHADER_COMPUTE: 0 -> 33 fills HURT: shaders/android/gfxbench/carchase/4-1.shader_test MESA_SHADER_COMPUTE: 0 -> 33 fills HURT: shaders/android/com.miHoYo.GenshinImpact/18.shader_test MESA_SHADER_COMPUTE: 0 -> 209 fills HURT: shaders/android/com.miHoYo.GenshinImpact/16.shader_test MESA_SHADER_COMPUTE: 0 -> 234 total instructions in shared programs: 1521691 -> 1522766 (0.07%) instructions in affected programs: 1542 -> 2617 (69.71%) helped: 0 HURT: 6 HURT stats (abs) min: 36.0 max: 497.0 x̄: 179.17 x̃: 43 HURT stats (rel) min: 29.75% max: 98.42% x̄: 50.13% x̃: 30.50% 95% 
mean confidence interval for instructions value: -49.36 407.69 95% mean confidence interval for instructions %-change: 17.14% 83.12% Inconclusive result (value mean confidence interval includes 0). total bundles in shared programs: 649296 -> 650371 (0.17%) bundles in affected programs: 827 -> 1902 (129.99%) helped: 0 HURT: 6 HURT stats (abs) min: 36.0 max: 497.0 x̄: 179.17 x̃: 43 HURT stats (rel) min: 58.90% max: 178.78% x̄: 94.01% x̃: 59.02% 95% mean confidence interval for bundles value: -49.36 407.69 95% mean confidence interval for bundles %-change: 36.20% 151.83% Inconclusive result (value mean confidence interval includes 0). total registers in shared programs: 90681 -> 90647 (-0.04%) registers in affected programs: 82 -> 48 (-41.46%) helped: 6 HURT: 0 helped stats (abs) min: 3.0 max: 8.0 x̄: 5.67 x̃: 6 helped stats (rel) min: 27.27% max: 50.00% x̄: 40.04% x̃: 42.86% 95% mean confidence interval for registers value: -8.03 -3.30 95% mean confidence interval for registers %-change: -50.95% -29.13% Registers are helped. total threads in shared programs: 55717 -> 55723 (0.01%) threads in affected programs: 6 -> 12 (100.00%) helped: 6 HURT: 0 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for threads value: 1.00 1.00 95% mean confidence interval for threads %-change: 100.00% 100.00% Threads are helped. total spills in shared programs: 1108 -> 1392 (25.63%) spills in affected programs: 0 -> 284 helped: 0 HURT: 6 total fills in shared programs: 4721 -> 5282 (11.88%) fills in affected programs: 0 -> 561 helped: 0 HURT: 6 Cc: mesa-stable Closes: #7228 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/19092>	2022-10-17 18:56:13 +00:00
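
The register limit described in commit 2c446b6636 can be sketched roughly as follows. This is only an illustration of the rule stated in the commit message, not the code from the merge request: the function name, the constant names, and the `size_known` parameter are hypothetical, and the full budget of 16 work registers is an assumption inferred from the "16 -> 8" entries in the shader-db output above.

```c
#include <stdbool.h>

/* Hypothetical names. The 8-register and 64-thread thresholds come from the
 * commit message; the 16-register full budget is an assumption. */
#define FULL_WORK_REGISTERS      16
#define LIMITED_WORK_REGISTERS    8
#define MAX_THREADS_AT_FULL_REGS 64

/* Return the number of work registers the compiler may use for a compute
 * shader. If the workgroup size is not known at compile time, take the
 * pessimistic limit so that no shader variants are needed. */
static unsigned
max_work_registers(bool size_known, unsigned workgroup_threads)
{
   if (!size_known)
      return LIMITED_WORK_REGISTERS;

   if (workgroup_threads > MAX_THREADS_AT_FULL_REGS)
      return LIMITED_WORK_REGISTERS;

   return FULL_WORK_REGISTERS;
}
```

Under these assumptions, a 64-thread workgroup keeps the full budget, while a 65-thread workgroup or one of unknown size is capped at 8 registers, trading spills for the ability to launch at all.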
Alyssa Rosenzweig	847361ba07	panfrost: Remove load_kernel_input path Now the state tracker is responsible for lowering this away for us (and the state tracker can do it correctly, whereas our implementation is incorrect under a strict reading of the Gallium contract). Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/18658>	2022-10-05 16:09:21 +00:00

1058 Commits