KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Alyssa Rosenzweig	ed5a5a9d6d	panfrost: Wire up transform feedback sysvals Wire the Gallium interface for transform feedback up to the system values that will be fed into our lowering code. This is based on our existing transform feedback implementation for Midgard. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>	2022-06-04 14:35:56 +00:00
Alyssa Rosenzweig	4e341e70d8	pan/bi: Handle transform feedback intrinsics Translate the intrinsics we introduced to lower away transform feedback into Panfrost system values which the GL driver can handle. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>	2022-06-04 14:35:56 +00:00
Alyssa Rosenzweig	ae3fa6cc1d	pan/bi: Add transform feedback lowering pass Add a simple NIR-based implementation of transform feedback, appropriate for OpenGL ES 3.1 class hardware (compute but no geometry or tessellation shaders). Stores to varyings that will be captured are replaced by stores to transform feedback buffers and some addressing math. This allows implementing the semantic of transform feedback in a compute-like stage. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15720>	2022-06-04 14:35:56 +00:00
Alyssa Rosenzweig	ed4bd8738d	panfrost/ci: Mark draw_buffers_indexed.* as flakes These keep flaking. Icecream95 observes the issue relates to AFBC in the discussion of the flake in issue 6604. Until the root cause can be identified and fixed, mark the tests as known flakes for CI. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16855>	2022-06-03 21:05:22 +00:00
Alyssa Rosenzweig	7535362204	pan/bi: Fix clper_xor on Mali-G31 Mali-G31 has the old CLPER instruction, not the new one, which means we don't get to specify a custom lane op. But the clper_xor helper incorrectly checked the arch, not the implementation quirk. Fixes: `c00e7b729f` ("pan/bi: Optimize abs(derivative)") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reported-by: Icecream95 <ixn@disroot.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16846>	2022-06-02 20:32:43 -04:00
Alyssa Rosenzweig	ad5c84999b	pan/bi: Rework Valhall register alignment Because we lower SPLIT and COLLECT before RA, we need to consider offsets when determining the dimensions of vectors, in order to align properly. Lowering COLLECT post-RA would avoid this special case. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Alyssa Rosenzweig	0770e7a90c	pan/bi: Align 64-bit register sources Similar idea to aligning staging register sources. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Alyssa Rosenzweig	8553dd97ad	pan/bi: Allow vec6 for collects Hit for some Valhall texturing instructions. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Icecream95	1bfff407b9	pan/bi: Use nodearrays for linear constraints Speeds up compiling shaders/skia/781.shader_test in shader-db by 8x (Icecream95). ...At least it did before I extended to support register allocation of vec8. On Valhall, texture instructions require up to 8 consecutive registers. To handle this, provide for vec8 register allocation. Liveness was already (accidentally?) vec8. The increased memory requirement is acceptable given that the interference matrix is now stored sparsely (Alyssa). Icecream95 reports the vec8 changes hurt RA performance by about 1% on average. I consider this acceptable for now. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Icecream95	c70daa74f0	pan/bi: Add nodearray datastructure This is an array which can either be sparse or dense, and was designed to be used to track liveness and interference information. Either a sparse array with sorted indices or dense array is used. Other data structures were tried, such as red-black trees or hash tables, but they were slower. When used for storing constraints, the indices do not have to be sorted as duplicating elements is okay, but the speedup from that was not enough to justify the extra complexity. v2: Add a comment about how to potentially speed it up. But it seems fast enough even without this change. v3: Use a custom struct rather than relying on util_dynarray. v4: Split out functions only used for liveness analysis, rather than the simpler data structure needed for the register interference matrix. If we need to optimize liveness, that can follow on after. Also make it for vec8 (Alyssa). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Icecream95	c24b78cceb	pan/bi: Reverse linear constraint bits This will make it simpler to implement parallel RA where multiple possible registers for a node are tested at once. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16780>	2022-06-02 17:13:16 +00:00
Alyssa Rosenzweig	bc4d42023d	pan/bi: Respect swizzles in nir_op_pack_64_2x32_split Triggered a BIR validation error, which made debugging a breeze. That validation pass (dimensionality checks) gets a lot of use, it seems :-) Fixes: dEQP-VK.ssbo.layout.2_level_array.std430.row_major_mat4x2_comp_access_store_cols Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16724>	2022-06-01 20:08:42 +00:00
Alyssa Rosenzweig	7831508740	panvk: Use vk_image_subresource__count for clears This handles VK_REMAINING_ for us, instead of underflowing and clearing no levels/layers. Fixes dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.* Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16724>	2022-06-01 20:08:42 +00:00
Alyssa Rosenzweig	82d3eb7f18	panfrost: Handle texturing from AFBC on Valhall We need to pack special AFBC-specific plane descriptors instead of the generic plane descriptor. Nothing too fancy here, though. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16800>	2022-06-01 19:44:31 +00:00
Alyssa Rosenzweig	9afa8cc555	panfrost: Support rendering to AFBC on Valhall Add the required handling when packing render target and depth buffer descriptors on Valhall. This is mostly equivalent to Bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16800>	2022-06-01 19:44:31 +00:00
Alyssa Rosenzweig	c2207d27c2	panfrost: Add pan_afbc_compression_mode on Valhall Map a canonical format (a hardware-independent pipe_format) to a compression mode (Valhall-specific hardware enum defined in GenXML). To be used for packing plane descriptors and render target descriptors when AFBC is in use on Valhall. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16800>	2022-06-01 19:44:31 +00:00
Alyssa Rosenzweig	87dcdbdad6	panfrost: Pass arch instead of dev into afbc_format For callers that have a device object, it's easy to pass dev->arch instead of dev. But this requires callers to have a reference to the device, which is tricky for callers that only have the arch via PAN_ARCH. Pass dev->arch instead of dev to accommodate them. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16800>	2022-06-01 19:44:31 +00:00
Alyssa Rosenzweig	2cc2f217d4	panfrost: Fix XML for AFBC header on v9 Misnamed field due to copy/paste fail from Bifrost. Fixes: `c011ea6c26` ("panfrost: Shuffle render target AFBC for Valhall") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16800>	2022-06-01 19:44:31 +00:00
Alyssa Rosenzweig	e596a0423b	pan/mdg: Print outmods when printing IR In particular, this lets us distinguish mul_high from regular mul. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	a099834b97	pan/mdg: Distinguish SSA vs reg when printing IR This makes it easy to match the printed IR with the indices in the NIR. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	520204ae18	pan/mdg: Only print 1 source for moves This makes the printed IR easier to read at a glance. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	0ee24c46e0	pan/mdg: Only print 2 sources for ALU ..and assert the other sources are null. The one place this might fail in the future is for real FMA, but we don't support that for GL. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	9c9db27e3c	pan/mdg: Only print masked components of swizzle This matches the IR printer with the disassembler, making the output of the IR printer much easier to parse at a glance. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	c9093554d0	pan/mdg: Use "<<" instead of "lsl" Easier to read and consistent with C code. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	8c11f4809b	pan/mdg: Remove uppercase write masks These do not convey any additional information, and fail to account for shrinking. In particular, a 64-bit writemask with .keephi would fail to disassemble and instead trip the assertion, since that would be the ZW components. Just delete the broken code. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:10 -04:00
Alyssa Rosenzweig	9e4b457958	pan/mdg: Scalarize with 64-bit sources Otherwise, we can get vec3 with u2u32 with 64-bit sources which we need lowered. Since our current approach is "scalarize all 64-bit ops", we need to check for conversions too. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16798>	2022-06-01 14:24:05 -04:00
Alyssa Rosenzweig	5067a26f44	pan/bi: Use flow control lowering on Valhall Logically at the same part of the compile pipeline as clause scheduling on Bifrost. Lots of similarities, too. Now that we generate flow control only as a late pass, various hacks in the compiler are no longer necessary and are dropped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	a394c32cd2	pan/va: Unit test flow control merging Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	4b06e7f5b6	pan/va: Unit test flow control insertion Test that we correctly track the scoreboard, helper invocations, reconvergence, and ends and insert NOPs to effect this expected flow control. As the pass inserts NOPs but does not otherwise modify the shader, this is easy to test with well-defined behaviour of the pass. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	0fa9204049	pan/va: Respect assigned slots Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	492f4055dd	pan/va: Assign slots roundrobin This should reduce false dependencies with asynchronous instructions. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	aa7393f81a	pan/va: Add flow control merging pass Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	03d8439c0a	pan/va: Terminate helper threads On Bifrost, to terminate helper threads we set the td bit on the clause. On Valhall, we need to use the .discard flow control. Extend the flow control NOP insertion to insert NOP.discard where necessary to terminate helper threads. This should reduce wasted work in fragment shaders. This requires fairly involved data flow analysis, but the handling here should be optimal. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	41b39d6d5d	pan/va: Do scoreboard analysis Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	7e3b9cf754	pan/va: Add pass to insert flow control To set flow control modifiers correctly and efficiently, we need a pass that runs after register allocation and scheduling, but before packing. Add such a pass. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	82b1897900	pan/bi: Print flow control on instructions This helps debug the flow control lowering passes on Valhall. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	c0180f6bd3	pan/bi: Export helper termination analysis The current helper termination analysis code is hardwired for clauses, so it won't work for Valhall. However, the bulk of it is dataflow analysis which is portable between Bifrost and Valhall. Export the interesting bits so we can reuse them on Valhall. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	7bb635316b	pan/bi: Export bi_block_add_successor For use in unit tests that need to create blocks. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	d7c6b7c9d2	pan/bi: Extract bit_block helper Convenience for unit tests which need to create multiple blocks, to test global passes. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	b0edd92156	pan/bi: Add a trivial ctx->inputs for unit tests So we can unit test the flow control insertion which needs to gate some behaviour on not being in a blend shader. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	218148d38a	pan/bi: Add ASSERT_SHADER_EQUAL macro Useful for whole-program unit tests. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	4627cd99de	pan/bi: Preserve flow control for non-psiz variant Otherwise we will get INSTR_INVALID_ENC faults when deleting the final STORE.end instruction, after we rework our flow control code. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	c846e0812b	pan/bi: Add slot to bi_instr For better handling of message-passing instructions. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Alyssa Rosenzweig	616df0e97d	pan/bi: Extend bi_scoreboard_state for finer tracking We need to insert dependencies for varyings and memory access. Currently, the Bifrost scoreboarding pass just treats these as barriers, but this is too heavy handed. Extend the scoreboard data structure so we can do better. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16804>	2022-06-01 16:14:38 +00:00
Daniel Schürmann	bd151a256e	nir/opt_vectorize: add callback for max vectorization width The callback allows to request different vectorization factors per instruction depending on e.g. bitsize or opcode. This patch also removes using the vectorize_vec2_16bit option from nir_opt_vectorize(). Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13080>	2022-06-01 11:41:44 +00:00
Emma Anholt	7ae206d76e	panfrost: always print the bad ALU op if we're failing to translate. CI failure could have told me what needed fixing, but no... Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16437>	2022-06-01 10:56:35 +00:00
Emma Anholt	7472bb4bad	glsl,nir: Move i/umulExtended lowering to NIR. NIR already has the necessary lowering, and the GLSL lowering violates GLSL IR validation rules. Once quadop lowering was turned off, the IR validation at the end of the compile path on DEBUG builds caught the problem. In order to move the lowering to NIR, though, we need to make sure that drivers supporting these functions actually have the lowering flag set. xfails added for t860, where apparently this tickles a variety of existing 64-bit bugs in the backend. Fixes: #6461 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Mykhailo Skorokhodov <mykhailo.skorokhodov@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16437>	2022-06-01 10:56:35 +00:00
Juan A. Suarez Romero	836ce97f5e	ci: bump VK-GL-CTS to 1.3.2.0 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Acked-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-by: Alejandro Piñeiro <apinheiro@igalia.com> Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Acked-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16689>	2022-05-31 15:02:08 +00:00
Alyssa Rosenzweig	0170500627	pan/bi: Interpolate varyings at 16-bit On Bifrost, we have a single "load float varying" instruction that controls the bit size of the result, allowing us to fold a f2f16 into the load. However, the larger benefit is that 16-bit varying loads are interpolated at 16-bit. Arm claims that the varying unit has a 32-bit data path, allowing 16-bit varyings to be interpolated in half the cycles from 32-bit. This change should therefore improve performance for workloads that are varying units. This means we want to be aggressive about 16-bit varying loads, even if it costs some extra f2f32 instructions. glmark2 total score on Mali-G52 up from 1173fps to 1218fps with particular wins in -brefract, -bshadow, -bjellyfish, and -bshading. total instructions in shared programs: 2432246 -> 2423668 (-0.35%) instructions in affected programs: 516056 -> 507478 (-1.66%) helped: 3641 HURT: 432 helped stats (abs) min: 1.0 max: 12.0 x̄: 2.91 x̃: 2 helped stats (rel) min: 0.08% max: 54.55% x̄: 9.88% x̃: 5.71% HURT stats (abs) min: 1.0 max: 42.0 x̄: 4.71 x̃: 4 HURT stats (rel) min: 0.23% max: 200.00% x̄: 12.58% x̃: 6.37% 95% mean confidence interval for instructions value: -2.21 -2.00 95% mean confidence interval for instructions %-change: -7.92% -7.07% Instructions are helped. total tuples in shared programs: 1941309 -> 1934647 (-0.34%) tuples in affected programs: 353169 -> 346507 (-1.89%) helped: 3233 HURT: 453 helped stats (abs) min: 1.0 max: 14.0 x̄: 2.46 x̃: 2 helped stats (rel) min: 0.12% max: 50.00% x̄: 9.90% x̃: 5.56% HURT stats (abs) min: 1.0 max: 25.0 x̄: 2.85 x̃: 2 HURT stats (rel) min: 0.22% max: 150.00% x̄: 8.96% x̃: 5.26% 95% mean confidence interval for tuples value: -1.89 -1.72 95% mean confidence interval for tuples %-change: -8.01% -7.15% Tuples are helped. 
total clauses in shared programs: 357354 -> 356610 (-0.21%) clauses in affected programs: 25794 -> 25050 (-2.88%) helped: 994 HURT: 317 helped stats (abs) min: 1.0 max: 3.0 x̄: 1.16 x̃: 1 helped stats (rel) min: 1.49% max: 33.33% x̄: 10.78% x̃: 10.00% HURT stats (abs) min: 1.0 max: 4.0 x̄: 1.31 x̃: 1 HURT stats (rel) min: 1.19% max: 50.00% x̄: 13.56% x̃: 8.33% 95% mean confidence interval for clauses value: -0.63 -0.50 95% mean confidence interval for clauses %-change: -5.63% -4.16% Clauses are helped. total cycles in shared programs: 167697.96 -> 167431.15 (-0.16%) cycles in affected programs: 12638.29 -> 12371.48 (-2.11%) helped: 2652 HURT: 350 helped stats (abs) min: 0.04166399999999726 max: 0.75 x̄: 0.11 x̃: 0 helped stats (rel) min: 0.12% max: 100.00% x̄: 14.39% x̃: 5.04% HURT stats (abs) min: 0.041665999999999315 max: 0.5833329999999997 x̄: 0.11 x̃: 0 HURT stats (rel) min: 0.00% max: 75.00% x̄: 7.90% x̃: 4.71% 95% mean confidence interval for cycles value: -0.09 -0.08 95% mean confidence interval for cycles %-change: -12.56% -11.02% Cycles are helped. total arith in shared programs: 74169.46 -> 73891.71 (-0.37%) arith in affected programs: 13885.87 -> 13608.12 (-2.00%) helped: 3215 HURT: 445 helped stats (abs) min: 0.04166399999999726 max: 0.5416680000000014 x̄: 0.10 x̃: 0 helped stats (rel) min: 0.12% max: 100.00% x̄: 14.16% x̃: 6.67% HURT stats (abs) min: 0.041665999999999315 max: 1.125 x̄: 0.12 x̃: 0 HURT stats (rel) min: 0.00% max: 100.00% x̄: 9.76% x̃: 5.49% 95% mean confidence interval for arith value: -0.08 -0.07 95% mean confidence interval for arith %-change: -11.91% -10.59% Arith are helped. 
total texture in shared programs: 11936 -> 11931 (-0.04%) texture in affected programs: 20 -> 15 (-25.00%) helped: 10 HURT: 0 helped stats (abs) min: 0.5 max: 0.5 x̄: 0.50 x̃: 0 helped stats (rel) min: 14.29% max: 100.00% x̄: 45.71% x̃: 33.33% 95% mean confidence interval for texture value: -0.50 -0.50 95% mean confidence interval for texture %-change: -73.16% -18.26% Texture are helped. total vary in shared programs: 4180.88 -> 3447.19 (-17.55%) vary in affected programs: 2109.88 -> 1376.19 (-34.77%) helped: 2202 HURT: 39 helped stats (abs) min: 0.0625 max: 1.4375 x̄: 0.34 x̃: 0 helped stats (rel) min: 2.38% max: 66.67% x̄: 40.43% x̃: 50.00% HURT stats (abs) min: 0.125 max: 0.375 x̄: 0.26 x̃: 0 HURT stats (rel) min: 0.00% max: 300.00% x̄: 92.54% x̃: 23.08% 95% mean confidence interval for vary value: -0.34 -0.32 95% mean confidence interval for vary %-change: -39.22% -37.01% Vary are helped. total quadwords in shared programs: 1689664 -> 1684852 (-0.28%) quadwords in affected programs: 265522 -> 260710 (-1.81%) helped: 2864 HURT: 447 helped stats (abs) min: 1.0 max: 14.0 x̄: 2.10 x̃: 2 helped stats (rel) min: 0.15% max: 31.58% x̄: 6.05% x̃: 4.65% HURT stats (abs) min: 1.0 max: 22.0 x̄: 2.67 x̃: 2 HURT stats (rel) min: 0.27% max: 38.46% x̄: 6.79% x̃: 4.55% 95% mean confidence interval for quadwords value: -1.54 -1.37 95% mean confidence interval for quadwords %-change: -4.55% -4.08% Quadwords are helped. total threads in shared programs: 53656 -> 53688 (0.06%) threads in affected programs: 32 -> 64 (100.00%) helped: 32 HURT: 0 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% 95% mean confidence interval for threads value: 1.00 1.00 95% mean confidence interval for threads %-change: 100.00% 100.00% Threads are helped. 
total preloads in shared programs: 116212 -> 103476 (-10.96%) preloads in affected programs: 45222 -> 32486 (-28.16%) helped: 3022 HURT: 11 helped stats (abs) min: 1.0 max: 11.0 x̄: 4.23 x̃: 4 helped stats (rel) min: 7.14% max: 68.75% x̄: 30.39% x̃: 25.00% HURT stats (abs) min: 2.0 max: 4.0 x̄: 3.45 x̃: 4 HURT stats (rel) min: 14.29% max: 50.00% x̄: 25.93% x̃: 25.00% 95% mean confidence interval for preloads value: -4.26 -4.14 95% mean confidence interval for preloads %-change: -30.68% -29.69% Preloads are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Tested-by: Chris Healy cphealy@gmail.com Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>	2022-05-30 17:49:44 -04:00
Alyssa Rosenzweig	93f69e4b1c	pan/bi: Model Valhall source formats LD_VAR_BUF instructions on Valhall take a source format, indicating the in-memory format of the varying independent from the register format, which we still model within the compiler for compatibility with Bifrost. (Prior to Valhall, source format is specified in the attribute descriptor as a physical pixel format.) Model this information, allowing us to generate fp16 LD_VAR_BUF instructions correctly on Valhall. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>	2022-05-30 17:49:44 -04:00
Alyssa Rosenzweig	06886c3861	pan/bi: Make LD_VAR w=format instead of w=vecsize Fixes a vector dimension validation failure in dEQP-GLES3.functional.shaders.indexing.varying_array.vec4_static_write_dynamic_read after we enable fp16 varyings. No shader-db changes, as we don't yet support fp16 varyings. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>	2022-05-30 17:49:44 -04:00
Alyssa Rosenzweig	a9b13a1867	pan/va: Fill in missing src_flat16 enum Valhall gains(?) the ability to flatshade 16-bit varyings, this is indicated by a particular source format. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>	2022-05-30 17:49:44 -04:00
Alyssa Rosenzweig	e898e2466b	pan/bi: Add VAR_TEX fusing unit test As fusing VAR_TEX is an optimization, it's helpful to have unit tests since functional tests won't check that the optimization triggers when expected. Originally written when I was touching the VAR_TEX code. Those changes have since been dropped, but the unit test remains useful. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16752>	2022-05-30 17:48:59 -04:00
Alyssa Rosenzweig	42a4a123a6	pan/bi: Don't allow spilling coverage mask writes The register precolouring logic assumes that coverage masks are always in R60, so spilling them causes incorrect results. We could do better. Fixes on Valhall: dEQP-GLES3.functional.ubo.random.all_per_block_buffers.28 Fixes: `3df5446cbd` ("pan/bi: Simplify register precolouring in the IR") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16748>	2022-05-30 14:00:55 +00:00
Alyssa Rosenzweig	67f5721349	panfrost: Set allow_rotating_primitives On Valhall, the driver should set this flag if the hardware may rotate primitives. This happens if: 1. The rasterization of lines does not matter, AND 2. The provoking vertex does not matter. The first condition we may satisfy by checking for LINES and the second by checking for flat shading. Otherwise, we should set this flag to allow optimizations. This may be more efficient for tiling. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16748>	2022-05-30 14:00:55 +00:00
Jason Ekstrand	0eee071038	panvk: Use the vk_buffer base struct Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16607>	2022-05-27 18:39:00 -05:00
Alyssa Rosenzweig	01ba3460a9	pan/bi: Test CMP result_type optimization Add unit tests ensuring the optimization applies in all the cases we care about, as functional integration tests (CTS and Piglit) won't test this. Also add unit tests for a few cases where we specifically cannot fuse, in case these cases are missed by the tests. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16725>	2022-05-27 12:14:22 +00:00
Alyssa Rosenzweig	501a66cb5c	pan/bi: Fuse result types In NIR, comparison instructions always produce 0/~0 results. For other result types, a separate b2f32 or b2i32 instruction is used to transform the result. However, Mali's comparison instructions have modifiers for these alternate result types, so we can implement expressions like int(a < b) and float(a == b) in single instruction. Add a peephole optimization to fuse comparisons with result type transformations. Results on Mali-G52: total instructions in shared programs: 2439696 -> 2434339 (-0.22%) instructions in affected programs: 418703 -> 413346 (-1.28%) helped: 1630 HURT: 0 helped stats (abs) min: 1.0 max: 28.0 x̄: 3.29 x̃: 2 helped stats (rel) min: 0.11% max: 19.35% x̄: 1.64% x̃: 1.39% 95% mean confidence interval for instructions value: -3.44 -3.13 95% mean confidence interval for instructions %-change: -1.72% -1.56% Instructions are helped. total tuples in shared programs: 1946581 -> 1943005 (-0.18%) tuples in affected programs: 251742 -> 248166 (-1.42%) helped: 1113 HURT: 11 helped stats (abs) min: 1.0 max: 32.0 x̄: 3.23 x̃: 2 helped stats (rel) min: 0.17% max: 15.38% x̄: 1.80% x̃: 1.38% HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.45 x̃: 1 HURT stats (rel) min: 0.21% max: 3.12% x̄: 1.23% x̃: 0.89% 95% mean confidence interval for tuples value: -3.35 -3.01 95% mean confidence interval for tuples %-change: -1.88% -1.66% Tuples are helped. total clauses in shared programs: 357791 -> 357349 (-0.12%) clauses in affected programs: 15879 -> 15437 (-2.78%) helped: 371 HURT: 3 helped stats (abs) min: 1.0 max: 8.0 x̄: 1.20 x̃: 1 helped stats (rel) min: 0.80% max: 33.33% x̄: 3.85% x̃: 2.17% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 2.94% max: 5.26% x̄: 4.49% x̃: 5.26% 95% mean confidence interval for clauses value: -1.27 -1.09 95% mean confidence interval for clauses %-change: -4.21% -3.36% Clauses are helped. 
total cycles in shared programs: 167922.04 -> 167810.71 (-0.07%) cycles in affected programs: 6772.08 -> 6660.75 (-1.64%) helped: 655 HURT: 12 helped stats (abs) min: 0.041665999999999315 max: 1.3333319999999986 x̄: 0.17 x̃: 0 helped stats (rel) min: 0.18% max: 20.00% x̄: 2.02% x̃: 1.60% HURT stats (abs) min: 0.041665999999999315 max: 0.125 x̄: 0.05 x̃: 0 HURT stats (rel) min: 0.21% max: 3.80% x̄: 1.23% x̃: 0.88% 95% mean confidence interval for cycles value: -0.18 -0.16 95% mean confidence interval for cycles %-change: -2.10% -1.81% Cycles are helped. total arith in shared programs: 74393.17 -> 74243.08 (-0.20%) arith in affected programs: 10157.50 -> 10007.42 (-1.48%) helped: 1129 HURT: 12 helped stats (abs) min: 0.041665999999999315 max: 1.3333319999999986 x̄: 0.13 x̃: 0 helped stats (rel) min: 0.18% max: 50.00% x̄: 1.94% x̃: 1.40% HURT stats (abs) min: 0.041665999999999315 max: 0.125 x̄: 0.05 x̃: 0 HURT stats (rel) min: 0.21% max: 3.80% x̄: 1.23% x̃: 0.88% 95% mean confidence interval for arith value: -0.14 -0.12 95% mean confidence interval for arith %-change: -2.06% -1.76% Arith are helped. total quadwords in shared programs: 1692019 -> 1688164 (-0.23%) quadwords in affected programs: 216669 -> 212814 (-1.78%) helped: 1148 HURT: 11 helped stats (abs) min: 1.0 max: 41.0 x̄: 3.37 x̃: 2 helped stats (rel) min: 0.17% max: 17.24% x̄: 2.25% x̃: 1.73% HURT stats (abs) min: 1.0 max: 2.0 x̄: 1.09 x̃: 1 HURT stats (rel) min: 0.60% max: 1.32% x̄: 0.85% x̃: 0.83% 95% mean confidence interval for quadwords value: -3.49 -3.16 95% mean confidence interval for quadwords %-change: -2.33% -2.10% Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16725>	2022-05-27 12:14:22 +00:00
David Heidelberg	c9f0a511e0	ci/panfrost: add RoR and Nheko traces Signed-off-by: David Heidelberg <david.heidelberg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16633>	2022-05-27 06:51:38 +00:00
Alyssa Rosenzweig	0255f554f3	panfrost: Advertise 16x16 tiled AFBC Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	3fbfd356af	panfrost: Add helper checking tiled AFBC support Tiled AFBC support was introduced with v7. Add a helper encoding this fact. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	5fa274fee4	panfrost: Handle AFBC Tiled Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	b63dad3ce5	panfrost: Put comment in correct #ifdef Minor fix to make the code less confusing. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	bd529b7983	panfrost: Fix AFBC flags on v6 Tiled headers and bounds checking were introduced with v7. The flags don't exist on v6. Fix the XML accordingly so we don't accidentally use features too new for the hardware. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	166d879ff0	panfrost: Add 1x1 layout unit tests These check the alignments are correct. Of course, ideally these cases aren't hit in practice, since it's a waste of memory. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	65ba39f84c	panfrost: Add a tiled 16x16 layout unit test To exercise the layout code introduced in this series. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	d11945cd85	panfrost: Calculate header_size based on row_stride The header size is the header stride times the number of rows in the header (number of tiles of superblocks). We already calculate the header stride, so eliminate the separate header size calculation. Delete the old header size calculation. It has no notion of wide blocks, let alone tiled AFBC headers. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	0cf6091bd0	panfrost: Add 3D texture layout unit test 3D AFBC is pretty subtle, let's make sure we have adequate unit test coverage. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	5944bbfa94	panfrost: Add AFBC stride unit tests Demonstrating correctness of the low level calculations. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	544a8894fc	panfrost: Align layouts to tiles of superblocks Required to satisfy the alignment constraints on tiled AFBC. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	9c9b7f7a42	panfrost: Support tiled AFBC in stride helpers Part 1 of tiled AFBC. This requires modifier information. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	5c86f53112	panfrost: Add pan_afbc_tile_size helper To unify calculations with linear and tiled AFBC formats. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	b7c18160d3	panfrost: Fix is_wide return type By inspection. Fixes: `e4ee2c213a` ("panfrost: Extract panfrost_afbc_is_wide helper") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	6b0ff7da48	panfrost: Extract pan_afbc_row_stride helper Extract a helper for calculating AFBC strides. This is used in two places in pan_layout. It will need extension for tiled AFBC, and the extended version could benefit from unit testing. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	d8a4c9b505	panfrost: Extract afbc_stride_blocks helper Let's keep all the AFBC computations inside the layout code, to keep pan_cs dumb. This helper will need some extension for tiled AFBC. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16697>	2022-05-26 15:56:32 +00:00
Alyssa Rosenzweig	96d9093c19	pan/bi: Allow CSEing LEA_BUF_IMM Cleans up the code gen a lot in varying shaders. Instruction count regression due to how we handle 64-bit on Valhall. (TODO: A better solution for that...) total instructions in shared programs: 2730186 -> 2736193 (0.22%) instructions in affected programs: 775825 -> 781832 (0.77%) helped: 2010 HURT: 4433 helped stats (abs) min: 1.0 max: 18.0 x̄: 2.16 x̃: 2 helped stats (rel) min: 0.16% max: 26.67% x̄: 3.75% x̃: 2.22% HURT stats (abs) min: 1.0 max: 10.0 x̄: 2.33 x̃: 2 HURT stats (rel) min: 0.20% max: 23.08% x̄: 4.79% x̃: 2.79% 95% mean confidence interval for instructions value: 0.87 1.00 95% mean confidence interval for instructions %-change: 1.98% 2.27% Instructions are HURT. total cycles in shared programs: 161178.77 -> 144303.77 (-10.47%) cycles in affected programs: 85720 -> 68845 (-19.69%) helped: 6910 HURT: 0 helped stats (abs) min: 1.0 max: 18.0 x̄: 2.44 x̃: 2 helped stats (rel) min: 1.05% max: 41.18% x̄: 19.72% x̃: 20.00% 95% mean confidence interval for cycles value: -2.48 -2.41 95% mean confidence interval for cycles %-change: -19.86% -19.58% Cycles are helped. total cvt in shared programs: 13655.45 -> 14013 (2.62%) cvt in affected programs: 2978.06 -> 3335.61 (12.01%) helped: 381 HURT: 5242 helped stats (abs) min: 0.015625 max: 0.0625 x̄: 0.02 x̃: 0 helped stats (rel) min: 0.37% max: 50.00% x̄: 7.61% x̃: 3.85% HURT stats (abs) min: 0.015625 max: 0.296875 x̄: 0.07 x̃: 0 HURT stats (rel) min: 0.00% max: 400.00% x̄: 28.51% x̃: 16.00% 95% mean confidence interval for cvt value: 0.06 0.06 95% mean confidence interval for cvt %-change: 25.13% 27.00% Cvt are HURT. 
total ls in shared programs: 147856 -> 130980 (-11.41%) ls in affected programs: 85725 -> 68849 (-19.69%) helped: 6911 HURT: 0 helped stats (abs) min: 1.0 max: 18.0 x̄: 2.44 x̃: 2 helped stats (rel) min: 1.05% max: 41.18% x̄: 19.72% x̃: 20.00% 95% mean confidence interval for ls value: -2.48 -2.41 95% mean confidence interval for ls %-change: -19.86% -19.58% Ls are helped. total quadwords in shared programs: 1483576 -> 1486872 (0.22%) quadwords in affected programs: 73816 -> 77112 (4.47%) helped: 286 HURT: 698 helped stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8 helped stats (rel) min: 2.38% max: 50.00% x̄: 16.83% x̃: 16.67% HURT stats (abs) min: 8.0 max: 8.0 x̄: 8.00 x̃: 8 HURT stats (rel) min: 2.78% max: 100.00% x̄: 37.38% x̃: 16.67% 95% mean confidence interval for quadwords value: 2.89 3.80 95% mean confidence interval for quadwords %-change: 19.02% 24.22% Quadwords are HURT. total threads in shared programs: 53186 -> 53189 (<.01%) threads in affected programs: 3 -> 6 (100.00%) helped: 3 HURT: 0 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% total fills in shared programs: 2172 -> 2163 (-0.41%) fills in affected programs: 11 -> 2 (-81.82%) helped: 1 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16710>	2022-05-25 15:51:15 +00:00
Alyssa Rosenzweig	569e5dc745	pan/bi: Schedule for pressure pre-RA Add a bottom-up pre-RA list scheduler that aims to reduce register pressure, roughly the same as we use on Midgard to great effect. It uses a simple heuristic: greedily select instructions that reduce liveness. To avoid regressions, the algorithm throws away schedules that increase the maximum number of lives (used as an estimate of register pressure -- if we had SSA form, this would be exact). We might be better off using Sarkar. But for something I could type out in an afternoon, I'll happily accept a >50% reduction in spills. Instruction count is regressed due to extra moves around the blend shader ABI in some cases; at least on Bifrost this is mostly hidden by the clause scheduler. Thread count and spills/fills are both much improved here. There are numerous opportunities for future improvements to pre-RA scheduling: * Better heuristics? (Something more global than liveness alone) * Reducing false dependencies with memory access * Improve ILP for message-passing instructions? This is a tradeoff. * Simplify the code if we have SSA in the future. But for now, I think this is well worth it already. v2: Various clean-ups and memory leak fix (Icecream95). Reduce false dependencies to eliminate spilling in more shaders. shader-db stats on Mali-G52: total instructions in shared programs: 2438841 -> 2439698 (0.04%) instructions in affected programs: 1206421 -> 1207278 (0.07%) helped: 3113 HURT: 4011 helped stats (abs) min: 1.0 max: 50.0 x̄: 3.25 x̃: 2 helped stats (rel) min: 0.13% max: 44.83% x̄: 4.09% x̃: 2.11% HURT stats (abs) min: 1.0 max: 18.0 x̄: 2.73 x̃: 2 HURT stats (rel) min: 0.11% max: 57.14% x̄: 3.86% x̃: 2.07% 95% mean confidence interval for instructions value: 0.02 0.22 95% mean confidence interval for instructions %-change: 0.23% 0.54% Instructions are HURT. 
total tuples in shared programs: 1927077 -> 1946583 (1.01%) tuples in affected programs: 1118627 -> 1138133 (1.74%) helped: 2874 HURT: 6295 helped stats (abs) min: 1.0 max: 82.0 x̄: 3.51 x̃: 2 helped stats (rel) min: 0.17% max: 33.33% x̄: 4.60% x̃: 3.57% HURT stats (abs) min: 1.0 max: 47.0 x̄: 4.70 x̃: 3 HURT stats (rel) min: 0.20% max: 50.00% x̄: 5.16% x̃: 4.32% 95% mean confidence interval for tuples value: 2.00 2.25 95% mean confidence interval for tuples %-change: 1.97% 2.23% Tuples are HURT. total clauses in shared programs: 356053 -> 357793 (0.49%) clauses in affected programs: 151578 -> 153318 (1.15%) helped: 2196 HURT: 3813 helped stats (abs) min: 1.0 max: 49.0 x̄: 2.16 x̃: 1 helped stats (rel) min: 0.18% max: 69.01% x̄: 10.26% x̃: 8.33% HURT stats (abs) min: 1.0 max: 25.0 x̄: 1.70 x̃: 1 HURT stats (rel) min: 0.57% max: 66.67% x̄: 10.64% x̃: 8.33% 95% mean confidence interval for clauses value: 0.22 0.36 95% mean confidence interval for clauses %-change: 2.68% 3.33% Clauses are HURT. total cycles in shared programs: 167761.17 -> 167922.04 (0.10%) cycles in affected programs: 24494.21 -> 24655.08 (0.66%) helped: 862 HURT: 3054 helped stats (abs) min: 0.041665999999999315 max: 53.0 x̄: 0.69 x̃: 0 helped stats (rel) min: 0.28% max: 76.81% x̄: 5.65% x̃: 3.03% HURT stats (abs) min: 0.041665999999999315 max: 2.0416659999999993 x̄: 0.25 x̃: 0 HURT stats (rel) min: 0.26% max: 41.18% x̄: 4.91% x̃: 3.92% 95% mean confidence interval for cycles value: -0.04 0.12 95% mean confidence interval for cycles %-change: 2.36% 2.81% Inconclusive result (value mean confidence interval includes 0). 
total arith in shared programs: 73875.37 -> 74393.17 (0.70%) arith in affected programs: 43142.42 -> 43660.21 (1.20%) helped: 3632 HURT: 5443 helped stats (abs) min: 0.041665999999999315 max: 1.2083360000000027 x̄: 0.15 x̃: 0 helped stats (rel) min: 0.22% max: 100.00% x̄: 6.70% x̃: 4.76% HURT stats (abs) min: 0.041665999999999315 max: 2.0416659999999993 x̄: 0.19 x̃: 0 HURT stats (rel) min: 0.00% max: 166.67% x̄: 5.91% x̃: 4.08% 95% mean confidence interval for arith value: 0.05 0.06 95% mean confidence interval for arith %-change: 0.65% 1.07% Arith are HURT. total texture in shared programs: 11936 -> 11936 (0.00%) texture in affected programs: 0 -> 0 helped: 0 HURT: 0 total vary in shared programs: 4180.88 -> 4180.88 (0.00%) vary in affected programs: 0 -> 0 helped: 0 HURT: 0 total ldst in shared programs: 137551 -> 137028 (-0.38%) ldst in affected programs: 834 -> 311 (-62.71%) helped: 13 HURT: 0 helped stats (abs) min: 15.0 max: 53.0 x̄: 40.23 x̃: 53 helped stats (rel) min: 19.15% max: 100.00% x̄: 68.11% x̃: 76.81% 95% mean confidence interval for ldst value: -50.49 -29.98 95% mean confidence interval for ldst %-change: -84.37% -51.84% Ldst are helped. total quadwords in shared programs: 1684883 -> 1692021 (0.42%) quadwords in affected programs: 949463 -> 956601 (0.75%) helped: 3981 HURT: 5098 helped stats (abs) min: 1.0 max: 86.0 x̄: 3.53 x̃: 3 helped stats (rel) min: 0.18% max: 33.33% x̄: 5.82% x̃: 4.48% HURT stats (abs) min: 1.0 max: 50.0 x̄: 4.15 x̃: 3 HURT stats (rel) min: 0.17% max: 50.00% x̄: 5.11% x̃: 3.85% 95% mean confidence interval for quadwords value: 0.67 0.90 95% mean confidence interval for quadwords %-change: 0.17% 0.47% Quadwords are HURT. 
total threads in shared programs: 53276 -> 53653 (0.71%) threads in affected programs: 581 -> 958 (64.89%) helped: 445 HURT: 68 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: 0.68 0.79 95% mean confidence interval for threads %-change: 75.70% 84.53% Threads are helped. total preloads in shared programs: 116312 -> 116312 (0.00%) preloads in affected programs: 0 -> 0 helped: 0 HURT: 0 total loops in shared programs: 128 -> 128 (0.00%) loops in affected programs: 0 -> 0 helped: 0 HURT: 0 total spills in shared programs: 92 -> 37 (-59.78%) spills in affected programs: 55 -> 0 helped: 13 HURT: 0 total fills in shared programs: 658 -> 190 (-71.12%) fills in affected programs: 468 -> 0 helped: 13 HURT: 0 Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16378>	2022-05-25 14:40:12 +00:00
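The greedy heuristic from the "Schedule for pressure pre-RA" commit can be illustrated on a toy dependency graph. This is a sketch under stated assumptions: each value is written once, only data dependencies exist, and all `toy_*` names are hypothetical. The fallback that discards schedules with a higher maximum live count is not shown; the function only returns that maximum so a caller could compare.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX 16 /* upper bound on instructions and value indices */

struct toy_sched_instr {
    int dest;    /* value written, or -1 */
    int srcs[2]; /* values read, or -1 */
};

/* Change in live-value count if I is placed next while walking bottom-up:
 * its destination stops being live, its not-yet-live sources become live. */
static int toy_liveness_delta(const struct toy_sched_instr *I, const bool *live)
{
    int delta = 0;
    if (I->dest >= 0 && live[I->dest])
        delta--;
    for (int s = 0; s < 2; ++s) {
        int v = I->srcs[s];
        if (v < 0 || live[v])
            continue;
        if (s == 1 && v == I->srcs[0])
            continue; /* same value read twice */
        delta++;
    }
    return delta;
}

/* Bottom-up, an instruction is ready once every reader of its
 * destination has already been scheduled. */
static bool toy_is_ready(const struct toy_sched_instr *prog, int n,
                         const bool *done, int i)
{
    for (int j = 0; j < n; ++j) {
        if (done[j] || j == i || prog[i].dest < 0)
            continue;
        if (prog[j].srcs[0] == prog[i].dest || prog[j].srcs[1] == prog[i].dest)
            return false;
    }
    return true;
}

/* Fills order[] top-to-bottom and returns the maximum live count seen. */
static int toy_schedule(const struct toy_sched_instr *prog, int n, int *order)
{
    assert(n <= TOY_MAX);
    bool done[TOY_MAX] = { false };
    bool live[TOY_MAX] = { false };
    int nlive = 0, max_live = 0;

    for (int slot = n - 1; slot >= 0; --slot) {
        int best = -1, best_delta = 0;
        for (int i = 0; i < n; ++i) {
            if (done[i] || !toy_is_ready(prog, n, done, i))
                continue;
            int d = toy_liveness_delta(&prog[i], live);
            if (best < 0 || d < best_delta) {
                best = i;
                best_delta = d;
            }
        }
        assert(best >= 0); /* holds for an acyclic dependency graph */

        if (prog[best].dest >= 0)
            live[prog[best].dest] = false;
        for (int s = 0; s < 2; ++s)
            if (prog[best].srcs[s] >= 0)
                live[prog[best].srcs[s]] = true;

        done[best] = true;
        order[slot] = best;
        nlive += best_delta;
        if (nlive > max_live)
            max_live = nlive;
    }
    return max_live;
}
```

For a program with two constants, one unary operation on each, and a final binary operation, this toy scheduler places each constant directly before its user and never has more than two values live at once.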
Alyssa Rosenzweig	2fb5ceab7a	pan/bi: Recoalesce tied operands after spilling Otherwise we can fail to allocate tied operands if we spill the tied operand. Seen in shaders/android/com.miHoYo.GenshinImpact/16.shader_test with a particularly bad scheduling causing excessive spilling. No shader-db changes. Fixes: `bc17288697` ("pan/bi: Lower split/collect before RA") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16378>	2022-05-25 14:40:12 +00:00
Icecream95	a4323b0979	panfrost: Only write depth / stencil once if MRT is used We can't assume that RT0 will be written, so this has to be based on whether a combined store has already been emitted, not the location of the store. Emit a non-special combined_store intrinsic that only writes colour for the other RTs, as reordering stores breaks the Midgard compiler. Fixes: `d37e901e35` ("pan/mdg: Add new depth store lowering") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6527 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16685>	2022-05-24 16:13:33 +00:00
Icecream95	0a53ebabcd	pan/mdg: Read base for combined stores Fixes depth/stencil writes with MRT. Fixes: `b3d7272753` ("pan/mdg: Don't read base for combined stores") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16685>	2022-05-24 16:13:33 +00:00
Icecream95	f1a226dd24	pan/bi: Read base for combined stores Fixes depth/stencil writes with MRT. Fixes: `996645e479` ("pan/bi: Don't read base for combined stores") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16685>	2022-05-24 16:13:33 +00:00
Icecream95	9f9ed959bd	nir: Add store_combined_output_pan BASE back It's meaningful for this intrinsic and so does not add noise to the lowering pass. (Although dual-source writes must be to RT 0, depth and stencil writes, which store_combined_output_pan is also used for, can still be done with MRT enabled.) Fixes: `5c168f09eb` ("nir: Eliminate store_combined_output_pan BASE") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16685>	2022-05-24 16:13:33 +00:00
Icecream95	2f2ddfa0ac	panfrost: Move patched_s out of the pan_blitter_views struct The struct is returned from a function, so in debug builds the address may change after returning, and pointers to patched_s will be broken. Pass the pointer to the patched stencil view as a parameter to pan_preload_get_views to avoid this. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16343>	2022-05-20 23:17:07 +00:00
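The fix in the `patched_s` commit follows a general C pattern: a struct returned by value must not carry a pointer into its own storage, because the caller receives a copy at a different address. A minimal sketch of the caller-owned-storage alternative follows; the `toy_*` types, the format constant, and the function signature are hypothetical and do not reproduce `pan_preload_get_views`.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_FORMAT_S8 1 /* hypothetical stencil-only format id */

struct toy_view {
    int format;
};

/* The returned struct holds only pointers; it owns no patched copy.
 * Keeping the patched view inside this struct would leave `stencil`
 * pointing at the callee's local once the struct is copied out. */
struct toy_views {
    const struct toy_view *stencil;
};

/* The caller passes storage for the patched stencil view, so the pointer
 * stored in the result stays valid as long as the caller's storage does. */
static struct toy_views toy_get_views(const struct toy_view *orig_s,
                                      struct toy_view *patched_s)
{
    assert(patched_s != NULL);
    struct toy_views views = { .stencil = orig_s };

    if (orig_s != NULL && orig_s->format != TOY_FORMAT_S8) {
        *patched_s = *orig_s;
        patched_s->format = TOY_FORMAT_S8;
        views.stencil = patched_s;
    }
    return views;
}
```

A caller would declare a `struct toy_view patched;` next to the returned `struct toy_views` and keep both alive for the same duration.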
Icecream95	f1f39fa645	panfrost: Increase the limit for blend shader variants Qt uses blend constants to set text colour, this will allow more colours onscreen before thrashing happens. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16343>	2022-05-20 23:17:07 +00:00
Icecream95	80404c8b64	panfrost: Copy blend constant into variant even when reusing it Otherwise future lookups will match searches for the old constant. Fixes: `bbff09b952` ("panfrost: Move the blend shader cache at the device level") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6355 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16343>	2022-05-20 23:17:07 +00:00
Alyssa Rosenzweig	d6ece34d0c	pan/va: Use ^ instead of ` to indicate last-use This syncs the ISA syntax with other Valhall ISA users. It's also somewhat easier to read. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	9fb8ca1851	pan/va: Remove DISCARD.f32 destination It doesn't actually write anything. This is a pointless divergence from Bifrost. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	444469d64e	pan/va: Handle 2-src blend in lower_split_src Fixes assertion fail in shaders/dolphin/smg.1.shader_test Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	0576cad958	pan/bi: Validate vector widths Now that our IR is much more strongly typed, and RA code quality depends on correct typing, add a validation pass to make sure we didn't screw it up. This pass found a massive number of bugs in early versions of this series. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	4c1bb23a86	pan/bi: Validate preload constraints are satisfied We tightened the rules around preloading substantially and take advantage of the rules in RA. The safe helpers it introduced should ensure the rules are followed, but just in case, add a validation pass to check our work. This pass found (multiple) bugs in early versions of this series. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	3636cddde1	pan/bi: See through splits for var_tex fusion Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	1f25f78a9f	pan/bi: Optimize split of collect Required to get decent codegen from UBO pushing. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	4a8bde2190	pan/bi: Don't propagate discard Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	d81b872465	pan/bi: Remove liveness metadata tracking We don't use it for anything, and with no pass infrastructure it's just an accident waiting to happen. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	3df5446cbd	pan/bi: Simplify register precolouring in the IR In the current IR, any register may be preloaded by reading it anywhere, and any register may be precoloured by writing it anywhere. This is convenient for instruction selection, but requires the register allocator to do considerable gymnastics to ensure it doesn't clobber precoloured registers. It also breaks the purity of our SSA representation, which complicates optimization passes (e.g. copyprop). Let's trade some instruction selection complexity for simplifying register allocation by constraining how register precolouring works. Under the new model: * Registers may only be preloaded at the start of the program. * Precoloured destinations are handled explicitly by RA. Internally, a stronger invariant is placed for preloading: registers may only be preloaded by MOV.i32 instructions at the beginning of the block, and these moves must be unique. These invariants ensure RA can trivially coalesce the moves. A bi_preload helper is added as a safe version of bi_register respecting these invariants, allowing a smooth transition for instruction selection. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	dab5b62ecf	pan/bi: Remove bi_word and bi_word_node They are no longer used, as offsets are no longer used for normal values (only for FAU). Keep it like that. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	f0184cf218	pan/bi: Scalarize copyprop Reduces memory footprint. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	c6349278f9	pan/bi: Scalarize modifier propagation Reduces memory footprint. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	e332e2edc1	pan/bi: Scalarize bi_opt_cse Reduces memory footprint. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	187dd382cb	pan/bi: Scalarize bi_lower_swizzle Reduces memory footprint. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	5b1c642cee	pan/va: Don't use bi_word in FAU unit test It will be removed shortly, as the FAU construction helper should be used instead. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	67569b3c23	pan/va: Use split for 64-bit lowering Written in this way, this pass looks pretty silly... Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	5febeae58e	pan/bi: Emit collect and split ..Rather than using offsets during instruction selection. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	4731e9e55a	pan/bi: Simplify BLEND emit We don't need to collect anything, now that Valhall handles this case correctly. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	7bfaa119f4	pan/bi: Lift split/collect cache from AGX Design based on ACO (and fruitful discussions with Daniel). Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	8fdb01b96f	pan/bi: Create COLLECT during isel This transitions us away from the fake SSA we currently use for vectors. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	5c0977d230	pan/bi: Expand MAX_DESTS to 4 For splits. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	9924e6f291	pan/bi: Fix mov and pack_32_2x16 Move can take in a vector and write a scalar, depending on the swizzle. We need to handle this case. Split out mov and pack_32_2x16 so we can specify correct behaviour for both. Also drop unused 1-bit boolean stuff which obscured the fix. Fixes: `76cea8e27b` ("panfrost: Fix pack_32_2x16 implementation") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	bc17288697	pan/bi: Lower split/collect before RA For transitioning to the new scalarized IR. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	0c7f126277	pan/bi: Add bi_before_block cursor Useful for preloading. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	298d20f805	pan/bi: Add collect and split instructions These move-like instructions will be generated during instruction selection and lowered before/after register allocation. These need special printer support until we get dynamic sources/destinations. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	afd88d1380	pan/bi: Add source/destination counts In preparation for dynamic allocation, as needed for phi nodes and parallel copies. For now, it just serves to simplify the semantics of splits and collects. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	0523b6b89b	pan/bi: Use value-based interference with LCRA "Revisiting Out-of-SSA Translation for Correctness, Code Quality, and Efficiency" discusses "value-based interference": two variables interfere if and only if there exists a point in the program where they are both live with different values. In particular, the source and destination of a move do not interfere a priori, because they have the same value at that point in the program. (If a later instruction overwrites one, the required interference will be added there). We can use this idea to avoid some extra interferences, avoiding a regression in moves from split/collect. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
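The value-based interference rule quoted in the commit above can be sketched as a small change to how interference edges are added at a definition. The dense matrix, the `toy_*` names, and the single-source move are simplifying assumptions of this example, not the LCRA implementation.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NVALS 8 /* number of values in the toy program */

struct toy_ra_instr {
    bool is_move; /* dest is a plain copy of src */
    int dest;
    int src;
};

/* Add interference between I's destination and every value live after I.
 * Value-based rule: a move's source holds the same value as its destination
 * at this point, so that one edge is skipped. If a later instruction
 * overwrites either of them, the edge is added at that definition. */
static void toy_mark_interference(bool m[TOY_NVALS][TOY_NVALS],
                                  const struct toy_ra_instr *I,
                                  const bool *live_after)
{
    assert(I->dest >= 0 && I->dest < TOY_NVALS);

    for (int v = 0; v < TOY_NVALS; ++v) {
        if (!live_after[v] || v == I->dest)
            continue;
        if (I->is_move && v == I->src)
            continue;
        m[I->dest][v] = true;
        m[v][I->dest] = true;
    }
}
```

With this rule the source and destination of a copy remain candidates for the same register, which is what lets the extra moves from split/collect be coalesced away.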
Alyssa Rosenzweig	896dc63623	pan/bi: Lower phis to scalar If we don't lower phis to scalar, when we go out of SSA, we can get vector nir_registers. In particular, we can get code like: r0 = vec2 r0.y, r0.x This code looks like a move, but is in fact a swap. The trivial lowering of vec2 would not work -- the following fails to swap correctly: r0.x = r0.y r0.y = r0.x Currently, we generate temporaries to handle these cases. It's easy to move the complexity to NIR, though, and we'll want to scalarize phis for SSA-based RA anyway. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
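The swap hazard described in the "Lower phis to scalar" commit is easy to reproduce in plain C. The sketch below models `r0 = vec2 r0.y, r0.x` on a hypothetical two-component struct and contrasts the sequential lowering with one that goes through a temporary.

```c
#include <assert.h>

/* Toy two-component register standing in for a vector nir_register. */
struct toy_vec2 {
    int x, y;
};

/* Sequential scalar moves: the first move clobbers x before the second
 * move reads it, so both components end up holding the old y. */
static struct toy_vec2 toy_swizzle_yx_sequential(struct toy_vec2 r0)
{
    r0.x = r0.y;
    r0.y = r0.x; /* reads the already-overwritten x */
    return r0;
}

/* Saving the old x in a temporary gives the intended swap. */
static struct toy_vec2 toy_swizzle_yx_with_temp(struct toy_vec2 r0)
{
    int tmp = r0.x;
    r0.x = r0.y;
    r0.y = tmp;
    return r0;
}

/* Self-check showing the difference on the input (1, 2). */
static void toy_check_swizzle(void)
{
    struct toy_vec2 in = { 1, 2 };

    struct toy_vec2 bad = toy_swizzle_yx_sequential(in);
    assert(bad.x == 2 && bad.y == 2);

    struct toy_vec2 good = toy_swizzle_yx_with_temp(in);
    assert(good.x == 2 && good.y == 1);
}
```

Scalarizing phis before leaving SSA avoids creating such vector self-assignments in the first place, so the backend no longer needs the temporary.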
Alyssa Rosenzweig	c8882ee115	pan/bi: +JUMP can't read same-cycle temp Minor ISA detail missed in the Bifrost scheduler. I hit this in an early version of this series (where a move feeding into a blend shader return was not coalesced). Let's get it fixed in the scheduler. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	c387096eca	pan/va: Use 64-bit lowering for texturing Texture instructions on Valhall take 64-bit sources. Now that we have infrastructure to handle this properly, we don't need to use a non-SSA node to hack around the optimization. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	89a3746bc1	pan/va: Lower split 64-bit sources This ensures Valhall 64-bit constraints are respected in a simple way. It's not the most efficient, though. Optimization is deferred until full Valhall support is upstreamed and the RA is overhauled. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	65839d8c3c	pan/va: Mark more source sizes This source size information will be consumed by the 64-bit lowering pass, so ensure it's accurate. That means marking 32-bit and 64-bit sources explicitly on message passing where it wouldn't match up with the type size suffix of the instruction. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Alyssa Rosenzweig	04a1df8c65	pan/bi: Update bi_count_write_registers for Valhall We add some new instructions on Valhall with special register requirements (texturing, atomics). Handle these appropriately so we can do RA on Valhall. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16585>	2022-05-19 16:08:26 +00:00
Jason Ekstrand	fc8d2543fc	vulkan,v3dv: Add a driver_internal flag to vk_image_view_init/create We already had a little workaround for v3dv where, for some of its meta ops, it had to bind a depth/stencil image as color. Instead of special-casing binding depth/stencil as color, let's flip on the driver_internal flag and get rid of most of the checks in that case. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16376>	2022-05-17 18:14:55 +00:00
Tomeu Vizoso	9e031426be	panvk/ci: Disable CI for a while We have been hitting OOM conditions quite often and this is making it hard to get stuff merged. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16554>	2022-05-17 09:16:21 +00:00
Timothy Arceri	d7a071a28f	gallium/drivers: set force_indirect_unrolling_sampler for all required drivers This is set to true for all drivers that have a GLSL level of support lower than 4.00. This matches the rule for setting the GLSL IR option EmitNoIndirectSampler. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16543>	2022-05-17 02:12:21 +00:00
Jason Ekstrand	9e22e2ac88	panvk: Lower blending after lower_var_copies nir_lower_blend needs store_deref as does io_arrays_to_elements_no_indirects. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16483>	2022-05-16 21:43:47 +00:00
Jason Ekstrand	4050697a8f	panvk: Do more nir_lower_tex before descriptor lowering Some texture lowering generates more txs which means it needs to happen before we lower descriptors because descriptor lowering is where txs is actually handled in panvk. Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16483>	2022-05-16 21:43:47 +00:00
Jason Ekstrand	36bb62139e	bifrost: Run nir_lower_global_vars_to_local before nir_lower_vars_to_scratch Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16483>	2022-05-16 21:43:47 +00:00
Timothy Arceri	7647023f3b	glsl: enable the use of the nir based varying linker Here as well as calling the pass we need to switch the order of some of the information gathering and optimisation calls. We also need to create a custom callback for the dead variables removal pass to clean up dead builtin varying in SSO programs without causing piglit regressions. shader-db results IRIS (BDW): total instructions in shared programs: 17487900 -> 17477072 (-0.06%) instructions in affected programs: 128682 -> 117854 (-8.41%) helped: 587 HURT: 82 helped stats (abs) min: 1 max: 145 x̄: 18.82 x̃: 20 helped stats (rel) min: 0.21% max: 77.78% x̄: 17.41% x̃: 8.85% HURT stats (abs) min: 1 max: 6 x̄: 2.68 x̃: 2 HURT stats (rel) min: 0.25% max: 9.76% x̄: 2.94% x̃: 2.16% 95% mean confidence interval for instructions value: -17.71 -14.66 95% mean confidence interval for instructions %-change: -16.40% -13.42% Instructions are helped. total cycles in shared programs: 857442520 -> 857170199 (-0.03%) cycles in affected programs: 112252720 -> 111980399 (-0.24%) helped: 13733 HURT: 13349 helped stats (abs) min: 1 max: 7293 x̄: 81.44 x̃: 10 helped stats (rel) min: <.01% max: 90.32% x̄: 3.30% x̃: 0.62% HURT stats (abs) min: 1 max: 7424 x̄: 63.38 x̃: 8 HURT stats (rel) min: <.01% max: 192.23% x̄: 3.28% x̃: 0.54% 95% mean confidence interval for cycles value: -14.01 -6.10 95% mean confidence interval for cycles %-change: -0.17% 0.06% Inconclusive result (%-change mean confidence interval includes 0). total sends in shared programs: 971443 -> 970010 (-0.15%) sends in affected programs: 4596 -> 3163 (-31.18%) helped: 446 HURT: 39 helped stats (abs) min: 1 max: 6 x̄: 3.40 x̃: 4 helped stats (rel) min: 3.03% max: 85.71% x̄: 46.48% x̃: 50.00% HURT stats (abs) min: 1 max: 3 x̄: 2.15 x̃: 2 HURT stats (rel) min: 6.67% max: 25.00% x̄: 15.16% x̃: 10.53% 95% mean confidence interval for sends value: -3.13 -2.78 95% mean confidence interval for sends %-change: -44.16% -38.88% Sends are helped. 
LOST: 235 GAINED: 262 Shader-db results radeonsi (RX580): 169505 shaders in 102144 tests Totals: SGPRS: 7698832 -> 7696552 (-0.03 %) VGPRS: 5547296 -> 5545280 (-0.04 %) Spilled SGPRs: 14795 -> 14773 (-0.15 %) Spilled VGPRs: 3782 -> 3782 (0.00 %) Private memory VGPRs: 1152 -> 1152 (0.00 %) Scratch size: 3872 -> 3872 (0.00 %) dwords per thread Code Size: 162946528 -> 162895264 (-0.03 %) bytes Max Waves: 2449334 -> 2449736 (0.02 %) Totals from affected shaders: SGPRS: 215024 -> 212744 (-1.06 %) VGPRS: 151976 -> 149960 (-1.33 %) Spilled SGPRs: 162 -> 140 (-13.58 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 5249916 -> 5198652 (-0.98 %) bytes Max Waves: 54588 -> 54990 (0.74 %) Panfrost trace checksum is updated as per discussion in: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6343 Some virpipe tess shader piglit tests are added as failures to CI these failures are not a regression but an uncovered existing bug exposed due to the linker no longer sorting internally facing shader interfaces in alphabetical order. See details in: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6481 Acked-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15731>	2022-05-16 03:33:18 +00:00
Jason Ekstrand	5ef9bd5ff2	panvk: Round FillBuffer sizes down to a multiple of 4 Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	ad05bc9315	panvk: Drop panvk_descriptor The API-style representation of descriptors is no longer used by anything so let's get rid of it. All we really need is the data in the descriptor set itself. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	d783f8949e	panvk: Implement descriptor copies properly All we were doing was copying panvk_descriptor structs around which don't actually contain data that's used by anything interesting. We need to copy the actual data around. Annoyingly, that means we need a descriptor copy function per descriptor type. Woo! Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	f6268220c2	panvk: Set immutable samplers properly up-front Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	935fd18bc3	panvk: Rewrite the write portion of vkUpdateDescriptorSets The new design is based on the ANV code which I massively cleaned up some time ago. Each descriptor type has a write function and they have consistent prototypes. This makes it all much easier to read and figure out what's going on. It also makes it easier to make changes going forward because you aren't re-plumbing function arguments if you ever change the type of data in any given descriptor type. You just change the write function. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	53f53b577f	panvk: Re-arrange descriptor set functions Put them in the order we call them which is also roughly descriptor type enum order. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	28333e039c	FIXUP: Use 16-bit things for texture sizes Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	38a0742f6a	panvk: Implement texture/image queries Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	714e125ae4	panvk: Pass bind layouts to texture and image descriptor helpers Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	6ed298dce7	panvk: Add an elems field to panvk_buffer_view Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	6621ab8bf9	panvk: Advertise VK_KHR_variable_pointers Now that our SSBO descriptor handling code no longer crawls deref chains back to the variable, we should be handling variable pointers properly. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	d9f9955f9e	panvk: Enable robustBufferAccess It should already work for UBOs. This should do everything we need for SSBOs. Not sure about vertex and index buffers but we can deal with those later. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	a463c58e22	panvk: Put SSBO addresses in the descriptor buffer Instead of storing SSBO pointers in the very limited sysval space, store them in the UBO we've attached to the descriptor set. This gives us a virtually unlimited number of SSBOs. Dynamic SSBOs still live in the sysval space so we can update them as part of vkCmdBindDescriptorSets(). Also, the new code (based on the code in ANV) loads those SSBO addresses in a way that never chases the deref chain back to the variable so we should now be able to handle all of variable pointers. The code as written in this patch is a bit overly generic because it switches on address modes a bit more than panvk needs but we ended up needing all that flexibility in ANV so we may as well leave hooks for it in panvk. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	e265583ee1	panvk: Interleave UBOs with multiple descriptor sets The original intention was to put all the non-dynamic UBOs first followed by all the dynamic ones. However, we got the calculations wrong and, once you went above one descriptor set, things started stomping each other. Also, the whole strategy is a bit busted. Vulkan pipeline layout compatibility rules say that it's ok to create a pipeline with one layout and then bind with another so long as the bottom N descriptor set layouts match and the pipeline uses at most N descriptors. This means that, while it's safe to have each subsequent set add onto a given pool of descriptors, if you're going to combine two of those pools, you need to be careful that the position of descriptors in set N only depends on the layouts of sets M <= N. The easy way to do this is to interleave where we do the UBOs for set 0 then dynamic for set 0 then UBOs for set 1 then dynamic for set 1, etc. Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	6d15d65e19	panvk: Put the sysval and push const UBOs at fixed indices In theory, this may cost us a tiny bit of descriptor space but in practice, given that the viewport transform is a sysval, we'll always need it for 3D and given that SSBO pointers live there, we'll basically always need it for compute. It also makes a lot of things simpler. We're about to start using the sysval UBO directly in our descriptor set code and knowing the index up-front is really nice. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	744b977963	panvk: Stop calling lower_uniforms_to_ubo We don't need it because Vulkan doesn't have GL-style uniforms. It shouldn't be doing anything but sometimes it inserts an extra UBO binding and adds 1 to all our UBO indices for no good reason. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	c32ddb5e77	panvk: Use a flat sysvals struct PanVK uses fewer sysvals than the GLES driver, as some data that would be a sysval in GLES is instead part of the descriptor set or the pipeline state in Vulkan. Therefore, it is simpler and more efficient to use a flat, fixed layout provided by the driver for our sysvals, rather than the compiler choosing a layout. This commit switches to a flat sysval layout. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	e6091cc578	panvk: Get rid of the per-pipeline sysvals BO This is a micro-optimization and probably not a correct one at that. The cost involved in re-uploading the viewport is tiny compared to the mental overhead from trying to do this juggle. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	f0a47d8602	bifrost,midgard: Allow providing a fixed sysval layout Vulkan doesn't need nearly as many system values and would like to bake its layout up-front instead of having it provided by the back-end compiler. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:16 +00:00
Jason Ekstrand	e07a296398	panfrost: Add some sanity checking for sysvals Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:15 +00:00
Jason Ekstrand	4e60f0655a	panfrost,panvk: Make fixed_sysval_ubo < 0 mean compiler-assigned In `3559efb9bf` ("panfrost: Allow passing an explicit UBO index for the sysval UBO"), an explicit UBO index was added and it was implicitly assumed that it would be > num_ubos. This was convenient because it meant 0, the default for designated initializers, implicitly meant compiler-assigned. However, we're about to move the sysval UBO to 0 which breaks this assumption. Also, we don't want the back-end compiler to even look at num_ubos since it's meaningless in Vulkan. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:15 +00:00
Jason Ekstrand	42aca84704	panvk: Add a buffer to each descriptor set Later in the series, we will map descriptor sets to driver-internal buffers bound as UBOs. These buffers will contain various internal data, like buffer and texture sizes. Resource access will be lowered to pull from this UBO in the shader. To prepare, create a backing buffer when creating descriptor set and emit a UBO record so we can bind it. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:15 +00:00
Jason Ekstrand	bcea5ed2b6	panvk: Break descriptor lowering into its own file It's about to get a lot more complicated so let's split it out. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:15 +00:00
Jason Ekstrand	8af805a475	panvk: Move CreateDescriptorSetLayout to per-arch Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16276>	2022-05-12 10:53:15 +00:00

4240 Commits