KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Rhys Perry	7c63ec70ef	nir: document that ACCESS_RESTRICT is not set at intrinsics Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7295>	2021-06-10 13:17:22 +00:00
Rhys Perry	938098c98d	nir/opt_load_store_vectorize: only require one variable to be restrict No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7295>	2021-06-10 13:17:22 +00:00
Rhys Perry	865ca3af2b	nir/opt_load_store_vectorize: check for restrict at the variable SPIR-V -> NIR doesn't set ACCESS_RESTRICT at the intrinsic. fossil-db (GFX10.3): Totals from 3 (0.00% of 139391) affected shaders: CodeSize: 12364 -> 12356 (-0.06%) Instrs: 2493 -> 2494 (+0.04%); split: -0.04%, +0.08% Cycles: 15279372 -> 15295756 (+0.11%); split: -0.11%, +0.21% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7295>	2021-06-10 13:17:22 +00:00
Rhys Perry	2e7bceb220	nir/load_store_vectorizer: fix check_for_robustness() with indirect loads fossil-db (GFX10.3, robustness2 enabled): Totals from 13958 (9.54% of 146267) affected shaders: VGPRs: 609168 -> 624304 (+2.48%); split: -0.05%, +2.53% CodeSize: 48229504 -> 48488392 (+0.54%); split: -0.02%, +0.56% MaxWaves: 354426 -> 349448 (-1.40%); split: +0.00%, -1.41% Instrs: 9332093 -> 9375053 (+0.46%); split: -0.03%, +0.49% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7295>	2021-06-10 13:17:22 +00:00
Timur Kristóf	1e49018ced	amd: Add extra source to the mbcnt_amd NIR intrinsic. The v_mbcnt instructions can take an extra source that they add to the result. This is not exposed in SPIR-V but we now expose it in NIR. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Timur Kristóf	43ce80a58f	nir: Add AMD-specific byte and lane permute intrinsics. These map directly to v_perm_b32 and v_permlane_b32. Unfortunately there is no corresponding NIR opcode or intrinsic, and it's too tedious to puzzle these things together from the existing NIR instructions. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Timur Kristóf	c92dab8e2b	nir: Add nir_op_sad_u8x4 which corresponds to AMD's v_sad_u8. NIR currently doesn't have any intrinsics for a horizontal packed add, so this one is modeled after AMD's v_sad_u8. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Tony Wasserka <tony.wasserka@gmx.de> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11072>	2021-06-09 16:48:51 +00:00
Caio Marcelo de Oliveira Filho	e94c99513a	nir/gather_info: Rename per_vertex to is_arrayed Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11252>	2021-06-09 07:35:57 +00:00
Caio Marcelo de Oliveira Filho	a59f1d628a	nir/lower_io: Rename vertex_index to array_index in helpers The helpers will be reused for per-primitive variables that are also arrayed, so use a more general name. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11252>	2021-06-09 07:35:57 +00:00
Alyssa Rosenzweig	95bd6e915f	nir/lower_fragcolor: Avoid redundant load_output At best, this is an extra instruction for NIR to optimize out. At worst, depending on pass ordering nir_load_output could sneak into the final NIR, even on drivers that don't support fbfetch. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11255>	2021-06-09 02:58:08 +00:00
Caio Marcelo de Oliveira Filho	8af6766062	nir: Move workgroup_size and workgroup_variable_size into common shader_info Move them out of the "cs" sub-struct, since these will be used for other shader stages in the future. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>	2021-06-08 09:23:55 -07:00
Caio Marcelo de Oliveira Filho	b5f6fc442c	nir: Move zero_initialize_shared_memory into common shader_info Move it out of the "cs" sub-struct, since the bit will be used for other shader stages in the future. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>	2021-06-08 09:23:55 -07:00
Bas Nieuwenhuizen	6b7ff241f4	nir/lower_returns: Deal with single-arg phis after if. If we have `if ... { return; } else { // block X } // block Y phi(X: ...)`, then nir_lower_returns tries to move block Y into the else body, but nir_cf_extract doesn't move the phi. Because the return is removed from the then-body, the phi suddenly has the wrong number of arguments (and no longer dominates its uses). In this case we know that the phi has to be single-arg, so we can just rewrite the users of the phis and drop them. Hit this in my RT adventures; not sure if this is actually reachable right now, as single-arg phis tend to be exceptional outside of CSSA and we typically call nir_lower_returns pretty early. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11207>	2021-06-08 11:29:53 +00:00
Rhys Perry	1cbcfb8b38	nir, nir/algebraic: add byte/word insertion instructions Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Rhys Perry	edae3e5623	nir/algebraic: optimize extract of extract Found in some sottr shaders (originally iand(ishr(a, 16), 0xffff)) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3151>	2021-06-08 08:57:42 +00:00
Caio Marcelo de Oliveira Filho	c8a7bd0dc8	nir: Rename WORK_GROUP (and similar) to WORKGROUP Be consistent with other usages in Vulkan and SPIR-V, and the recently added workgroup_size field. Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho	a71a780598	nir: Rename nir_intrinsic_load_local_group_size to nir_intrinsic_load_workgroup_size Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho	43a6a2151b	compiler: Rename SYSTEM_VALUE_LOCAL_GROUP_SIZE to SYSTEM_VALUE_WORKGROUP_SIZE Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Caio Marcelo de Oliveira Filho	430d2206da	compiler: Rename local_size to workgroup_size Acked-by: Emma Anholt <emma@anholt.net> Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>	2021-06-07 22:34:42 +00:00
Alyssa Rosenzweig	c509878971	nir: Add nir_intrinsic_load_back_face_agx On AGX, the special register for front facing is inverted from its meaning in APIs. We need to lower load_front_face to inot(load_back_face). Doing this in the backend is trivial, but then we would miss out on algebraic optimizations for the inot. Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Acked-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11199>	2021-06-05 20:38:22 +00:00
Hoe Hao Cheng	90a5fef85c	nir: define NIR_ALU_MAX_INPUTS Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11172>	2021-06-04 19:33:13 +00:00
Rhys Perry	49add985ff	nir/unsigned_upper_bound: don't require dominance metadata Instead, determine if it's a merge or loop exit phi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9808>	2021-06-04 14:14:00 +00:00
Mike Blumenkrantz	1199d86b2c	compiler/spirv: expand_to_vec4 -> nir_pad_vec4 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10821>	2021-05-31 18:45:24 +00:00
Mike Blumenkrantz	f9ecbb1e1d	nir/builder: add nir_mask it's handy to have functions for generating masks Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10620>	2021-05-26 04:06:27 +00:00
Timothy Arceri	8b180ab98b	nir/lower_io_to_vector: fix per vertex io handling for arrays The pass was processing the per vertex index from the wrong end of the array deref chain. Fixes: `bcd14756ee` ("nir/lower_io_to_vector: add flat mode") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10798>	2021-05-21 02:43:30 +00:00
Ian Romanick	880b00dc59	nir/lower_tex: Add support for lowering YUYV formats v2: Rebase on `bc438c91d9` ("nir/lower_tex: ignore texture_index if tex_instr has deref src") Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9610>	2021-05-21 01:40:22 +00:00
Ian Romanick	1358d93650	nir/lower_tex: Add support for lowering Y41x formats These are similar to AYUV, but the channel ordering is different... in such a way that there's no RGBA format that will make the channels line up right. v2: Rebase on `bc438c91d9` ("nir/lower_tex: ignore texture_index if tex_instr has deref src") Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9610>	2021-05-21 01:40:22 +00:00
Alyssa Rosenzweig	4d44d4179e	glsl: Fix packing of matrices for XFB The CAP for packed transform feedback concerns packing of unrelated variables into the same varying slot. (On Mali, transform feedback is implemented on a per-slot basis, so different variables need different slots to be written to different buffers.) However, this requirement is tangential to the packing of arrays, matrices, and structures inherent to GLSL. These array-like values need to be packed /within/ their slot, even though drivers using the CAP (just Panfrost) cannot pack independent values in the slot. Transform feedback of individual elements is not independent, after all. Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10778>	2021-05-20 10:05:39 +00:00
Alyssa Rosenzweig	538ab8c571	glsl: Fix subscripted arrays with no XFB packing We need to duplicate the subscripted members even if they happen to be aligned, since the other elements may be passed into the consumer shader. Fixes on Panfrost: dEQP-GLES3.functional.transform_feedback.array_element.interleaved.lines.highp_float Note: the test did pass on main previously due to an elaborate set of driver hacks. I don't believe the old behaviour was correct regardless. Only Panfrost is affected by this change and the next, as every other driver sets PIPE_CAP_PACKED_STREAM_OUTPUT. Acked-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10778>	2021-05-20 10:05:39 +00:00
Ian Romanick	d246c31ec1	nir/algebraic: Add algebraic opt for float comparisons with identical operands. The flt version could have been added in `56e21647e2`, but our collective understanding of NaN and comparisons was poor in 2015. The new "is_a_number" predicate makes the others possible. All of the helped shaders in shader-db are either from Mad Max or Skia. Some of the Skia shaders just get decimated by this change: instructions helped: shaders/skia/580-4.shader_test FS SIMD8: 81 -> 29 (-64.20%) (scheduled: top-down) I looked at a couple of those shaders, and they had sequences like: vec1 32 ssa_44 = flt32 ssa_32, ssa_32 vec1 32 ssa_45 = b32csel ssa_44, ssa_43, ssa_0 vec1 32 ssa_46 = fge32 ssa_32, ssa_32 vec1 32 ssa_47 = b32csel ssa_46, ssa_0, ssa_45 vec1 32 ssa_48 = iand ssa_46, ssa_44 vec1 32 ssa_49 = b32csel ssa_48, ssa_43, ssa_0 ssa_44 is replaced with False. Then ssa_47 selects between ssa_0 and ssa_0, so ssa_47 and ssa_46 are eliminated. ssa_48 is (False && don't care), so ssa_48 and ssa_49 are eliminated. After that, many calculations now involve constants of zero, so they are optimized down too. So it continues until there's not much left! Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21072238 -> 21071386 (<.01%) instructions in affected programs: 33722 -> 32870 (-2.53%) helped: 146 HURT: 1 helped stats (abs) min: 1 max: 62 x̄: 5.84 x̃: 2 helped stats (rel) min: 0.19% max: 62.35% x̄: 4.09% x̃: 1.07% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.20% max: 0.20% x̄: 0.20% x̃: 0.20% 95% mean confidence interval for instructions value: -7.94 -3.65 95% mean confidence interval for instructions %-change: -5.87% -2.25% Instructions are helped. 
total cycles in shared programs: 856203326 -> 856192238 (<.01%) cycles in affected programs: 749966 -> 738878 (-1.48%) helped: 148 HURT: 0 helped stats (abs) min: 1 max: 1226 x̄: 74.92 x̃: 18 helped stats (rel) min: 0.07% max: 49.70% x̄: 2.69% x̃: 0.46% 95% mean confidence interval for cycles value: -104.82 -45.02 95% mean confidence interval for cycles %-change: -4.01% -1.37% Cycles are helped. LOST: 4 GAINED: 0 Fossil-db results: Tiger Lake Instructions in all programs: 160915223 -> 160898354 (-0.0%) SENDs in all programs: 6812780 -> 6812780 (+0.0%) Loops in all programs: 38340 -> 38340 (+0.0%) Cycles in all programs: 7434144207 -> 7433978462 (-0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304537 -> 304537 (+0.0%) Ice Lake Instructions in all programs: 145296298 -> 145279531 (-0.0%) SENDs in all programs: 6863692 -> 6863692 (+0.0%) Loops in all programs: 38334 -> 38334 (+0.0%) Cycles in all programs: 8800257014 -> 8800088384 (-0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334248 -> 334248 (+0.0%) Skylake Instructions in all programs: 135891664 -> 135874910 (-0.0%) SENDs in all programs: 6802946 -> 6802946 (+0.0%) Loops in all programs: 38331 -> 38331 (+0.0%) Cycles in all programs: 8444273433 -> 8444130932 (-0.0%) Spills in all programs: 194839 -> 194839 (+0.0%) Fills in all programs: 301114 -> 301114 (+0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	64bcfc3a17	nir/algebraic: Rearrange some logic-joined comparisons and reduce On Skylake and Broadwell, a single big compute shader in Dirt Rally has spills and fills REALLY helped. That same shader is hurt very slightly for spills and fills on Ice Lake. v2: Move the patterns earlier to be nearer other patterns that are similar. Mark the replacement fmin and fmax exact. Both suggested by Rhys. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21073812 -> 21073041 (<.01%) instructions in affected programs: 77608 -> 76837 (-0.99%) helped: 522 HURT: 33 helped stats (abs) min: 1 max: 26 x̄: 1.58 x̃: 1 helped stats (rel) min: 0.22% max: 14.29% x̄: 1.29% x̃: 1.02% HURT stats (abs) min: 1 max: 8 x̄: 1.67 x̃: 1 HURT stats (rel) min: 0.25% max: 3.42% x̄: 1.06% x̃: 0.86% 95% mean confidence interval for instructions value: -1.57 -1.20 95% mean confidence interval for instructions %-change: -1.25% -1.05% Instructions are helped. total cycles in shared programs: 856224346 -> 856211096 (<.01%) cycles in affected programs: 2394231 -> 2380981 (-0.55%) helped: 603 HURT: 25 helped stats (abs) min: 1 max: 5218 x̄: 59.37 x̃: 28 helped stats (rel) min: 0.06% max: 5.61% x̄: 1.52% x̃: 1.37% HURT stats (abs) min: 2 max: 21394 x̄: 901.92 x̃: 10 HURT stats (rel) min: 0.02% max: 5.90% x̄: 0.95% x̃: 0.59% 95% mean confidence interval for cycles value: -93.61 51.41 95% mean confidence interval for cycles %-change: -1.50% -1.34% Inconclusive result (value mean confidence interval includes 0). LOST: 1 GAINED: 1 Ice Lake total instructions in shared programs: 20025692 -> 20024554 (<.01%) instructions in affected programs: 104981 -> 103843 (-1.08%) helped: 738 HURT: 0 helped stats (abs) min: 1 max: 30 x̄: 1.54 x̃: 1 helped stats (rel) min: 0.31% max: 10.53% x̄: 1.20% x̃: 1.06% 95% mean confidence interval for instructions value: -1.66 -1.43 95% mean confidence interval for instructions %-change: -1.26% -1.14% Instructions are helped. 
total cycles in shared programs: 979474407 -> 979422333 (<.01%) cycles in affected programs: 4136364 -> 4084290 (-1.26%) helped: 759 HURT: 59 helped stats (abs) min: 2 max: 11010 x̄: 72.78 x̃: 28 helped stats (rel) min: 0.03% max: 6.43% x̄: 1.23% x̃: 1.02% HURT stats (abs) min: 1 max: 698 x̄: 53.66 x̃: 8 HURT stats (rel) min: 0.02% max: 24.05% x̄: 1.64% x̃: 0.33% 95% mean confidence interval for cycles value: -97.08 -30.24 95% mean confidence interval for cycles %-change: -1.14% -0.91% Cycles are helped. total spills in shared programs: 10568 -> 10569 (<.01%) spills in affected programs: 102 -> 103 (0.98%) helped: 0 HURT: 1 total fills in shared programs: 11347 -> 11349 (0.02%) fills in affected programs: 277 -> 279 (0.72%) helped: 0 HURT: 1 LOST: 2 GAINED: 2 Skylake total instructions in shared programs: 18190419 -> 18188523 (-0.01%) instructions in affected programs: 102502 -> 100606 (-1.85%) helped: 791 HURT: 0 helped stats (abs) min: 1 max: 676 x̄: 2.40 x̃: 1 helped stats (rel) min: 0.34% max: 20.23% x̄: 1.41% x̃: 1.23% 95% mean confidence interval for instructions value: -4.07 -0.72 95% mean confidence interval for instructions %-change: -1.47% -1.34% Instructions are helped. total cycles in shared programs: 960737969 -> 960498951 (-0.02%) cycles in affected programs: 4435351 -> 4196333 (-5.39%) helped: 804 HURT: 67 helped stats (abs) min: 1 max: 198540 x̄: 300.54 x̃: 24 helped stats (rel) min: 0.03% max: 25.41% x̄: 1.21% x̃: 0.92% HURT stats (abs) min: 2 max: 680 x̄: 39.06 x̃: 6 HURT stats (rel) min: 0.05% max: 23.98% x̄: 1.12% x̃: 0.19% 95% mean confidence interval for cycles value: -722.03 173.20 95% mean confidence interval for cycles %-change: -1.15% -0.91% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 9757 -> 9722 (-0.36%) spills in affected programs: 138 -> 103 (-25.36%) helped: 1 HURT: 0 total fills in shared programs: 9861 -> 9576 (-2.89%) fills in affected programs: 564 -> 279 (-50.53%) helped: 1 HURT: 0 LOST: 5 GAINED: 2 Broadwell total instructions in shared programs: 17853870 -> 17852414 (<.01%) instructions in affected programs: 101276 -> 99820 (-1.44%) helped: 777 HURT: 0 helped stats (abs) min: 1 max: 264 x̄: 1.87 x̃: 1 helped stats (rel) min: 0.34% max: 8.44% x̄: 1.37% x̃: 1.23% 95% mean confidence interval for instructions value: -2.54 -1.21 95% mean confidence interval for instructions %-change: -1.42% -1.32% Instructions are helped. total cycles in shared programs: 1029846029 -> 1029725458 (-0.01%) cycles in affected programs: 4435791 -> 4315220 (-2.72%) helped: 813 HURT: 43 helped stats (abs) min: 2 max: 68560 x̄: 149.95 x̃: 24 helped stats (rel) min: 0.02% max: 73.73% x̄: 1.43% x̃: 0.92% HURT stats (abs) min: 2 max: 726 x̄: 31.12 x̃: 13 HURT stats (rel) min: 0.01% max: 8.43% x̄: 0.62% x̃: 0.31% 95% mean confidence interval for cycles value: -299.58 17.87 95% mean confidence interval for cycles %-change: -1.63% -1.02% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 20333 -> 20307 (-0.13%) spills in affected programs: 151 -> 125 (-17.22%) helped: 1 HURT: 0 total fills in shared programs: 25899 -> 25775 (-0.48%) fills in affected programs: 573 -> 449 (-21.64%) helped: 1 HURT: 0 LOST: 5 GAINED: 0 Sandy Bridge, Ivy Bridge, and Haswell had similar results. 
(Haswell shown) total instructions in shared programs: 16417658 -> 16416320 (<.01%) instructions in affected programs: 96495 -> 95157 (-1.39%) helped: 774 HURT: 0 helped stats (abs) min: 1 max: 18 x̄: 1.73 x̃: 1 helped stats (rel) min: 0.33% max: 9.80% x̄: 1.52% x̃: 1.20% 95% mean confidence interval for instructions value: -1.83 -1.63 95% mean confidence interval for instructions %-change: -1.59% -1.46% Instructions are helped. total cycles in shared programs: 1037104346 -> 1037080579 (<.01%) cycles in affected programs: 3787747 -> 3763980 (-0.63%) helped: 791 HURT: 53 helped stats (abs) min: 1 max: 5411 x̄: 65.87 x̃: 32 helped stats (rel) min: 0.02% max: 21.17% x̄: 1.44% x̃: 1.18% HURT stats (abs) min: 2 max: 14160 x̄: 534.72 x̃: 18 HURT stats (rel) min: 0.02% max: 15.37% x̄: 5.70% x̃: 0.54% 95% mean confidence interval for cycles value: -69.39 13.07 95% mean confidence interval for cycles %-change: -1.19% -0.80% Inconclusive result (value mean confidence interval includes 0). LOST: 12 GAINED: 2 GM45 and Iron Lake had similar results. (Iron Lake shown) total instructions in shared programs: 8132855 -> 8132703 (<.01%) instructions in affected programs: 8782 -> 8630 (-1.73%) helped: 38 HURT: 0 helped stats (abs) min: 4 max: 4 x̄: 4.00 x̃: 4 helped stats (rel) min: 1.66% max: 3.23% x̄: 1.77% x̃: 1.72% 95% mean confidence interval for instructions value: -4.00 -4.00 95% mean confidence interval for instructions %-change: -1.88% -1.65% Instructions are helped. total cycles in shared programs: 238300850 -> 238298568 (<.01%) cycles in affected programs: 257202 -> 254920 (-0.89%) helped: 62 HURT: 2 helped stats (abs) min: 4 max: 58 x̄: 36.90 x̃: 50 helped stats (rel) min: 0.15% max: 1.55% x̄: 0.87% x̃: 1.12% HURT stats (abs) min: 2 max: 4 x̄: 3.00 x̃: 3 HURT stats (rel) min: 0.12% max: 0.22% x̄: 0.17% x̃: 0.17% 95% mean confidence interval for cycles value: -41.34 -29.98 95% mean confidence interval for cycles %-change: -0.95% -0.73% Cycles are helped. 
Fossil-db results: All Intel platforms had similar results. (Ice Lake shown) Instructions in all programs: 145296888 -> 145296346 (-0.0%) SENDs in all programs: 6863696 -> 6863696 (+0.0%) Loops in all programs: 38334 -> 38334 (+0.0%) Cycles in all programs: 8800262303 -> 8800258950 (-0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334248 -> 334248 (+0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	adc2835646	nir/algebraic: Mark some more logic-joined comparison reductions as exact If the values are known to be numbers, then the replacements are exact. This is only applied to the patterns with constants. Constants should always be numbers, and shaders with NaN constants should be handled in a different way. No shader-db or fossil-db changes on any Intel platform. The intention is to make these patterns more future proof. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	23bbf3932b	nir/algebraic: Mark some more comparison reductions exact Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Haswell and later Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21049056 -> 21048939 (<.01%) instructions in affected programs: 4716 -> 4599 (-2.48%) helped: 39 HURT: 0 helped stats (abs) min: 1 max: 6 x̄: 3.00 x̃: 3 helped stats (rel) min: 0.99% max: 5.43% x̄: 2.80% x̃: 2.51% 95% mean confidence interval for instructions value: -3.46 -2.54 95% mean confidence interval for instructions %-change: -3.22% -2.38% Instructions are helped. total cycles in shared programs: 855141411 -> 855141159 (<.01%) cycles in affected programs: 54491 -> 54239 (-0.46%) helped: 28 HURT: 5 helped stats (abs) min: 2 max: 34 x̄: 12.82 x̃: 12 helped stats (rel) min: 0.06% max: 2.73% x̄: 0.94% x̃: 0.75% HURT stats (abs) min: 2 max: 52 x̄: 21.40 x̃: 6 HURT stats (rel) min: 0.11% max: 2.46% x̄: 0.90% x̃: 0.56% 95% mean confidence interval for cycles value: -13.72 -1.55 95% mean confidence interval for cycles %-change: -1.01% -0.31% Cycles are helped. Tiger Lake Instructions in all programs: 160902191 -> 160899554 (-0.0%) SENDs in all programs: 6812435 -> 6812435 (+0.0%) Loops in all programs: 38225 -> 38225 (+0.0%) Cycles in all programs: 7428581420 -> 7428555881 (-0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304539 -> 304539 (+0.0%) A lot of fragment shaders in Shadow of the Tomb Raider were helped, and a bunch of vertex shaders in Octopath Traveler were hurt. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	7d85dc4f35	nir/algebraic: Equality comparison inversions require sources be numbers v2: Update A630 expected image checksum for minetest.trace. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21036690 -> 21049485 (0.06%) instructions in affected programs: 852085 -> 864880 (1.50%) helped: 240 HURT: 2514 helped stats (abs) min: 1 max: 46 x̄: 2.45 x̃: 2 helped stats (rel) min: 0.15% max: 4.30% x̄: 0.79% x̃: 0.55% HURT stats (abs) min: 1 max: 198 x̄: 5.32 x̃: 2 HURT stats (rel) min: 0.06% max: 10.71% x̄: 1.48% x̃: 1.04% 95% mean confidence interval for instructions value: 4.14 5.15 95% mean confidence interval for instructions %-change: 1.23% 1.34% Instructions are HURT. total cycles in shared programs: 856045255 -> 855816220 (-0.03%) cycles in affected programs: 16743786 -> 16514751 (-1.37%) helped: 790 HURT: 1973 helped stats (abs) min: 1 max: 10766 x̄: 627.97 x̃: 18 helped stats (rel) min: <.01% max: 32.59% x̄: 3.01% x̃: 0.64% HURT stats (abs) min: 1 max: 4078 x̄: 135.36 x̃: 18 HURT stats (rel) min: <.01% max: 54.56% x̄: 2.80% x̃: 0.82% 95% mean confidence interval for cycles value: -131.36 -34.42 95% mean confidence interval for cycles %-change: 0.88% 1.40% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total spills in shared programs: 9771 -> 9766 (-0.05%) spills in affected programs: 47 -> 42 (-10.64%) helped: 1 HURT: 0 total fills in shared programs: 9451 -> 9430 (-0.22%) fills in affected programs: 91 -> 70 (-23.08%) helped: 1 HURT: 0 LOST: 16 GAINED: 51 All Intel GPUs from Sandybridge through Ice Lake had similar results. 
(Ice Lake shown) total instructions in shared programs: 20024781 -> 20025568 (<.01%) instructions in affected programs: 103309 -> 104096 (0.76%) helped: 12 HURT: 389 helped stats (abs) min: 1 max: 2 x̄: 1.17 x̃: 1 helped stats (rel) min: 0.20% max: 2.70% x̄: 1.36% x̃: 1.37% HURT stats (abs) min: 1 max: 8 x̄: 2.06 x̃: 1 HURT stats (rel) min: 0.05% max: 7.14% x̄: 1.25% x̃: 0.95% 95% mean confidence interval for instructions value: 1.78 2.15 95% mean confidence interval for instructions %-change: 1.06% 1.28% Instructions are HURT. total cycles in shared programs: 979419070 -> 979439180 (<.01%) cycles in affected programs: 4968711 -> 4988821 (0.40%) helped: 60 HURT: 381 helped stats (abs) min: 1 max: 1296 x̄: 96.92 x̃: 26 helped stats (rel) min: <.01% max: 27.10% x̄: 1.64% x̃: 0.65% HURT stats (abs) min: 1 max: 7320 x̄: 68.04 x̃: 30 HURT stats (rel) min: <.01% max: 19.77% x̄: 1.32% x̃: 0.87% 95% mean confidence interval for cycles value: 10.25 80.95 95% mean confidence interval for cycles %-change: 0.69% 1.15% Cycles are HURT. LOST: 1 GAINED: 2 GM45 and Iron Lake had similar results. (Iron Lake shown) total instructions in shared programs: 8128474 -> 8132527 (0.05%) instructions in affected programs: 642323 -> 646376 (0.63%) helped: 12 HURT: 1972 helped stats (abs) min: 1 max: 4 x̄: 3.00 x̃: 4 helped stats (rel) min: 0.72% max: 1.72% x̄: 1.09% x̃: 0.83% HURT stats (abs) min: 1 max: 16 x̄: 2.07 x̃: 3 HURT stats (rel) min: 0.12% max: 7.14% x̄: 0.77% x̃: 0.70% 95% mean confidence interval for instructions value: 1.99 2.10 95% mean confidence interval for instructions %-change: 0.74% 0.79% Instructions are HURT. 
total cycles in shared programs: 238280994 -> 238294376 (<.01%) cycles in affected programs: 8841250 -> 8854632 (0.15%) helped: 84 HURT: 1192 helped stats (abs) min: 4 max: 64 x̄: 12.50 x̃: 8 helped stats (rel) min: 0.02% max: 1.61% x̄: 0.28% x̃: 0.17% HURT stats (abs) min: 2 max: 198 x̄: 12.11 x̃: 12 HURT stats (rel) min: 0.02% max: 8.03% x̄: 0.28% x̃: 0.14% 95% mean confidence interval for cycles value: 9.65 11.32 95% mean confidence interval for cycles %-change: 0.22% 0.27% Cycles are HURT. No fossil-db changes on any Intel platform. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	4246c2869c	nir/algebraic: Invert comparisons less often This fixes the piglit test range_analysis_fsat_of_nan.shader_test. That test contains some code like o = saturate(X) > 0 ? vec4(1.0, 0.0, 0.0, 1.0) : vec4(0.0, 1.0, 0.0, 1.0); A clever optimizer will convert this to o = vec4(float(saturate(X) > 0), float(!(saturate(X) > 0)), 0, 1); Due to the ordering of optimizations in the compiler, the `saturate` operations are removed. This is safe even in the presence of NaN. o = vec4(float(X > 0), float(!(X > 0)), 0, 1); Since the calculations are not marked precise, an overzealous optimizer may reduce this to o = vec4(float(X > 0), float(X <= 0), 0, 1); This will result in black being output. The GLSL spec gives quite a bit of leeway with respect to NaN, but that seems too far. The shader author asked for a result of red or green. A result of black is still "undefined behavior," but it's also a little mean. This also enables CSE to do its job better. v2: Update A530 expected image checksum for minetest.trace. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4531 Fixes: `0dbda153aa` ("nir/algebraic: Flag inexact optimizations") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake total instructions in shared programs: 21041563 -> 21041789 (<.01%) instructions in affected programs: 992066 -> 992292 (0.02%) helped: 526 HURT: 548 helped stats (abs) min: 1 max: 16 x̄: 2.48 x̃: 2 helped stats (rel) min: 0.04% max: 5.56% x̄: 0.74% x̃: 0.49% HURT stats (abs) min: 1 max: 27 x̄: 2.80 x̃: 2 HURT stats (rel) min: 0.04% max: 4.55% x̄: 0.59% x̃: 0.38% 95% mean confidence interval for instructions value: -0.00 0.42 95% mean confidence interval for instructions %-change: -0.12% <.01% Inconclusive result (value mean confidence interval includes 0). 
total cycles in shared programs: 855885569 -> 856118189 (0.03%) cycles in affected programs: 343637248 -> 343869868 (0.07%) helped: 907 HURT: 541 helped stats (abs) min: 1 max: 7724 x̄: 206.45 x̃: 36 helped stats (rel) min: <.01% max: 29.97% x̄: 1.01% x̃: 0.37% HURT stats (abs) min: 1 max: 14177 x̄: 776.09 x̃: 31 HURT stats (rel) min: <.01% max: 29.94% x̄: 1.24% x̃: 0.35% 95% mean confidence interval for cycles value: 84.30 237.00 95% mean confidence interval for cycles %-change: -0.32% -0.01% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). LOST: 3 GAINED: 5 Ice Lake total instructions in shared programs: 20027107 -> 20025352 (<.01%) instructions in affected programs: 1068856 -> 1067101 (-0.16%) helped: 1153 HURT: 273 helped stats (abs) min: 1 max: 14 x̄: 1.83 x̃: 1 helped stats (rel) min: 0.03% max: 5.66% x̄: 0.61% x̃: 0.35% HURT stats (abs) min: 1 max: 15 x̄: 1.29 x̃: 1 HURT stats (rel) min: 0.16% max: 1.30% x̄: 0.58% x̃: 0.60% 95% mean confidence interval for instructions value: -1.33 -1.13 95% mean confidence interval for instructions %-change: -0.43% -0.34% Instructions are helped. total cycles in shared programs: 979499227 -> 979448725 (<.01%) cycles in affected programs: 344261539 -> 344211037 (-0.01%) helped: 1079 HURT: 441 helped stats (abs) min: 1 max: 9384 x̄: 147.78 x̃: 48 helped stats (rel) min: <.01% max: 31.83% x̄: 0.90% x̃: 0.33% HURT stats (abs) min: 1 max: 7220 x̄: 247.07 x̃: 32 HURT stats (rel) min: <.01% max: 31.30% x̄: 1.52% x̃: 0.53% 95% mean confidence interval for cycles value: -70.01 3.56 95% mean confidence interval for cycles %-change: -0.35% -0.05% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 10564 -> 10568 (0.04%) spills in affected programs: 143 -> 147 (2.80%) helped: 0 HURT: 1 total fills in shared programs: 11343 -> 11347 (0.04%) fills in affected programs: 287 -> 291 (1.39%) helped: 0 HURT: 1 LOST: 3 GAINED: 2 Skylake total instructions in shared programs: 18192274 -> 18190128 (-0.01%) instructions in affected programs: 1000188 -> 998042 (-0.21%) helped: 1149 HURT: 55 helped stats (abs) min: 1 max: 14 x̄: 1.92 x̃: 1 helped stats (rel) min: 0.04% max: 6.67% x̄: 0.67% x̃: 0.42% HURT stats (abs) min: 1 max: 2 x̄: 1.05 x̃: 1 HURT stats (rel) min: 0.16% max: 0.55% x̄: 0.27% x̃: 0.26% 95% mean confidence interval for instructions value: -1.87 -1.69 95% mean confidence interval for instructions %-change: -0.67% -0.58% Instructions are helped. total cycles in shared programs: 960856054 -> 960728040 (-0.01%) cycles in affected programs: 340840968 -> 340712954 (-0.04%) helped: 1079 HURT: 233 helped stats (abs) min: 1 max: 7640 x̄: 170.95 x̃: 46 helped stats (rel) min: <.01% max: 30.20% x̄: 0.96% x̃: 0.28% HURT stats (abs) min: 1 max: 6864 x̄: 242.23 x̃: 26 HURT stats (rel) min: <.01% max: 34.64% x̄: 2.10% x̃: 0.22% 95% mean confidence interval for cycles value: -135.62 -59.53 95% mean confidence interval for cycles %-change: -0.59% -0.25% Cycles are helped. LOST: 15 GAINED: 1 Broadwell total instructions in shared programs: 17855624 -> 17853580 (-0.01%) instructions in affected programs: 1012209 -> 1010165 (-0.20%) helped: 1105 HURT: 52 helped stats (abs) min: 1 max: 13 x̄: 1.90 x̃: 1 helped stats (rel) min: 0.03% max: 6.67% x̄: 0.67% x̃: 0.36% HURT stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.13% max: 0.52% x̄: 0.26% x̃: 0.25% 95% mean confidence interval for instructions value: -1.86 -1.67 95% mean confidence interval for instructions %-change: -0.68% -0.58% Instructions are helped. 
total cycles in shared programs: 1029905447 -> 1029840699 (<.01%) cycles in affected programs: 347102680 -> 347037932 (-0.02%) helped: 1007 HURT: 211 helped stats (abs) min: 1 max: 1360 x̄: 89.76 x̃: 48 helped stats (rel) min: <.01% max: 16.26% x̄: 0.69% x̃: 0.25% HURT stats (abs) min: 1 max: 1297 x̄: 121.51 x̃: 20 HURT stats (rel) min: <.01% max: 31.31% x̄: 1.21% x̃: 0.20% 95% mean confidence interval for cycles value: -62.39 -43.92 95% mean confidence interval for cycles %-change: -0.47% -0.25% Cycles are helped. total spills in shared programs: 20335 -> 20333 (<.01%) spills in affected programs: 19 -> 17 (-10.53%) helped: 2 HURT: 0 total fills in shared programs: 25905 -> 25899 (-0.02%) fills in affected programs: 23 -> 17 (-26.09%) helped: 2 HURT: 0 LOST: 9 GAINED: 0 Haswell total instructions in shared programs: 16418516 -> 16417293 (<.01%) instructions in affected programs: 223785 -> 222562 (-0.55%) helped: 590 HURT: 67 helped stats (abs) min: 1 max: 15 x̄: 2.19 x̃: 1 helped stats (rel) min: 0.03% max: 6.52% x̄: 0.87% x̃: 0.60% HURT stats (abs) min: 1 max: 2 x̄: 1.04 x̃: 1 HURT stats (rel) min: 0.04% max: 1.85% x̄: 0.44% x̃: 0.25% 95% mean confidence interval for instructions value: -2.01 -1.71 95% mean confidence interval for instructions %-change: -0.80% -0.67% Instructions are helped. total cycles in shared programs: 1037179754 -> 1037084874 (<.01%) cycles in affected programs: 352541071 -> 352446191 (-0.03%) helped: 1093 HURT: 182 helped stats (abs) min: 1 max: 888 x̄: 111.03 x̃: 64 helped stats (rel) min: <.01% max: 27.30% x̄: 0.84% x̃: 0.20% HURT stats (abs) min: 1 max: 6777 x̄: 145.49 x̃: 21 HURT stats (rel) min: <.01% max: 24.10% x̄: 1.99% x̃: 0.29% 95% mean confidence interval for cycles value: -88.10 -60.73 95% mean confidence interval for cycles %-change: -0.58% -0.29% Cycles are helped. 
total spills in shared programs: 17457 -> 17456 (<.01%) spills in affected programs: 12 -> 11 (-8.33%) helped: 1 HURT: 0 total fills in shared programs: 20387 -> 20385 (<.01%) fills in affected programs: 15 -> 13 (-13.33%) helped: 1 HURT: 0 LOST: 6 GAINED: 1 Ivy Bridge and earlier platforms had similar results. (Ivy Bridge shown) total instructions in shared programs: 15515482 -> 15513998 (<.01%) instructions in affected programs: 239739 -> 238255 (-0.62%) helped: 573 HURT: 57 helped stats (abs) min: 1 max: 20 x̄: 2.73 x̃: 2 helped stats (rel) min: 0.03% max: 9.84% x̄: 0.94% x̃: 0.55% HURT stats (abs) min: 1 max: 2 x̄: 1.39 x̃: 1 HURT stats (rel) min: 0.09% max: 1.85% x̄: 0.52% x̃: 0.35% 95% mean confidence interval for instructions value: -2.57 -2.14 95% mean confidence interval for instructions %-change: -0.89% -0.73% Instructions are helped. total cycles in shared programs: 584509880 -> 584463152 (<.01%) cycles in affected programs: 11765280 -> 11718552 (-0.40%) helped: 661 HURT: 152 helped stats (abs) min: 1 max: 3073 x̄: 101.99 x̃: 32 helped stats (rel) min: <.01% max: 34.38% x̄: 1.46% x̃: 0.50% HURT stats (abs) min: 1 max: 6637 x̄: 136.10 x̃: 15 HURT stats (rel) min: <.01% max: 24.19% x̄: 1.75% x̃: 0.25% 95% mean confidence interval for cycles value: -82.79 -32.16 95% mean confidence interval for cycles %-change: -1.11% -0.61% Cycles are helped. 
LOST: 9 GAINED: 0 Tiger Lake Instructions in all programs: 160905127 -> 160900949 (-0.0%) SENDs in all programs: 6812418 -> 6812085 (-0.0%) Loops in all programs: 38225 -> 38225 (+0.0%) Cycles in all programs: 7431911114 -> 7433914697 (+0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304539 -> 304537 (-0.0%) Ice Lake Instructions in all programs: 145296733 -> 145292370 (-0.0%) SENDs in all programs: 6863818 -> 6863485 (-0.0%) Loops in all programs: 38219 -> 38219 (+0.0%) Cycles in all programs: 8798257570 -> 8800204360 (+0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334250 -> 334248 (-0.0%) Skylake Instructions in all programs: 135891485 -> 135887357 (-0.0%) SENDs in all programs: 6803031 -> 6802698 (-0.0%) Loops in all programs: 38216 -> 38216 (+0.0%) Cycles in all programs: 8442221881 -> 8444201959 (+0.0%) Spills in all programs: 194839 -> 194839 (+0.0%) Fills in all programs: 301116 -> 301114 (-0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	49177b9e2f	nir/algebraic: Tautology replacements require sources be numbers It seems worth the small amount of damage to give an extra cushion of not having to debug problems later. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21043197 -> 21043359 (<.01%) instructions in affected programs: 4409 -> 4571 (3.67%) helped: 0 HURT: 25 HURT stats (abs) min: 1 max: 16 x̄: 6.48 x̃: 5 HURT stats (rel) min: 0.39% max: 15.38% x̄: 4.59% x̃: 4.40% 95% mean confidence interval for instructions value: 4.37 8.59 95% mean confidence interval for instructions %-change: 2.93% 6.26% Instructions are HURT. total cycles in shared programs: 856175986 -> 856176921 (<.01%) cycles in affected programs: 58908 -> 59843 (1.59%) helped: 0 HURT: 25 HURT stats (abs) min: 7 max: 70 x̄: 37.40 x̃: 38 HURT stats (rel) min: 0.27% max: 5.63% x̄: 1.87% x̃: 1.39% 95% mean confidence interval for cycles value: 31.11 43.69 95% mean confidence interval for cycles %-change: 1.35% 2.39% Cycles are HURT. No fossil-db changes on any Intel platform. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Ian Romanick	d69ba58644	nir/algebraic: Remove some optimizations of comparisons with fsat When most of these patterns were created, we believed, incorrectly, that fsat(NaN) was NaN. We have since realized that fsat(NaN) is zero. Originally, this changed the patterns to use is_a_number. This didn't help any shaders, so it's easier to just drop the optimizations. This commit crossed paths with `4c3ad4d065` ("nir/algebraic: mark more optimization with fsat(NaN) as inexact") and `bc123c396a` ("nir/algebraic: mark some optimizations with fsat(NaN) as inexact"). Given that these don't impact very many shaders, it seems safer to just remove them. As discussed in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8716, I tried modifying these patterns to use !(b cmp a). Unfortunately, on Intel GPUs, the results were much worse than just removing the patterns altogether. Some other related patterns will be addressed in later commits. There are still a number of patterns that use the identity fsat(1-X) == 1 - fsat(X). If X is NaN, the former is zero while the latter is 1.0. I haven't evaluated these patterns yet. If changes are needed in these patterns, it should be a separate commit anyway. v2: Replace arrow `=>` with `->` in comments because the `=>` looks a lot like `<=` comparison. Suggested by Rhys. Fixes: `92b75c126b` ("nir/algebraic: Replace checks that a value is between (or not) [0, 1]") Fixes: `a7f0c57673` ("nir/algebraic: Eliminate useless fsat() on operand of comparison w/value in (0, 1)") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel hardware had similar results.
(Ice Lake shown) total instructions in shared programs: 20029060 -> 20029670 (<.01%) instructions in affected programs: 69236 -> 69846 (0.88%) helped: 0 HURT: 263 HURT stats (abs) min: 1 max: 20 x̄: 2.32 x̃: 1 HURT stats (rel) min: 0.30% max: 11.11% x̄: 1.35% x̃: 0.98% 95% mean confidence interval for instructions value: 1.86 2.78 95% mean confidence interval for instructions %-change: 1.18% 1.52% Instructions are HURT. total cycles in shared programs: 979821278 -> 979834425 (<.01%) cycles in affected programs: 1476848 -> 1489995 (0.89%) helped: 49 HURT: 204 helped stats (abs) min: 1 max: 812 x̄: 102.31 x̃: 20 helped stats (rel) min: 0.01% max: 21.43% x̄: 2.23% x̃: 0.52% HURT stats (abs) min: 2 max: 2600 x̄: 89.02 x̃: 16 HURT stats (rel) min: 0.04% max: 27.27% x̄: 1.49% x̃: 0.72% 95% mean confidence interval for cycles value: 13.18 90.75 95% mean confidence interval for cycles %-change: 0.29% 1.25% Cycles are HURT. No fossil-db changes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10012>	2021-05-20 01:39:35 +00:00
Jason Ekstrand	b447f5049b	nir: Add a discard optimization pass Many fragment shaders do a discard using relatively little information but still put the discard fairly far down in the shader for no good reason. If the discard is moved higher up, we can possibly avoid doing some or almost all of the work in the shader. When this lets us skip texturing operations, it's an especially high win. One of the biggest offenders here is DXVK. The D3D APIs have different rules for discards than OpenGL and Vulkan. One effective way (which is what DXVK uses) to implement DX behavior on top of GL or Vulkan is to wait until the very end of the shader to discard. This ends up in the pessimal case where we always do all of the work before discarding. This pass helps some DXVK shaders significantly. v2 (Jason Ekstrand): - Fix a couple of typos (Grazvydas, Ian) - Use the new nir_instr_move helper - Find all movable discards before moving anything so we don't accidentally re-order anything and break dependencies v3 (Pierre-Eric): remove the call to nir_opt_conditional_discard based on Daniel Schürmann comment. v4 (Pierre-Eric): - handle demote intrinsics and drop derivatives_safe_after_discard - add early return if discards/demotes aren't used v5 (Pierre-Eric): - use pass_flags instead of instr set (Daniel Schürmann) v6 (Daniel Schürmann): - cleanup and fix pass_flags handling Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10522>	2021-05-19 18:04:44 +00:00
Jason Ekstrand	3033410b10	nir/gather_info: Expose a nir_intrinsic_writes_external_memory helper Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10522>	2021-05-19 18:04:44 +00:00
Jason Ekstrand	f97fb1fa55	nir: Add a nir_instr_move helper Removes an instruction from one place and inserts it at another while working around a weird cursor corner-case. v2: change return value to bool (Daniel Schürmann) Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> (v1) Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10522>	2021-05-19 18:04:44 +00:00
Bas Nieuwenhuizen	2d6a6469b8	nir: Add bvh64_intersect_ray_amd intrinsic. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10818>	2021-05-18 23:01:47 +02:00
Bas Nieuwenhuizen	aa82f91c38	nir: Add load_sbt_amd intrinsic. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9767>	2021-05-18 18:29:36 +00:00
Samuel Pitoiset	1b1c726ca9	nir/opt_access: fix getting variables in presence of similar bindings/desc It's perfectly legal to declare multiple SSBOs that point to the same binding/descriptor_set with different access mask. Currently, it will always get the first one in the list that matches binding/desc_set regardless of the access mask, but other variables might have different access mask. Fix this by being conservative if another variable uses the same binding/desc_set because we can't get it reliably without adding a new field to vulkan_resource_index. This fixes rendering issues in Resident Evil Village with vkd3d-proton. This bug has been uncovered by ("spirv: Don't remove variables used by resource indexing intrinsics") because variables are no longer removed. No fossil-db changes. Cc: 21.1 mesa-stable Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10692>	2021-05-18 06:25:24 +00:00
Connor Abbott	a40714abf7	nir/lower_phis_to_scalar: Add "lower_all" option We don't want to have to deal with vector phis in freedreno, because vectors are always split/unsplit around vectorized instructions anyways, and the stated reason for not scalarising them (it hurting coalescing) won't apply to us because we won't be using nir_from_ssa. Add this option so that we don't have to do the equivalent thing while translating from NIR. Reviewed-by: Rob Clark <robdclark@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10809>	2021-05-17 09:59:45 +00:00
Mike Blumenkrantz	6df187df13	nir/builder: add nir_pad_vector and nir_pad_vec4 util functions these pad a given value to vec4 or arbitrary number of components Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10630>	2021-05-16 14:15:14 +00:00
Gert Wollny	4c045ad11e	nir/linker: add option to ignore the IO precisions for better varying packing Backends that don't handle IO component precision can pack more varyings into one slot if the linker ignores the precision. If the IO is vectorized then this can save IO instructions. Related: `165a69d2f7` nir: handle mediump varyings in varying compaction helpers Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10722>	2021-05-15 09:58:27 +02:00
Caio Marcelo de Oliveira Filho	09984fd02f	nir: Rename nir_is_per_vertex_io to nir_is_arrayed_io VS outputs are "per vertex" but not the kind of I/O we want to match with this helper. Change to a name that covers the "arrayness" required by the type. Name inspired by the GLSL spec definition of arrayed I/O. Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10493>	2021-05-14 16:17:45 +00:00
Gert Wollny	e418710f8b	compiler/nir: check whether var is an input in lower_fragcoord_wtrans Otherwise the lowering pass might try to lower any other load from a deref if its data.location value happens to be zero. Fixes: `418c4c0d7d` compiler/nir: extend lower_fragcoord_wtrans to support VARYING_SLOT_POS Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10577>	2021-05-14 13:26:13 +00:00
Timothy Arceri	5aabc91273	glsl: add missing support for explicit components in interface blocks From the ARB_enhanced_layouts spec: "As with input layout qualifiers, all shaders except compute shaders allow location layout qualifiers on output variable declarations, output block declarations, and output block member declarations. Of these, variables and block members (but not blocks) additionally allow the component layout qualifier." We previously had compile tests in piglit to make sure this was not a compile error but no execution tests. Fixes: `d99a040bbf` ("i965: enable ARB_enhanced_layouts for gen8+") Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10763>	2021-05-13 08:07:53 +00:00
Timothy Arceri	1a71d6aa6e	glsl: create validate_component_layout_for_type() helper This will be used in the following patch. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10763>	2021-05-13 08:07:53 +00:00
Timur Kristóf	0d6b6c850f	nir: Add AMD specific intrinsics for merged shaders and NGG. These intrinsics represent what the hardware can actually do. Lowering our shaders to use these intrinsics will allow us to deal with mapping the classic VS, TES, GS (and the future MS) stages to the hardware capabilities using NIR, which makes our backend compilers simpler. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	641707a807	nir: Allow load_primitive_id in VS in nir_divergence_analysis. The lowered NIR code of NGG VS shaders uses this intrinsic when the VS has to export the primitive ID. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	e905e0938a	nir: Support upper bound of unsigned bit size conversions. These allow us to generate slightly better code in some cases, eg. multiplications in ACO. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Timur Kristóf	9a2ffe1abb	nir: Support upper bound of subgroup_id/num_subgroups for non-compute. These intrinsics will be used when lowering NGG shaders, including currently supported stages like VS, TES, GS and also by mesh shaders in the future. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10740>	2021-05-12 13:47:04 +00:00
Marcin Ślusarz	2c3e2d69bd	nir: handle float atomics in nir_lower_memory_model Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Fixes: `2adb337256` ("nir,radv/aco: add and use pass to lower make available/visible barriers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10766>	2021-05-12 11:09:07 +00:00
Marcin Ślusarz	27073b59bc	nir: handle float atomics in nir_gather_info Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10766>	2021-05-12 11:09:07 +00:00
Tapani Pälli	181beece3c	nir: skip assert check with empty structs Fixes issues with upcoming CTS test testing empty structs. v2: decorate with UNUSED as only used in assert (Timothy) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10681>	2021-05-10 08:07:29 +03:00
Mauro Rossi	2736ae0454	android: nir: add nir_lower_fragcolor.c to Makefile.sources Fixes the following building error: FAILED: out/target/product/x86_64/obj/SHARED_LIBRARIES/gallium_dri_intermediates/LINKED/gallium_dri.so ... ld.lld: error: undefined symbol: nir_lower_fragcolor >>> referenced by pan_assemble.c:81 (external/mesa/src/gallium/drivers/panfrost/pan_assemble.c:81) Cc: 21.0 21.1 <mesa-stable@lists.freedesktop.org> Fixes: `1fd3563025` ("nir: add lowering pass for fragcolor -> fragdata") Acked-by: Rob Clark <robclark@freedesktop.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10712>	2021-05-09 00:34:46 +02:00
Alyssa Rosenzweig	db2f6b87a3	nir/divergence_anlysis: Add intrinsics for Bifrost Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10022>	2021-05-07 18:20:30 +00:00
Alyssa Rosenzweig	f3de2bd6c2	nir: Add blend lowering pass This pass was originally developed for Panfrost, where it passes the relevant dEQP tests. Upstreaming so it can be extended and then shared with: * Asahi, for blending * Zink, for logic ops * Lavapipe, for advanced blending Note that using this with MRT in a fragment shader (as non-panfrost drivers will) has not yet been tested. Logic ops with integer framebuffers are probably todo. It's been enough for Panfrost, will suffice for ES2 on Asahi, and provides an upstream base for kusma's work on advanced blending, so overall the merge is a net benefit. v2: Remove bogus assert that the format layout is PLAIN. We need to render R11G11B10, which Mesa reports as layout OTHER. The code is still correct. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> [v1] Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10601>	2021-05-07 17:25:21 +00:00
Gert Wollny	b4600d9352	nir: Add filter callback for lower_to_scalar to the options Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9943>	2021-05-07 12:09:03 +00:00
Mike Blumenkrantz	37545418cd	nir: add nir_isub_imm Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10654>	2021-05-06 13:01:03 +00:00
Jesse Natalie	d8bac1002c	vtn: Use relaxed 24bit opcodes for CL 24bit math Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10549>	2021-05-05 22:06:42 +00:00
Jesse Natalie	d7ca0319d7	nir: Add relaxed 24bit opcodes These are equivalent to the 32bit opcodes if there are no more efficient 24bit opcodes available, but inputs are guaranteed to already be 24bit, so the 24bit opcodes can be used instead if they exist and are efficient. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10549>	2021-05-05 22:06:42 +00:00
Jason Ekstrand	e1edf74dde	nir/builder: Move clamp helpers to nir_builder.h Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10631>	2021-05-04 22:51:34 +00:00
Caio Marcelo de Oliveira Filho	dd48683cfd	nir: Move shared_memory_explicit_layout bit into common shader_info Move it out of the "cs" sub-struct, since the bit can be used for other shader stages in the future. This also removes a subtle issue in spirv_to_nir: info.cs.shared_memory_explicit_layout was used without checking for the CS shader stage. It ended up being "harmless" since the effects also depended on presence of shared variables. Fixes: `5de6c5973a` ("spirv: Implement SPV_KHR_workgroup_memory_explicit_layout") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10529>	2021-05-04 20:54:58 +00:00
Iago Toral Quiroga	aebb47b7d1	compiler/nir: add a divergence analysis option for non-uniform workgroup id The V3D hardware allows us to pack multiple workgroups together to avoid wasting execution lanes in shader cores. For example, if we dispatch 16 workgroups with a local size of 1 element, we can pack all 16 workgroups in a single 16-wide dispatch where each lane executes a different workgroup, instead of 16 1-wide dispatches. When we do this, we don't have a uniform workgroup id any more. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10541>	2021-05-04 15:53:23 +00:00
Caio Marcelo de Oliveira Filho	7cc846788c	nir: Remove now unnecessary conditions from emit_load/store helpers The mode one was used before `0bc5a829dd` ("nir: Remove shared support from lower_io"). The others were used before `5f7c7c9a7f` ("nir: add src and dest types to all IO loads and stores for mediump"). All conditions now are always true, so drop them. Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10533>	2021-05-04 06:33:24 -07:00
Gert Wollny	a199697642	nir/opt_algebraic: optimizations for add umax/umin with zero For unsigned comparisons with zero these ops can be eliminated. v2: Add comparison optimizations with -1 (Rhys Perry) Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Eric Anholt <eric@anholt.net> (v1) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10583>	2021-05-04 09:33:32 +02:00
Alyssa Rosenzweig	a976101da5	nir/opcodes: Reword confusing comment Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10578>	2021-05-03 12:51:47 +00:00
Alyssa Rosenzweig	0ea67e57e5	nir: Add fsin_agx opcode Used to split up the fsin/fcos lowering for AGX between NIR and the backend, to permit algebraic optimizations without polluting NIR with too many hardware details. The backend NIR lowering produces an fmul/ffma of the input so we can optimize code like sin(2*x). Signed-off-by: Alyssa Rosenzweig <alyssa@rosenzweig.io> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10582>	2021-05-02 17:41:09 -04:00
Caio Marcelo de Oliveira Filho	e763db4a47	spirv: Don't replicate patch bool in vtn_variable When we originally added patch variable handling to spirv_to_nir, we were splitting I/O block variables in spirv_to_nir, so we weren't guaranteed to have a nir_variable early enough in processing. Since `b0c643d8f5` ("spirv: Use NIR per-member splitting"), we've been using NIR per-member splitting where we have a nir_variable which has a separate nir_variable_data per member. With this, we can drop vtn_variable::patch and use the patch boolean on the nir_variable instead. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10469>	2021-04-29 06:55:29 +00:00
Rhys Perry	7a7838529a	nir/lower_non_uniform: allow lowering with vec2 handles Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9523>	2021-04-27 15:56:07 +00:00
Tapani Pälli	d93153a564	glsl: ignore interface precision qualifier on desktop GL This fixes linking failures with new GL45 linkage tests, no regressions spotted on existing tests. v2: add spec reference (Samuel) Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10373>	2021-04-27 08:25:41 +00:00
Mike Blumenkrantz	c8dfed0c12	nir/gl_lower_buffers: set access for ssbo load/store instrs this is the last place where the information is available, so set the info before it gets lost Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10074>	2021-04-26 21:31:44 +00:00
Connor Abbott	77fcb01f7f	nir/lower_clip_disable: Fix store writemask We're storing into the array element, not the whole variable. Fixes: `fb2fe80` ("nir: add lowering pass for clip plane enabling") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7274>	2021-04-26 17:07:02 +00:00
Jesse Natalie	2775b9139b	nir_lower_readonly_images_to_tex: Use nir_shader_lower_instructions Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>	2021-04-23 23:16:15 +00:00
Jesse Natalie	fa677c8644	nir_lower_readonly_images_to_tex: Support non-CL semantics For non-CL, intrinsic access isn't set, because the image type doesn't have access qualifier. Instead, the access qualifier is set on the variable. So, add a mode to this pass which can chase back to the variable in addition to the intrinsic access. Also, update the variable type and the deref chain types so everything is consistent, that the tex is accessing a sampler. Note we can't do this for CL, because void-typed samplers don't exist. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>	2021-04-23 23:16:15 +00:00
Jesse Natalie	29c9731400	nir: Rename nir_lower_cl_images_to_tex, replace 'cl' with 'readonly' Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>	2021-04-23 23:16:15 +00:00
Jesse Natalie	1c41f63e26	vtn: Propagate access data from UBO/SSBO/push constant types to variables of that type, not just their pointers Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>	2021-04-23 23:16:15 +00:00
Jesse Natalie	9936463ef6	vtn: Propagate access data that's present on all struct members to the struct itself Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10356>	2021-04-23 23:16:15 +00:00
Alyssa Rosenzweig	c84804f167	nir/lower_fragcolor: Take max cbufs as argument One step closer to generalizing this pass to more drivers. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10411>	2021-04-23 17:20:43 +00:00
Alyssa Rosenzweig	73eb497b86	nir/lower_fragcolor: Fix driver_location assignment Fixes crash in dEQP-GLES31.functional.shaders.framebuffer_fetch.basic.last_frag_data when using this pass. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10411>	2021-04-23 17:20:43 +00:00
Alyssa Rosenzweig	0f4ba349e9	nir/lower_fragcolor: Handle fp16 outputs Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10391>	2021-04-21 22:17:28 +00:00
Alyssa Rosenzweig	49c6157b15	nir/lower_fragcolor: Use shader_instructions_pass While I was in the area. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10391>	2021-04-21 22:17:28 +00:00
Lionel Landwerlin	0bb29c07a4	spirv: fixup pointer_to/from_ssa with acceleration structures Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `ed907e5d84` ("spirv: Add support for OpTypeAccelerationStructureKHR") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10357>	2021-04-21 21:51:51 +00:00
Rhys Perry	89b759c4f9	nir/opt_load_store_vectorize: loop internally To vectorize to vec8/16 or vec4 (without vec3), we can't incrementally add components to a load/store. This patch loops vectorization so that two new vec2/4/8 operations can be combined into a larger operation. fossil-db (GFX10.3): Totals from 22 (0.02% of 139391) affected shaders: SpillVGPRs: 1749 -> 1771 (+1.26%) CodeSize: 901212 -> 892532 (-0.96%); split: -1.19%, +0.22% Scratch: 178176 -> 184320 (+3.45%) Instrs: 159358 -> 158027 (-0.84%); split: -0.99%, +0.16% Cycles: 37046772 -> 36738544 (-0.83%); split: -1.00%, +0.17% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>	2021-04-21 20:26:58 +00:00
Rhys Perry	447820d003	nir/opt_load_store_vectorize: ignore load_vulkan_descriptor These mess with alignment calculation. No fossil-db changes. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>	2021-04-21 20:26:58 +00:00
Rhys Perry	6ca11b4a66	nir/opt_load_store_vectorize: improve handling of swizzles Previously (for simplicity), it could have skipped vectorization if swizzles were involved. fossil-db (GFX10.3): Totals from 498 (0.36% of 139391) affected shaders: SGPRs: 25328 -> 26608 (+5.05%); split: -1.36%, +6.41% VGPRs: 9988 -> 9996 (+0.08%) SpillSGPRs: 40 -> 65 (+62.50%) CodeSize: 1410188 -> 1385584 (-1.74%); split: -1.76%, +0.02% Instrs: 257149 -> 250579 (-2.55%); split: -2.57%, +0.01% Cycles: 1096892 -> 1070600 (-2.40%); split: -2.41%, +0.01% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>	2021-04-21 20:26:58 +00:00
Rhys Perry	4df3654c79	nir/load_store_vectorize: assume CAN_REORDER ops don't alias with stores fossil-db (GFX10.3): Totals from 20 (0.01% of 139391) affected shaders: SGPRs: 688 -> 712 (+3.49%); split: -1.16%, +4.65% CodeSize: 35488 -> 34424 (-3.00%); split: -3.04%, +0.05% Instrs: 6405 -> 6259 (-2.28%); split: -2.44%, +0.16% Cycles: 51768 -> 51268 (-0.97%); split: -1.21%, +0.24% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10384>	2021-04-21 20:26:58 +00:00
Mike Blumenkrantz	3ccd0891d3	nir/lower_fragcolor: set outputs_written for fragdata members normal gather_info stuff Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10080>	2021-04-21 19:36:16 +00:00
Matt Turner	7f8c5844ef	compiler/glsl: Always propagate_invariance() last Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10292>	2021-04-20 18:54:57 +00:00
Matt Turner	d35f8604c7	compiler/glsl: Propagate invariant/precise when splitting arrays This fixes the dEQP-GLES3.functional.shaders.invariance.{low,medium,high}p.loop_4 tests when run in a VM with virgl on a host with iris. virgl mangles the GLSL shaders and emits shader code for the host driver that contains vec4 arrays. As such, the test did not fail when running directly on the host. The test also did not fail if the host was using i965. Disabling PIPE_CAP_GLSL_OPTIMIZE_CONSERVATIVELY in iris was sufficient to work around it, so I believe that i965 didn't show the problem because after arrays were split by optimize_split_arrays(), even though the invariant/precise qualifiers were lost, do_common_optimization() would be called again and thus propagate_invariance() would propagate the qualifiers to the new variables produced by optimize_split_arrays(). Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10292>	2021-04-20 18:54:57 +00:00
Matt Turner	5ef4296cb6	compiler/glsl: Return progress from propagate_invariance() Doing so allows you to easily tell what the pass did using the existing infrastructure in the OPT macro. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10292>	2021-04-20 18:54:57 +00:00
Jesse Natalie	0e2566a8a7	shader_enums: Fix MSVC warning C4334 (32bit shift cast to 64bit) The warning is triggered when assigning into inputs_read, which is 64bit. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-By: Bill Kristiansen <billkris@microsoft.com> Cc: mesa-stable@lists.freedesktop.org Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10331>	2021-04-20 00:28:35 +00:00
Jesse Natalie	09440ce3fb	nir: Fix MSVC warning C4334 (32bit shift cast to 64bit) Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-By: Bill Kristiansen <billkris@microsoft.com> Cc: mesa-stable@lists.freedesktop.org Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10331>	2021-04-20 00:28:34 +00:00
Lionel Landwerlin	856953b131	spirv: fix uToAccelerationStructure handling Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `7f223a2329` ("spirv: Implement SpvOpConvertUToAccelerationStructureKHR") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10324>	2021-04-19 22:02:53 +00:00
Alyssa Rosenzweig	899dd8e60a	nir: Update some comments referring to imov This was renamed when I was in high school. I remember updating the Midgard compiler while sitting in AP Physics. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10296>	2021-04-19 20:07:35 +00:00
Danylo Piliaiev	f17b41ab4f	nir: add lowering pass for helperInvocationEXT() Some hardware doesn't have a way to check if invocation was demoted, in such case we have to track it ourselves. OpIsHelperInvocationEXT is specified as: "An invocation is currently a helper invocation if it was originally invoked as a helper invocation or if it has been demoted to a helper invocation by OpDemoteToHelperInvocationEXT." Therefore we: - Set gl_IsHelperInvocationEXT = gl_HelperInvocation - Add "gl_IsHelperInvocationEXT = true" right before each demote - Add "gl_IsHelperInvocationEXT = gl_IsHelperInvocationEXT || condition" right before each demote_if Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>	2021-04-19 17:11:36 +00:00
Erik Faye-Lund	7886983835	nir/lower_tex: do not stumble on 16-bit inputs If a has been lowered to float16 here, then we end up trying to construct a vector of mixed precision, which the validator asserts about. So let's make sure we use the same type for all arguments. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10201>	2021-04-19 14:28:05 +00:00

6226 Commits