KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Jason Ekstrand	4fa58d27a5	intel/fs,vec4: Drop support for shader time Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14056>	2021-12-10 21:20:47 +00:00
Vadym Shovkoplias	2dbb66997e	intel/fs: Fix a cmod prop bug when cmod is set to inst that doesn't support it Fixes dEQP-VK.reconvergence.nesting tests. There are cases when cmod is set to an instruction that cannot have a conditional modifier. E.g. the following: find_live_channel(32) vgrf166:UD, NoMask cmp.z.f0.0(32) null:D, vgrf166+0.0<0>:D, 0d is optimized to: find_live_channel.z.f0.0(32) vgrf166:UD, NoMask v2: - Add unit test to check cmod is not set to 'find_live_channel' (Matt Turner) - Update flag_subreg when conditional_mod is updated (Ian Romanick) Signed-off-by: Vadym Shovkoplias <vadym.shovkoplias@globallogic.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5431 Fixes: `32b7ba66b0` ("intel/compiler: fix cmod propagation optimisations") Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13268>	2021-11-01 21:08:12 +00:00
Ian Romanick	a8d0c0af86	intel/fs: Remove type-based restriction for cmod propagation to saturated operations Previously, we misunderstood how conditional modifiers and saturate interacted. We thought the condition was evaluated before the saturate was applied. For the floating point cases, we went to some heroics to modify the condition to maintain the same results. For integer cases, it was not clear that this could even work. We had no use-cases and no test cases, so we just disallowed everything. Now we understand that the condition is evaluated after the saturate. Earlier commits in this series removed the various floating point heroics. It is easier to just delete the code that prevents some cases that just work. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	5ad88fd499	intel/fs: Remove after parameter from test_saturate_prop Originally this was part of "intel/fs: Remove condition-based restriction for cmod propagation to saturated operations". With some additional changes to that commit, it caused a lot of extra churn in the unit tests. I felt that made it harder to see the actual changes in the unit tests, so I split it out. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	e6373923a7	intel/fs: Remove condition-based restriction for cmod propagation to saturated operations I don't know why the float_saturate_l_mov test was #if'ed out, but it passes... so this commit enables it. No shader-db or fossil-db changes. In a previous iteration of this MR, this commit helped ~200 shaders in shader-db. Now all of those same shaders are helped by "intel/fs: cmod propagate from MOV with any condition". All of these shaders come from Mad Max. After initial translation from NIR to assembly, these shaders contain patterns like: mul(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */ ... mov.sat(8) g90<1>F g90<8,8,1>F ... cmp.nz.f0(8) null<1>F g90<8,8,1>F 0 /* 0F */ An initial pass of cmod propagation converts this to mul(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */ ... mov.sat.XX.f0(8) g90<1>F g90<8,8,1>F Without this commit, XX is G. With this commit, XX is NZ. Saturate propagation moves the saturate: mul.sat(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */ ... mov.XX.f0(8) g90<1>F g90<8,8,1>F Without this commit (but with "intel/fs: cmod propagate from MOV with any condition"), the G gets propagated: mul.sat.g.f0(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */ With this commit (with or without "intel/fs: cmod propagate from MOV with any condition"), the NZ gets propagated: mul.sat.nz.f0(8) g90<1>F g88<8,8,1>F 0x40400000F /* 3F */ Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	47f0cdc449	intel/fs: cmod propagate from MOV with any condition There were tests related to propagating conditional modifiers from a MOV to an instruction with a .SAT modifier for a very long time, but they were #if'ed out. There are restrictions later in the function that limit the kinds of MOV instructions that can propagate. This avoids the dangers of type-converting MOVs that may generate flags in different ways. v2: Update the added comment to look more like the existing comment. That makes the small differences between the two cases more obvious. Noticed by Marcin. All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 19827127 -> 19826924 (<.01%) instructions in affected programs: 62024 -> 61821 (-0.33%) helped: 201 HURT: 0 helped stats (abs) min: 1 max: 2 x̄: 1.01 x̃: 1 helped stats (rel) min: 0.13% max: 0.60% x̄: 0.35% x̃: 0.36% 95% mean confidence interval for instructions value: -1.02 -1.00 95% mean confidence interval for instructions %-change: -0.36% -0.34% Instructions are helped. total cycles in shared programs: 954655879 -> 954655356 (<.01%) cycles in affected programs: 1212877 -> 1212354 (-0.04%) helped: 155 HURT: 6 helped stats (abs) min: 1 max: 6 x̄: 3.65 x̃: 4 helped stats (rel) min: <.01% max: 0.17% x̄: 0.07% x̃: 0.07% HURT stats (abs) min: 2 max: 12 x̄: 7.00 x̃: 8 HURT stats (rel) min: 0.04% max: 0.23% x̄: 0.14% x̃: 0.15% 95% mean confidence interval for cycles value: -3.60 -2.90 95% mean confidence interval for cycles %-change: -0.07% -0.05% Cycles are helped. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	b23432c540	intel/fs: Fix a cmod prop bug when the source type of a mov doesn't match the dest type of scan_inst We were previously operating with the mindset "a MOV is just a compare with zero." As a result, we were trying to share as much code between the MOV path and the CMP path as possible. However, MOV instructions can perform type conversions that affect the result of the comparison. There was some code added to better handle this for cases like and(16) g31<1>UD g20<8,8,1>UD g22<8,8,1>UD mov.nz.f0(16) null<1>F g31<8,8,1>D The flaw in these changed special cases is that it allowed things like or(8) dest:D src0:D src1:D mov.nz(8) null:D dest:F Because both destinations were integer types, the propagation was allowed. The source type of the MOV and the destination type of the OR do not match, so type conversion rules have to be accounted for. My solution was to just split the MOV and non-MOV paths with completely separate checks. The "else" path in this commit is basically the old code with the BRW_OPCODE_MOV special case removed. The new MOV code further splits into "destination of scan_inst is float" and "destination of scan_inst is integer" paths. For each case I enumerate the rules that I believe apply. For the integer path, only the "Z or NZ" rules are listed as only NZ is currently allowed (hence the conditional_mod assertion in that path). A later commit relaxes this and adds the rule. The new rules slightly relax one of the previous rules. Previously the sizes of the MOV destination and the MOV source had to be the same. In some cases now the sizes can be different by the following conditions: - Floating point to integer conversions are not allowed. - If the conversion is integer to floating point, the size of the floating point value does not matter as it will not affect the comparison result. - If the conversion is float to float, the size of the destination must be greater than or equal to the size of the source. - If the conversion is integer to integer, the size of the destination must be greater than or equal to the size of the source. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	0797388dc2	intel/fs: Add many cmod propagation tests involving MOV instructions Of particular interest are the tests where the MOV performs a type conversion. If the restriction on conditional modifier for a MOV is ever relaxed, some of these cases must still be disallowed. v2: s/NZ/Z/ in one of the comments. Noticed by Marcin. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	3afefb0818	intel/fs: Refactor some cmod propagation tests This will simplify some later changes to these tests. v2: Combine test_positive_saturate_prop and test_negative_saturate_prop into a single function. Suggested by Marcin. Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12045>	2021-08-30 14:00:14 -07:00
Ian Romanick	38807ceeae	intel/fs: sel.cond writes the flags on Gfx4 and Gfx5 On Gfx4 and Gfx5, sel.l (for min) and sel.ge (for max) are implemented using separate cmpn and sel instructions. This lowering occurs in fs_visitor::lower_minmax which is called very, very late... a long, long time after the first calls to opt_cmod_propagation. As a result, conditional modifiers can be incorrectly propagated across sel.cond on those platforms. No tests were affected by this change, and I find that quite shocking. After just changing flags_written(), all of the atan tests started failing on ILK. That required the change in cmod_propagation (and the addition of the prop_across_into_sel_gfx5 unit test). Shader-db results for ILK and GM45 are below. I looked at a couple before and after shaders... and every case that I looked at had experienced incorrect cmod propagation. This affected a LOT of apps! Euro Truck Simulator 2, The Talos Principle, Serious Sam 3, Sanctum 2, Gang Beasts, and on and on... :( I discovered this bug while working on a couple new optimization passes. One of the passes attempts to remove condition modifiers that are never used. The pass made no progress except on ILK and GM45. After investigating a couple of the affected shaders, I noticed that the code in those shaders looked wrong... investigation led to this cause. v2: Trivial changes in the unit tests. v3: Fix type in comment in unit tests. Noticed by Jason and Priit. v4: Tweak handling of BRW_OPCODE_SEL special case. Suggested by Jason. 
Fixes: `df1aec763e` ("i965/fs: Define methods to calculate the flag subset read or written by an fs_inst.") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Dave Airlie <airlied@redhat.com> Iron Lake total instructions in shared programs: 8180493 -> 8181781 (0.02%) instructions in affected programs: 541796 -> 543084 (0.24%) helped: 28 HURT: 1158 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.35% max: 0.86% x̄: 0.53% x̃: 0.50% HURT stats (abs) min: 1 max: 3 x̄: 1.14 x̃: 1 HURT stats (rel) min: 0.12% max: 4.00% x̄: 0.37% x̃: 0.23% 95% mean confidence interval for instructions value: 1.06 1.11 95% mean confidence interval for instructions %-change: 0.31% 0.38% Instructions are HURT. total cycles in shared programs: 239420470 -> 239421690 (<.01%) cycles in affected programs: 2925992 -> 2927212 (0.04%) helped: 49 HURT: 157 helped stats (abs) min: 2 max: 284 x̄: 62.69 x̃: 70 helped stats (rel) min: 0.04% max: 6.20% x̄: 1.68% x̃: 1.96% HURT stats (abs) min: 2 max: 48 x̄: 27.34 x̃: 24 HURT stats (rel) min: 0.02% max: 2.91% x̄: 0.31% x̃: 0.20% 95% mean confidence interval for cycles value: -0.80 12.64 95% mean confidence interval for cycles %-change: -0.31% <.01% Inconclusive result (value mean confidence interval includes 0). GM45 total instructions in shared programs: 4985517 -> 4986207 (0.01%) instructions in affected programs: 306935 -> 307625 (0.22%) helped: 14 HURT: 625 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.35% max: 0.82% x̄: 0.52% x̃: 0.49% HURT stats (abs) min: 1 max: 3 x̄: 1.13 x̃: 1 HURT stats (rel) min: 0.12% max: 3.90% x̄: 0.34% x̃: 0.22% 95% mean confidence interval for instructions value: 1.04 1.12 95% mean confidence interval for instructions %-change: 0.29% 0.36% Instructions are HURT. 
total cycles in shared programs: 153827268 -> 153828052 (<.01%) cycles in affected programs: 1669290 -> 1670074 (0.05%) helped: 24 HURT: 84 helped stats (abs) min: 2 max: 232 x̄: 64.33 x̃: 67 helped stats (rel) min: 0.04% max: 4.62% x̄: 1.60% x̃: 1.94% HURT stats (abs) min: 2 max: 48 x̄: 27.71 x̃: 24 HURT stats (rel) min: 0.02% max: 2.66% x̄: 0.34% x̃: 0.14% 95% mean confidence interval for cycles value: -1.94 16.46 95% mean confidence interval for cycles %-change: -0.29% 0.11% Inconclusive result (value mean confidence interval includes 0). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12191>	2021-08-11 13:09:20 -07:00
Anuj Phogat	61e8636557	intel: Rename gen_device prefix to intel_device export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "gen_device" -rIl $SEARCH_PATH | xargs sed -ie "s/gen_device/intel_device/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10241>	2021-04-20 20:06:33 +00:00
Anuj Phogat	abe9a71a09	intel: Rename gen field in gen_device_info struct to ver Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "info\)*(.|->)gen" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)gen/info\1\2ver/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Anuj Phogat	99331f6deb	intel: Rename genx10 field in gen_device_info struct to verx10 Commands used to do the changes: export SEARCH_PATH="src/intel src/gallium/drivers/iris src/mesa/drivers/dri/i965" grep -E "info\)*(.|->)genx10" -rIl $SEARCH_PATH | xargs sed -ie "s/info\()*\)\(\.\|->\)genx10/info\1\2verx10/g" Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9936>	2021-04-02 18:33:07 +00:00
Caio Marcelo de Oliveira Filho	7fb1e58651	intel/compiler: Make visitors take debug_enabled as a parameter The callers already have this value, and we would like to make it follow different rules other than stage that might not be visible to the helper function, so just pass explicitly. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9779>	2021-03-24 23:18:46 +00:00
Jordan Justen	18bc7d9d3f	intel: Use devinfo genx10 field Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9329>	2021-03-01 22:00:08 -08:00
Rohan Garg	56bbbc8322	intel/compiler: Free resources on test teardown Ensure that all resources are properly released by properly parenting them to a memory context and releasing the context during test teardown. Signed-off-by: Rohan Garg <rohan.garg@collabora.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8162>	2021-02-16 15:07:52 +01:00
Yevhenii Kolesnikov	36abb0c691	intel/compiler: don't propagate cmp to add if add is saturated From the Kaby Lake PRM Vol. 7 "Assigning Conditional Flags": * Note that the [post condition signal] bits generated at the output of a compute are before the .sat. Paragraph about post_zero does not mention saturation, but testing it on actual GPUs shows that conditional modifiers are applied after saturation. * post_zero bit: This bit reflects whether the final result is zero after all the clamping, normalizing, or format conversion logic. For signed types we don't care about saturation: it won't change the result of the conditional modifier. For floating and unsigned types there are two special cases, when we can remove inst even if scan_inst is saturated: G and LE. Since conditional modifiers are just comparisons against zero, saturating positive values to the upper limit never changes the result of the comparison. For negative values: (sat(x) > 0) == (x > 0) --- false (sat(x) <= 0) == (x <= 0) --- true Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2610 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4167>	2020-07-11 00:25:48 +00:00
Yevhenii Kolesnikov	32b7ba66b0	intel/compiler: fix cmod propagation optimisations Knowing the following: - CMP writes to flag register the result of applying cmod to the `src0 - src1`. After that it stores the same value to dst. Other instructions first store their result to dst, and then store cmod(dst) to the flag register. - inst is either CMP or MOV - inst->dst is null - inst->src[0] overlaps with scan_inst->dst - inst->src[1] is zero - scan_inst wrote to a flag register There can be three possible paths: - scan_inst is CMP: Considering that src0 is either 0x0 (false), or 0xffffffff (true), and src1 is 0x0: - If inst's cmod is NZ, we can always remove scan_inst: NZ is invariant for false and true. This holds even if src0 is NaN: .nz is the only cmod that returns true for NaN. - .g is invariant if src0 has a UD type - .l is invariant if src0 has a D type - scan_inst and inst have the same cmod: If scan_inst is anything other than CMP, it already wrote the appropriate value to the flag register. - else: We can change cmod of scan_inst to that of inst, and remove inst. It is valid as long as we make sure that no instruction uses the flag register between scan_inst and inst. Nine new cmod_propagation unit tests: - cmp_cmpnz - cmp_cmpg - plnnz_cmpnz - plnnz_cmpz (*) - plnnz_sel_cmpz - cmp_cmpg_D - cmp_cmpg_UD (*) - cmp_cmpl_D (*) - cmp_cmpl_UD (*) this would fail without changes to brw_fs_cmod_propagation. This fixes an optimisation that used to be illegal (see issue #2154) = Before = 0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F 1: cmp.nz.f0.0(8) null:F, vgrf0:F, 0f = After = 0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F Now it is optimised as such (note change of cmod in line 0): = Before = 0: linterp.z.f0.0(8) vgrf0:F, g2:F, attr0<0>:F 1: cmp.nz.f0.0(8) null:F, vgrf0:F, 0f = After = 0: linterp.nz.f0.0(8) vgrf0:F, g2:F, attr0<0>:F No shaderdb changes Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2154 Signed-off-by: Yevhenii Kolesnikov <yevhenii.kolesnikov@globallogic.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3348> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3348>	2020-03-11 21:21:25 +00:00
Matt Turner	e7d0460d58	intel/compiler: Pass backend_shader * to cfg_t() As you can see, not having a pointer to the backend_shader from within the class makes for some weird looking code. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4093>	2020-03-09 04:44:12 +00:00
Jason Ekstrand	f58e0405b6	intel/fs: Drop the gl_program from fs_visitor It's not used by anything anymore now that so much lowering has been moved into NIR. Sadly, we still need one in brw_compile_gs() for geometry shaders on Sandy Bridge. Short of a lot of pointless work, that one's probably not going away. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-08-25 01:02:52 -05:00
Ian Romanick	e13a5c7d67	intel/fs: Allow cmod propagation across reads and writes of different flags This also helps a later patch (intel/fs: Improve discard_if code generation) on about 200 shaders. v2: Document that other instruction sequences are also valid in subtract_merge_with_compare_intervening_mismatch_flag_write. Suggested by Caio. All Intel platforms had similar results. (Ice Lake shown) total instructions in shared programs: 17224438 -> 17224434 (<.01%) instructions in affected programs: 296 -> 292 (-1.35%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 1 x̄: 1.00 x̃: 1 helped stats (rel) min: 0.99% max: 1.92% x̄: 1.43% x̃: 1.40% 95% mean confidence interval for instructions value: -1.00 -1.00 95% mean confidence interval for instructions %-change: -2.04% -0.81% Instructions are helped. total cycles in shared programs: 361468455 -> 361468458 (<.01%) cycles in affected programs: 2862 -> 2865 (0.10%) helped: 2 HURT: 2 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.24% max: 0.39% x̄: 0.31% x̃: 0.31% HURT stats (abs) min: 3 max: 4 x̄: 3.50 x̃: 3 HURT stats (rel) min: 0.32% max: 0.70% x̄: 0.51% x̃: 0.51% 95% mean confidence interval for cycles value: -4.34 5.84 95% mean confidence interval for cycles %-change: -0.70% 0.90% Inconclusive result (value mean confidence interval includes 0). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-05 17:03:45 -07:00
Ian Romanick	8030cb75c1	intel/fs: Fix flag_subreg handling in cmod propagation There were two errors. First, the pass could propagate conditional modifiers from an instruction that writes one flag register to an instruction that writes a different flag register. For example, cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F cmp.nz.f0.1(16) null:F, vgrf6:F, vgrf5:F could become cmp.nz.f0.0(16) null:F, vgrf6:F, vgrf5:F Second, if an instruction that writes f0.1 has its condition propagated, the modified instruction will incorrectly write flag f0.0. For example, linterp(16) vgrf6:F, g2:F, attr0:F cmp.z.f0.1(16) null:F, vgrf6:F, vgrf5:F (-f0.1) discard_jump(16) (null):UD could become linterp.z.f0.0(16) vgrf6:F, g2:F, attr0:F (-f0.1) discard_jump(16) (null):UD None of these cases will occur currently. The only time we use f0.1 is for generating discard intrinsics. In all those cases, we generate a sequence like: cmp.nz.f0.0(16) vgrf7:F, vgrf6:F, vgrf5:F (+f0.1) cmp.z(16) null:D, vgrf7:D, 0d (-f0.1) discard_jump(16) (null):UD Due to the mixed types and incompatible conditions, this sequence would never see any cmod propagation. The next patch will change this. No shader-db changes on any Intel platform. v2: Fix typo in comment in test case subtract_delete_compare_other_flag. Noticed by Caio. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-05 17:03:40 -07:00
Ian Romanick	2dd6013933	intel/fs: Add missing tests for cmod_propagate_not Tests like this should have been added in `4467040cb6` ("i965/fs: Propagate conditional modifiers from not instructions"). Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2019-06-05 17:03:31 -07:00
Ian Romanick	a79570099b	intel/fs: Allow cmod propagation to instructions with saturate modifier v2: Add unit tests. Suggested by Matt. All Intel GPUs had similar results. (Ice Lake shown) total instructions in shared programs: 17229441 -> 17228658 (<.01%) instructions in affected programs: 159574 -> 158791 (-0.49%) helped: 489 HURT: 0 helped stats (abs) min: 1 max: 5 x̄: 1.60 x̃: 1 helped stats (rel) min: 0.07% max: 2.70% x̄: 0.61% x̃: 0.59% 95% mean confidence interval for instructions value: -1.72 -1.48 95% mean confidence interval for instructions %-change: -0.64% -0.58% Instructions are helped. total cycles in shared programs: 360944149 -> 360937144 (<.01%) cycles in affected programs: 1072195 -> 1065190 (-0.65%) helped: 254 HURT: 27 helped stats (abs) min: 2 max: 234 x̄: 30.51 x̃: 9 helped stats (rel) min: 0.04% max: 8.99% x̄: 0.75% x̃: 0.24% HURT stats (abs) min: 2 max: 83 x̄: 27.56 x̃: 24 HURT stats (rel) min: 0.09% max: 3.79% x̄: 1.28% x̃: 1.16% 95% mean confidence interval for cycles value: -30.11 -19.75 95% mean confidence interval for cycles %-change: -0.70% -0.41% Cycles are helped. Reviewed-by: Matt Turner <mattst88@gmail.com> [v1]	2019-05-14 11:38:21 -07:00
Matt Turner	ac21dd4aee	intel/compiler/test: Add unit test for mismatched signedness comparison v2 (idr): Move adding the test to after adding the fix. Reordering the two commits prevents possible headaches for git-bisect with scripts that always do 'ninja check'. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109404 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-15 11:11:02 -08:00
Matt Turner	e50db60d16	intel/compiler/test: Set devinfo->gen = 7 We emit an FBL instruction which only exists since Gen7. This prevents the test from segfaulting when run with TEST_DEBUG=1. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-02-15 11:11:02 -08:00
Ian Romanick	020b0055e7	i965/fs: Propagate conditional modifiers from compares to adds The math inside the add and the cmp in this instruction sequence is the same. We can utilize this to eliminate the compare. add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; cmp.z.f0(8) null<1>F g2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; This is reduced to: add.z.f0(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; This optimization pass could do even better. The nature of converting vectorized code from the GLSL front end to scalar code in NIR results in sequences like: add(8) g7<1>F g4<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; add(8) g6<1>F g3<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; add(8) g5<1>F g2<8,8,1>F g64.5<0,1,0>F { align1 1Q compacted }; cmp.z.f0(8) null<1>F g2<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g8<1>F (abs)g5<8,8,1>F 3e-37F { align1 1Q }; cmp.z.f0(8) null<1>F g3<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g10<1>F (abs)g6<8,8,1>F 3e-37F { align1 1Q }; cmp.z.f0(8) null<1>F g4<8,8,1>F -g64.5<0,1,0>F { align1 1Q switch }; (-f0) sel(8) g12<1>F (abs)g7<8,8,1>F 3e-37F { align1 1Q }; In this sequence, only the first cmp.z is removed. With different scheduling, all 3 could get removed. Skylake total instructions in shared programs: 14407009 -> 14400173 (-0.05%) instructions in affected programs: 1307274 -> 1300438 (-0.52%) helped: 4880 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.40 x̃: 1 helped stats (rel) min: 0.03% max: 8.70% x̄: 0.70% x̃: 0.52% 95% mean confidence interval for instructions value: -1.45 -1.35 95% mean confidence interval for instructions %-change: -0.72% -0.69% Instructions are helped. 
total cycles in shared programs: 532943169 -> 532923528 (<.01%) cycles in affected programs: 14065798 -> 14046157 (-0.14%) helped: 2703 HURT: 339 helped stats (abs) min: 1 max: 1062 x̄: 12.27 x̃: 2 helped stats (rel) min: <.01% max: 28.72% x̄: 0.38% x̃: 0.21% HURT stats (abs) min: 1 max: 739 x̄: 39.86 x̃: 12 HURT stats (rel) min: 0.02% max: 27.69% x̄: 1.38% x̃: 0.41% 95% mean confidence interval for cycles value: -8.66 -4.26 95% mean confidence interval for cycles %-change: -0.24% -0.14% Cycles are helped. LOST: 0 GAINED: 1 Broadwell total instructions in shared programs: 14719636 -> 14712949 (-0.05%) instructions in affected programs: 1288188 -> 1281501 (-0.52%) helped: 4845 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.38 x̃: 1 helped stats (rel) min: 0.03% max: 8.00% x̄: 0.70% x̃: 0.52% 95% mean confidence interval for instructions value: -1.43 -1.33 95% mean confidence interval for instructions %-change: -0.72% -0.68% Instructions are helped. total cycles in shared programs: 559599253 -> 559581699 (<.01%) cycles in affected programs: 13315565 -> 13298011 (-0.13%) helped: 2600 HURT: 269 helped stats (abs) min: 1 max: 2128 x̄: 12.24 x̃: 2 helped stats (rel) min: <.01% max: 23.95% x̄: 0.41% x̃: 0.20% HURT stats (abs) min: 1 max: 790 x̄: 53.07 x̃: 20 HURT stats (rel) min: 0.02% max: 15.96% x̄: 1.55% x̃: 0.75% 95% mean confidence interval for cycles value: -8.47 -3.77 95% mean confidence interval for cycles %-change: -0.27% -0.18% Cycles are helped. LOST: 0 GAINED: 8 Haswell total instructions in shared programs: 12978609 -> 12973483 (-0.04%) instructions in affected programs: 932921 -> 927795 (-0.55%) helped: 3480 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.47 x̃: 1 helped stats (rel) min: 0.03% max: 7.84% x̄: 0.78% x̃: 0.58% 95% mean confidence interval for instructions value: -1.53 -1.42 95% mean confidence interval for instructions %-change: -0.80% -0.75% Instructions are helped. 
total cycles in shared programs: 410270788 -> 410250531 (<.01%) cycles in affected programs: 10986161 -> 10965904 (-0.18%) helped: 2087 HURT: 254 helped stats (abs) min: 1 max: 2672 x̄: 14.63 x̃: 4 helped stats (rel) min: <.01% max: 39.61% x̄: 0.42% x̃: 0.21% HURT stats (abs) min: 1 max: 519 x̄: 40.49 x̃: 16 HURT stats (rel) min: 0.01% max: 12.83% x̄: 1.20% x̃: 0.47% 95% mean confidence interval for cycles value: -12.82 -4.49 95% mean confidence interval for cycles %-change: -0.31% -0.18% Cycles are helped. LOST: 0 GAINED: 5 Ivy Bridge total instructions in shared programs: 11686082 -> 11681548 (-0.04%) instructions in affected programs: 937696 -> 933162 (-0.48%) helped: 3150 HURT: 0 helped stats (abs) min: 1 max: 33 x̄: 1.44 x̃: 1 helped stats (rel) min: 0.03% max: 7.84% x̄: 0.69% x̃: 0.49% 95% mean confidence interval for instructions value: -1.49 -1.38 95% mean confidence interval for instructions %-change: -0.71% -0.67% Instructions are helped. total cycles in shared programs: 257514962 -> 257492471 (<.01%) cycles in affected programs: 11524149 -> 11501658 (-0.20%) helped: 1970 HURT: 239 helped stats (abs) min: 1 max: 3525 x̄: 17.48 x̃: 3 helped stats (rel) min: <.01% max: 49.60% x̄: 0.46% x̃: 0.17% HURT stats (abs) min: 1 max: 1358 x̄: 50.00 x̃: 15 HURT stats (rel) min: 0.02% max: 59.88% x̄: 1.84% x̃: 0.65% 95% mean confidence interval for cycles value: -17.01 -3.35 95% mean confidence interval for cycles %-change: -0.33% -0.08% Cycles are helped. LOST: 9 GAINED: 1 Sandy Bridge total instructions in shared programs: 10432841 -> 10429893 (-0.03%) instructions in affected programs: 685071 -> 682123 (-0.43%) helped: 2453 HURT: 0 helped stats (abs) min: 1 max: 9 x̄: 1.20 x̃: 1 helped stats (rel) min: 0.02% max: 7.55% x̄: 0.64% x̃: 0.46% 95% mean confidence interval for instructions value: -1.23 -1.17 95% mean confidence interval for instructions %-change: -0.67% -0.62% Instructions are helped. 
total cycles in shared programs: 146133660 -> 146134195 (<.01%) cycles in affected programs: 3991634 -> 3992169 (0.01%) helped: 1237 HURT: 153 helped stats (abs) min: 1 max: 2853 x̄: 6.93 x̃: 2 helped stats (rel) min: <.01% max: 29.00% x̄: 0.24% x̃: 0.14% HURT stats (abs) min: 1 max: 1740 x̄: 59.56 x̃: 12 HURT stats (rel) min: 0.03% max: 78.98% x̄: 1.96% x̃: 0.42% 95% mean confidence interval for cycles value: -5.13 5.90 95% mean confidence interval for cycles %-change: -0.17% 0.16% Inconclusive result (value mean confidence interval includes 0). LOST: 0 GAINED: 1 GM45 and Iron Lake had similar results (GM45 shown): total instructions in shared programs: 4800332 -> 4798380 (-0.04%) instructions in affected programs: 565995 -> 564043 (-0.34%) helped: 1451 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 1.35 x̃: 1 helped stats (rel) min: 0.05% max: 5.26% x̄: 0.47% x̃: 0.31% 95% mean confidence interval for instructions value: -1.40 -1.29 95% mean confidence interval for instructions %-change: -0.50% -0.45% Instructions are helped. total cycles in shared programs: 122032318 -> 122027798 (<.01%) cycles in affected programs: 8334868 -> 8330348 (-0.05%) helped: 1029 HURT: 1 helped stats (abs) min: 2 max: 40 x̄: 4.43 x̃: 2 helped stats (rel) min: <.01% max: 1.83% x̄: 0.09% x̃: 0.04% HURT stats (abs) min: 38 max: 38 x̄: 38.00 x̃: 38 HURT stats (rel) min: 0.25% max: 0.25% x̄: 0.25% x̃: 0.25% 95% mean confidence interval for cycles value: -4.70 -4.08 95% mean confidence interval for cycles %-change: -0.09% -0.08% Cycles are helped. Signed-off-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com>	2018-03-26 08:50:43 -07:00
Jason Ekstrand	700bebb958	i965: Move the back-end compiler to src/intel/compiler Mostly a dummy git mv with a couple of noticeable parts: - With the earlier header cleanups, nothing in src/intel depends on files from src/mesa/drivers/dri/i965/ - Both Autoconf and Android builds are addressed. Thanks to Mauro and Tapani for the fixups in the latter - brw_util.[ch] is not really compiler specific, so it's moved to i965. v2: - move brw_eu_defines.h instead of brw_defines.h - remove no-longer applicable includes - add missing vulkan/ prefix in the Android build (thanks Tapani) v3: - don't list brw_defines.h in src/intel/Makefile.sources (Jason) - rebase on top of the oa patches [Emil Velikov: commit message, various small fixes throughout] Signed-off-by: Emil Velikov <emil.velikov@collabora.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-13 11:16:34 +00:00

28 Commits
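
Note on b23432c540: the four size and type conditions listed at the end of that commit message can be read as a small decision function. The following is only a sketch of those conditions as written in the message, not the code in the Mesa pass. The names `RegType` and `mov_conversion_allows_cmod_prop` are hypothetical, sizes are assumed to be in bytes, and every other check the real pass performs (signedness, condition codes, flag usage) is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegType:
    """Hypothetical stand-in for a register type: float-ness plus size in bytes."""
    is_float: bool
    size: int


def mov_conversion_allows_cmod_prop(src: RegType, dst: RegType) -> bool:
    """Apply the four conversion rules from the commit message to a MOV.

    src is the MOV source type, dst is the MOV destination type.
    """
    if src.is_float and not dst.is_float:
        # Floating point to integer conversions are not allowed.
        return False
    if not src.is_float and dst.is_float:
        # Integer to floating point: the size of the float does not matter.
        return True
    # Float to float, or integer to integer: the destination must be
    # at least as large as the source.
    return dst.size >= src.size


# Illustrative types, named after the usual Intel type suffixes.
HF = RegType(is_float=True, size=2)   # half float
F = RegType(is_float=True, size=4)    # float
W = RegType(is_float=False, size=2)   # 16-bit integer
D = RegType(is_float=False, size=4)   # 32-bit integer

for src, dst, label in [(D, F, "D -> F"), (F, D, "F -> D"),
                        (HF, F, "HF -> F"), (D, W, "D -> W")]:
    print(label, mov_conversion_allows_cmod_prop(src, dst))
```

Keeping the float-to-integer rejection first mirrors the order of the list in the commit message, and the last two rules collapse into one size comparison because they impose the same constraint.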