KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Samuel Pitoiset	af2951dde8	radv/ci: update list of expected failures Add dEQP-VK.glsl.builtin.precision_double.determinant.compute.mat3 which fails on all generations. It looks like CTS should relax tolerance slightly. Co-authored-by: Charlie Turner <cturner@igalia.com> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Martin Roukala <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234>	2022-03-04 18:43:18 +01:00
Samuel Pitoiset	51c6fdf708	radv/ci: skip dEQP-VK.renderpass2.depth_stencil_resolve.*_samplemask They randomly hang on Navi10 and randomly fail on Sienna Cichlid. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Martin Roukala <martin.roukala@mupuf.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15234>	2022-03-04 18:43:16 +01:00
Juan A. Suarez Romero	7ffee7f1ab	v3d: rebind sampler view if resource changed the BO When discarding the whole resource to create a new one, if this resource is used by a sampler view, a rebind must be done to use the new resource. But this must be done when setting the sampler views, because we don't have access to those samplers before. v2: - Pack shader state on setting sampler views (Iago) - Use a serial ID to know when to rebind sampler views (Juan) v3: - Move check to caller (Iago) - Keep rebind sampler view on BO change (Iago) v4: - Rename "serial_bo" to "serial_id" (Iago) - Add comments (Iago) Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/6027 Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15171>	2022-03-04 17:20:28 +00:00
Alyssa Rosenzweig	7bda838c56	panfrost: Push twice as many uniforms The limit for Bifrost is twice as high as previously thought -- the limit is 64 slots of FAU, not 64 words. Each slot is 2 words. We can push twice as much, saving a considerable number of cycles in some cases. total instructions in shared programs: 2454260 -> 2431502 (-0.93%) instructions in affected programs: 845176 -> 822418 (-2.69%) helped: 3376 HURT: 304 helped stats (abs) min: 1.0 max: 60.0 x̄: 7.92 x̃: 6 helped stats (rel) min: 0.13% max: 45.45% x̄: 4.60% x̃: 4.11% HURT stats (abs) min: 1.0 max: 60.0 x̄: 13.06 x̃: 8 HURT stats (rel) min: 0.16% max: 35.09% x̄: 7.58% x̃: 6.52% 95% mean confidence interval for instructions value: -6.50 -5.87 95% mean confidence interval for instructions %-change: -3.75% -3.43% Instructions are helped. total tuples in shared programs: 1963383 -> 1951560 (-0.60%) tuples in affected programs: 638622 -> 626799 (-1.85%) helped: 2959 HURT: 573 helped stats (abs) min: 1.0 max: 54.0 x̄: 5.61 x̃: 4 helped stats (rel) min: 0.15% max: 28.57% x̄: 3.61% x̃: 3.12% HURT stats (abs) min: 1.0 max: 50.0 x̄: 8.35 x̃: 6 HURT stats (rel) min: 0.25% max: 27.34% x̄: 6.24% x̃: 4.92% 95% mean confidence interval for tuples value: -3.61 -3.08 95% mean confidence interval for tuples %-change: -2.18% -1.85% Tuples are helped. total clauses in shared programs: 387817 -> 365111 (-5.85%) clauses in affected programs: 135527 -> 112821 (-16.75%) helped: 3489 HURT: 25 helped stats (abs) min: 1.0 max: 43.0 x̄: 6.52 x̃: 5 helped stats (rel) min: 0.82% max: 58.33% x̄: 17.48% x̃: 15.87% HURT stats (abs) min: 1.0 max: 3.0 x̄: 1.56 x̃: 1 HURT stats (rel) min: 2.94% max: 11.11% x̄: 6.87% x̃: 6.67% 95% mean confidence interval for clauses value: -6.67 -6.26 95% mean confidence interval for clauses %-change: -17.65% -16.96% Clauses are helped. 
total cycles in shared programs: 201842.21 -> 168754.04 (-16.39%) cycles in affected programs: 84035.50 -> 50947.33 (-39.37%) helped: 3547 HURT: 136 helped stats (abs) min: 0.041665999999999315 max: 54.0 x̄: 9.33 x̃: 8 helped stats (rel) min: 0.17% max: 80.77% x̄: 36.10% x̃: 36.84% HURT stats (abs) min: 0.041665999999999315 max: 1.0 x̄: 0.12 x̃: 0 HURT stats (rel) min: 0.18% max: 12.24% x̄: 1.18% x̃: 0.61% 95% mean confidence interval for cycles value: -9.26 -8.71 95% mean confidence interval for cycles %-change: -35.34% -34.11% Cycles are helped. total arith in shared programs: 74918.46 -> 75022.62 (0.14%) arith in affected programs: 22471.04 -> 22575.21 (0.46%) helped: 1571 HURT: 1492 helped stats (abs) min: 0.041665999999999315 max: 1.125 x̄: 0.17 x̃: 0 helped stats (rel) min: 0.17% max: 40.00% x̄: 2.50% x̃: 1.96% HURT stats (abs) min: 0.041665999999999315 max: 2.375 x̄: 0.25 x̃: 0 HURT stats (rel) min: 0.16% max: 100.00% x̄: 5.35% x̃: 2.37% 95% mean confidence interval for arith value: 0.02 0.05 95% mean confidence interval for arith %-change: 1.08% 1.56% Arith are HURT. total ldst in shared programs: 174812 -> 137889 (-21.12%) ldst in affected programs: 81319 -> 44396 (-45.41%) helped: 3722 HURT: 0 helped stats (abs) min: 1.0 max: 62.0 x̄: 9.92 x̃: 8 helped stats (rel) min: 1.82% max: 100.00% x̄: 47.18% x̃: 43.75% 95% mean confidence interval for ldst value: -10.20 -9.64 95% mean confidence interval for ldst %-change: -47.97% -46.39% Ldst are helped. 
total quadwords in shared programs: 1757124 -> 1714130 (-2.45%) quadwords in affected programs: 584065 -> 541071 (-7.36%) helped: 3474 HURT: 173 helped stats (abs) min: 1.0 max: 90.0 x̄: 12.66 x̃: 9 helped stats (rel) min: 0.26% max: 34.18% x̄: 8.78% x̃: 8.33% HURT stats (abs) min: 1.0 max: 26.0 x̄: 5.76 x̃: 4 HURT stats (rel) min: 0.45% max: 20.66% x̄: 4.48% x̃: 2.63% 95% mean confidence interval for quadwords value: -12.21 -11.37 95% mean confidence interval for quadwords %-change: -8.36% -7.95% Quadwords are helped. total threads in shared programs: 52898 -> 53142 (0.46%) threads in affected programs: 262 -> 506 (93.13%) helped: 250 HURT: 6 helped stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: 0.92 0.99 95% mean confidence interval for threads %-change: 93.69% 99.28% Threads are helped. total spills in shared programs: 161 -> 107 (-33.54%) spills in affected programs: 54 -> 0 helped: 27 HURT: 0 total fills in shared programs: 1386 -> 796 (-42.57%) fills in affected programs: 590 -> 0 helped: 27 HURT: 0 Fixes: `d4dccea0ba` ("panfrost: Add UBO push data structure") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>	2022-03-04 15:22:04 +00:00
Alyssa Rosenzweig	e7cfe18099	pan/bi: Run CSE after lowering FAU Lowering FAU can add moves from uniforms. If a uniform is moved out to a register multiple times in a basic block, these moves can be CSE'd, saving instructions at the cost of register pressure. 854 shaders in my shader-db are helped on cycle count (average 2.94% reduction in cycles). Only 9 shaders have hurt thread count, and there is no change in spills or fills. Overall, this seems to be a win. Prevents instruction count regressions from the next commit. total instructions in shared programs: 2454423 -> 2444690 (-0.40%) instructions in affected programs: 386274 -> 376541 (-2.52%) helped: 2105 HURT: 0 helped stats (abs) min: 1.0 max: 116.0 x̄: 4.62 x̃: 2 helped stats (rel) min: 0.04% max: 27.27% x̄: 3.64% x̃: 1.92% 95% mean confidence interval for instructions value: -4.91 -4.33 95% mean confidence interval for instructions %-change: -3.83% -3.45% Instructions are helped. total tuples in shared programs: 1963534 -> 1957106 (-0.33%) tuples in affected programs: 233562 -> 227134 (-2.75%) helped: 1491 HURT: 117 helped stats (abs) min: 1.0 max: 63.0 x̄: 4.44 x̃: 2 helped stats (rel) min: 0.04% max: 24.53% x̄: 4.39% x̃: 2.59% HURT stats (abs) min: 1.0 max: 5.0 x̄: 1.61 x̃: 1 HURT stats (rel) min: 0.18% max: 8.33% x̄: 1.44% x̃: 1.05% 95% mean confidence interval for tuples value: -4.28 -3.71 95% mean confidence interval for tuples %-change: -4.20% -3.73% Tuples are helped. total clauses in shared programs: 387848 -> 387079 (-0.20%) clauses in affected programs: 13718 -> 12949 (-5.61%) helped: 583 HURT: 60 helped stats (abs) min: 1.0 max: 16.0 x̄: 1.42 x̃: 1 helped stats (rel) min: 1.11% max: 25.00% x̄: 8.28% x̃: 6.67% HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 0.86% max: 20.00% x̄: 4.58% x̃: 4.00% 95% mean confidence interval for clauses value: -1.29 -1.10 95% mean confidence interval for clauses %-change: -7.57% -6.58% Clauses are helped. 
total cycles in shared programs: 201866.21 -> 201682.92 (-0.09%) cycles in affected programs: 6241.79 -> 6058.50 (-2.94%) helped: 952 HURT: 98 helped stats (abs) min: 0.04166399999999726 max: 2.625 x̄: 0.20 x̃: 0 helped stats (rel) min: 0.12% max: 26.00% x̄: 4.05% x̃: 2.38% HURT stats (abs) min: 0.041665999999999315 max: 0.16666700000000034 x̄: 0.07 x̃: 0 HURT stats (rel) min: 0.18% max: 8.70% x̄: 1.60% x̃: 1.43% 95% mean confidence interval for cycles value: -0.19 -0.16 95% mean confidence interval for cycles %-change: -3.80% -3.24% Cycles are helped. total arith in shared programs: 74924.00 -> 74660.12 (-0.35%) arith in affected programs: 9303.67 -> 9039.79 (-2.84%) helped: 1513 HURT: 118 helped stats (abs) min: 0.04166399999999726 max: 2.625 x̄: 0.18 x̃: 0 helped stats (rel) min: 0.07% max: 33.33% x̄: 4.68% x̃: 2.67% HURT stats (abs) min: 0.041665999999999315 max: 0.16666800000000137 x̄: 0.07 x̃: 0 HURT stats (rel) min: 0.18% max: 8.70% x̄: 1.55% x̃: 1.37% 95% mean confidence interval for arith value: -0.17 -0.15 95% mean confidence interval for arith %-change: -4.48% -3.98% Arith are helped. total quadwords in shared programs: 1757254 -> 1751978 (-0.30%) quadwords in affected programs: 197399 -> 192123 (-2.67%) helped: 1464 HURT: 110 helped stats (abs) min: 1.0 max: 51.0 x̄: 3.73 x̃: 2 helped stats (rel) min: 0.04% max: 21.95% x̄: 4.16% x̃: 2.52% HURT stats (abs) min: 1.0 max: 7.0 x̄: 1.71 x̃: 1 HURT stats (rel) min: 0.21% max: 13.04% x̄: 1.65% x̃: 0.93% 95% mean confidence interval for quadwords value: -3.58 -3.13 95% mean confidence interval for quadwords %-change: -3.97% -3.53% Quadwords are helped. 
total threads in shared programs: 52899 -> 52890 (-0.02%) threads in affected programs: 18 -> 9 (-50.00%) helped: 0 HURT: 9 HURT stats (abs) min: 1.0 max: 1.0 x̄: 1.00 x̃: 1 HURT stats (rel) min: 50.00% max: 50.00% x̄: 50.00% x̃: 50.00% 95% mean confidence interval for threads value: -1.00 -1.00 95% mean confidence interval for threads %-change: -50.00% -50.00% Threads are HURT. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15239>	2022-03-04 15:22:04 +00:00
Henry Goffin	c8f644ec44	frontends/va: ignore incoming frame_num from VA picture parameters The Gallium pipe video "frame_num" variable is internally used as a counter of elapsed reference frames since the last IDR. The incoming frame_num field from VA picture parameters is not equivalent; the VA value may wrap to zero prematurely, as it is a 16-bit struct field with a documented max value of 2^(log2_max_frame_num_minus4 + 4)-1. This change improves "infinite GOP" single-client live streaming, where it is reasonable for the server to desire an endless series of P-frames without IDR. Without this change, it is difficult/impossible for an application to encode a P- or B-frame after the VA frame_num field wraps around to zero, depending on the backend encoder implementation. This change has no effect on existing applications that always signal an IDR frame and reset the VA frame_num to zero before it wraps around. For example, the FFmpeg vaapi encoder ignores the VA documentation and sends an un-wrapped VA frame_num, which results in identical computation of the internal frame_num (as long as each GOP is less than 65536 frames). Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5768 Reviewed-by: Thong Thai <thong.thai@amd.com> patch revision 3: correctly avoid incrementing frame_num when the encoded frame is not a reference, per h264 spec and ffmpeg behavior Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14332>	2022-03-04 14:17:20 +00:00
Rhys Perry	d28b6b6856	aco: rework removal of jumps over branches Only allow this in situations where we know it's safe. In particular, this stops removal of unconditional branches like with block_kind_continue_or_break. Fixes dEQP-VK.graphicsfuzz.fragcoord-control-flow hang. fossil-db (Sienna Cichlid): Totals from 34 (0.02% of 162293) affected shaders: Instrs: 84115 -> 84178 (+0.07%); split: -0.00%, +0.08% CodeSize: 463372 -> 463624 (+0.05%); split: -0.00%, +0.06% Latency: 3467316 -> 3467652 (+0.01%) InvThroughput: 3085493 -> 3085578 (+0.00%) Branches: 3221 -> 3284 (+1.96%); split: -0.03%, +1.99% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `f030b75b7d` ("aco: relax condition to remove branches in case of few instructions") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15214>	2022-03-04 12:32:36 +00:00
Samuel Pitoiset	059f870d74	ac/nir: implement nir_op_pack_{uint,sint}_2x16 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>	2022-03-04 08:06:56 +00:00
Samuel Pitoiset	9b113f1b6c	aco: implement nir_op_pack_{uint,sint}_2x16 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>	2022-03-04 08:06:56 +00:00
Samuel Pitoiset	6532307555	nir: introduce nir_pack_{sint,uint}_2x16 instructions These instructions have AMD hardware equivalent and they will be used to lower fragment shader outputs in NIR. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15231>	2022-03-04 08:06:56 +00:00
Xiaohui Gu	4d81c60e11	iris: Mark a dirty update when vs_needs_sgvs_element value changed Add vs_needs_sgvs_element value check when updating vertex element dirty state in iris_update_compiled_vs to solve render error of Android game "Genshin Impact". Signed-off-by: Xiaohui Gu <xiaohui.gu@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15142>	2022-03-04 05:41:38 +00:00
Yiwei Zhang	aaa25cda0b	venus: add VK_EXT_image_robustness support Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>	2022-03-04 01:04:13 +00:00
Yiwei Zhang	ba212bf888	venus: add VK_EXT_provoking_vertex support Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>	2022-03-04 01:04:13 +00:00
Yiwei Zhang	33ba61b059	venus: add VK_EXT_line_rasterization support Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>	2022-03-04 01:04:13 +00:00
Yiwei Zhang	58182eb096	venus: update to latest venus protocol Added the below extension support: - VK_EXT_line_rasterization - VK_EXT_provoking_vertex Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>	2022-03-04 01:04:13 +00:00
Yiwei Zhang	20efd9eff3	venus: group extensions promoted to 1.3 Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>	2022-03-04 01:04:13 +00:00
Yiwei Zhang	fe3815b7fa	venus: clean up physical device features and properties Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15205>	2022-03-04 01:04:13 +00:00
Daniel Schürmann	ca4595e01a	nir/opt_shrink_vectors: update docstring in order to reflect the various recent improvements. Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>	2022-03-04 00:18:58 +00:00
Daniel Schürmann	405829cd85	nir/opt_shrink_vectors: remove duplicate components from vecN vecN instructions which are only used by other ALU will now get duplicate channels removed. i915g: total instructions in shared programs: 396309 -> 396294 (<.01%) instructions in affected programs: 186 -> 171 (-8.06%) r300: total instructions in shared programs: 1165059 -> 1164354 (-0.06%) instructions in affected programs: 35884 -> 35179 (-1.96%) total temps in shared programs: 165497 -> 165326 (-0.10%) temps in affected programs: 2990 -> 2819 (-5.72%) softpipe: total instructions in shared programs: 2860028 -> 2859084 (-0.03%) instructions in affected programs: 55539 -> 54595 (-1.70%) total temps in shared programs: 516939 -> 516546 (-0.08%) temps in affected programs: 6623 -> 6230 (-5.93%) Acked-by: Emma Anholt <emma@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>	2022-03-04 00:18:58 +00:00
Daniel Schürmann	e5963478c2	nir/opt_shrink_vectors: shrink load_const properly This patch enables removal of arbitrary channels in load_const instructions, if they are either unused or duplicates of other channels and only used by ALU. Totals from 692 (0.51% of 134913) affected shaders: (GFX10.3) VGPRs: 21832 -> 21544 (-1.32%) CodeSize: 1322016 -> 1313080 (-0.68%); split: -0.68%, +0.01% Instrs: 243635 -> 242231 (-0.58%); split: -0.58%, +0.00% Latency: 1856138 -> 1857237 (+0.06%); split: -0.09%, +0.15% InvThroughput: 424298 -> 421671 (-0.62%); split: -0.62%, +0.01% VClause: 4580 -> 4583 (+0.07%); split: -0.02%, +0.09% SClause: 14336 -> 14354 (+0.13%); split: -0.04%, +0.17% Copies: 8897 -> 8859 (-0.43%); split: -0.45%, +0.02% PreSGPRs: 20439 -> 20437 (-0.01%) PreVGPRs: 16011 -> 15907 (-0.65%); split: -0.97%, +0.32% i915g: total instructions in shared programs: 396471 -> 396309 (-0.04%) instructions in affected programs: 6408 -> 6246 (-2.53%) total const in shared programs: 56458 -> 56422 (-0.06%) const in affected programs: 407 -> 371 (-8.85%) LOST: shaders/closed/steam/trine-2/fp-3.shader_test FS r300: total instructions in shared programs: 1164421 -> 1165059 (0.05%) instructions in affected programs: 143981 -> 144619 (0.44%) total temps in shared programs: 165488 -> 165497 (<.01%) temps in affected programs: 318 -> 327 (2.83%) total consts in shared programs: 922140 -> 921952 (-0.02%) consts in affected programs: 12438 -> 12250 (-1.51%) softpipe: total instructions in shared programs: 2859978 -> 2860028 (<.01%) instructions in affected programs: 183355 -> 183405 (0.03%) total temps in shared programs: 517071 -> 516939 (-0.03%) temps in affected programs: 1416 -> 1284 (-9.32%) total imm in shared programs: 103601 -> 102767 (-0.81%) imm in affected programs: 3928 -> 3094 (-21.23%) Acked-by: Emma Anholt <emma@anholt.net> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12468>	2022-03-04 00:18:58 +00:00
Dave Airlie	a10b5d7086	crocus: change the line width workaround for gfx4/5 This fixes piglit line-flat-clip-color and the hud fps counter. Fixes: `6b7a68b7c2` ("crocus: add missing line smooth bits.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15229>	2022-03-04 00:06:28 +00:00
Chia-I Wu	bbbbf39559	venus: abort when stuck This gives MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 4096 MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 8192 MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 12288 MESA-VIRTIO: debug: stuck in ring seqno wait with iter at 16384 MESA-VIRTIO: debug: aborting Aborted which should be more friendly than printing the messages forever. On my i7-7820HQ, this aborts after roughly 4+8+16+32=60 seconds Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15200>	2022-03-03 21:48:13 +00:00
Daniel Schürmann	ccf4bcd162	aco/ra: don't immediately assign a register for p_branch These get now assigned after handling phis. Totals from 564 (0.42% of 134913) affected shaders: (GFX10.3) CodeSize: 5519744 -> 5515308 (-0.08%) Instrs: 1063045 -> 1061936 (-0.10%) Latency: 11880452 -> 11875904 (-0.04%) InvThroughput: 2259933 -> 2259581 (-0.02%); split: -0.02%, +0.00% Copies: 86908 -> 85799 (-1.28%) Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>	2022-03-03 20:21:08 +00:00
Rhys Perry	c3070773f8	aco/tests: add test for branch definition RA Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>	2022-03-03 20:21:08 +00:00
Rhys Perry	32d0bae8ec	aco: fix branch definition validation Like how they have to be register allocated differently, branch definitions at merge block predecessors need to be validated differently. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>	2022-03-03 20:21:08 +00:00
Rhys Perry	bed5a31005	aco: add validate_instr_defs() Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>	2022-03-03 20:21:08 +00:00
Rhys Perry	d5349a99c2	aco/ra: fix register allocation of branch definitions fossil-db (Sienna Cichlid): Totals from 704 (0.52% of 134913) affected shaders: CodeSize: 7177288 -> 7182072 (+0.07%); split: -0.00%, +0.07% Instrs: 1371781 -> 1372977 (+0.09%); split: -0.00%, +0.09% Latency: 17993572 -> 18001344 (+0.04%); split: -0.00%, +0.04% InvThroughput: 4198996 -> 4199569 (+0.01%); split: -0.00%, +0.01% Copies: 122456 -> 123516 (+0.87%); split: -0.01%, +0.88% Branches: 43815 -> 43818 (+0.01%); split: -0.02%, +0.03% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>	2022-03-03 20:21:08 +00:00
Rhys Perry	608d48b787	aco/ra: add get_reg_phi() helper Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>	2022-03-03 20:21:08 +00:00
Rhys Perry	ceca5e68c4	aco: remove vcc hint from branch definitions This doesn't seem to have much benefit anymore. fossil-db (Sienna Cichlid): Totals from 198 (0.15% of 134913) affected shaders: CodeSize: 2610536 -> 2610872 (+0.01%); split: -0.01%, +0.02% Instrs: 479001 -> 479085 (+0.02%); split: -0.01%, +0.03% Latency: 7310684 -> 7300735 (-0.14%); split: -0.16%, +0.02% InvThroughput: 2439084 -> 2437446 (-0.07%); split: -0.07%, +0.00% SClause: 14760 -> 14722 (-0.26%) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13432>	2022-03-03 20:21:08 +00:00
Pavel Ondračka	558f632967	r300: schedule TEX instructions before OUT instructions NIR-to-TGSI produces partial output writes contrary to the old paths that always wrote the full outputs. Therefore if there is now a partial output write ready to be scheduled and nothing else besides a tex is ready, we would schedule the output write first. This was not a problem before as usually at least some component of the full output write depended on the tex result. This is not optimal from the performance point of view and resulted in ~20% slowdown in the Unigine demos. The docs say: The first OUTPUT instruction will reserve space in the output register fifo. This space is limited, therefore issuing an OUTPUT earlier than necessary may cause threads to stall earlier than necessary. You should not set an ALU instruction as type OUTPUT unless it is actually writing to an output register, or it is the last instruction of the program. Fix it by explicitly preferring a TEX before OUT and restore the performance: 9.66 -> 12.12 fps (as compared to 11.83 with the old glsl-to-TGSI path) in Unigine Sanctuary. No change in Lightsmark or GLmark. This is also a win from the instructions point of view as we are usually able to schedule the partial output writes in a single pair at the end. total instructions in shared programs: 106009 -> 105891 (-0.11%) instructions in affected programs: 10153 -> 10035 (-1.16%) helped: 118 HURT: 0 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5840 Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15165>	2022-03-03 20:05:32 +00:00
Pavel Ondračka	aff1a85c09	r300: remove some dead logic in tex pair scheduling The max_score == -1 condition is already checked before, so this will never trigger. It's unclear what the intention was anyway. Now we emit either: - if we have accumulated enough tex instructions for a full block - if we have nothing else to emit - or if we can emit all remaining tex instructions already. Signed-off-by: Pavel Ondračka <pavel.ondracka@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15165>	2022-03-03 20:05:32 +00:00
Igor Torrente	688b23885b	Venus: Add `vn_physical_device_{features, properties}` for better organization New extension properties/features are being put in `vn_physical_device`, which is not ideal from an organization point of view. Here `vn_physical_device_{features,properties}` are two new structs to help the `vn_physical_device` organization. Signed-off-by: Igor Torrente <igor.torrente@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15170>	2022-03-03 19:43:52 +00:00
Ilia Mirkin	539fae796a	freedreno/a4xx: fix integer tg4 Something is slightly off in the integer values returned. It passes many tests without the fixup, but the dEQP-GLES31 tests complain. The blob ends up doing 3x gathers, and selects between them based on getinfo results. Since we already have a per-sampler key with some spare bits, just stick the bit-size info in there. And we can derive signedness from the associated type info. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>	2022-03-03 18:26:43 +00:00
Ilia Mirkin	96211adf77	freedreno/a4xx: add swizzles to shader keys for tg4 workaround Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>	2022-03-03 18:26:43 +00:00
Ilia Mirkin	68a2d25d0d	freedreno/a4xx: move tex_type to header This will be used in several places. Factor it out for common use. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>	2022-03-03 18:26:43 +00:00
Ilia Mirkin	8ed07c0da9	nir: remove bogus logic to allow cube + offset to work This was done for an a4xx hack which is now removed. No API allows cube texturing to have offsets. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>	2022-03-03 18:26:43 +00:00
Ilia Mirkin	37306ba3f1	freedreno/ir3: remove bogus tg4 -> tex lowering pass It can't be done. This just provides bad results. The blob had a comparable approach where they fixed up coordinates, but that also can't work with a separate texture definition with nearest filtering. By then, might as well provide an unswizzled variant instead, using native functionality. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14670>	2022-03-03 18:26:43 +00:00
Alex Xu (Hello71)	80bf9c7b97	r300/compiler/tests: print regoff_t as size_t fixes compilation on musl Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13949>	2022-03-03 17:48:17 +00:00
Samuel Pitoiset	516aee64cc	radv,aco: do not lower nir_op_pack_{unorm,snorm}_2x16 v_cvt_pknorm_{u16,i16}_f32 can be emitted instead, it's supported on all generations. No fossils-db changes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15215>	2022-03-03 14:54:12 +01:00
Michel Zou	f1f1b3d7f8	vulkan/wsi: drop unused wsi_create_win32_image fixes: `ed391d2a` Reviewed-by: Jason Ekstrand <jason.ekstrand@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15088>	2022-03-03 06:13:07 +00:00
Andrii Simiklit	ddf2778269	glsl: add member's location layout qualifier rules for `arrayed` in/out blocks From Section 4.4.1 (Input Layout Qualifiers) of the GLSL 4.50 spec: "For some blocks declared as arrays, the location can only be applied at the block level: When a block is declared as an array where additional locations are needed for each member for each block array element, it is a compile-time error to specify locations on the block members. That is, when locations would be under specified by applying them on block members, they are not allowed on block members. For arrayed interfaces (those generally having an extra level of arrayness due to interface expansion), the outer array is stripped before applying this rule" From Section 1.2.1 (Changes from Revision 6 of GLSL Version) of the GLSL 4.50 spec: "Private Bug 15678: Don’t allow location = on block members where the block needs an array of locations" From Section 4.4.1 (Input Layout Qualifiers) of the GLSL ES 3.20 spec "If an input is declared as an array of blocks, excluding per-vertex-arrays as required for tessellation, it is an error to declare a member of the block with a location qualifier" From Section 1.1.3 (Changes from GLSL ES 3.2 revision 3) of the GLSL ES 3.20 spec: "Arrayed blocks cannot have layout location qualifiers on members" Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11522>	2022-03-03 05:42:45 +00:00
Mike Blumenkrantz	0313110c92	zink: ci updates Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>	2022-03-03 05:21:40 +00:00
Mike Blumenkrantz	712ce86bd1	zink: split primitives generated queries if xfb/gs states change if one of these states change then it affects which result needs to be used for that query, so split it up over multiple query ids to make sure the correct result is obtained fixes (lavapipe): GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_pause_resume GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_states Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>	2022-03-03 05:21:40 +00:00
Mike Blumenkrantz	0cb3ae949c	zink: split out query suspending into util function no functional changes Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>	2022-03-03 05:21:40 +00:00
Mike Blumenkrantz	5aecec48ee	zink: update query states before starting renderpass during draw this gives some leeway for doing transfer ops without crashing the renderpass Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15227>	2022-03-03 05:21:40 +00:00
Ilia Mirkin	965ab44c50	nvc0: disable EXT_texture_sRGB_RG8 Looks like the green component doesn't get srgb-decoding, and no obvious way to force it. It works fine on nv50 though. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15211>	2022-03-03 04:37:12 +00:00
Ilia Mirkin	897a7fbbf1	mesa: enable GL_EXT_texture_sRGB_RG8 on desktop Looks like an extension number was assigned in late 2020. This makes it possible to hook up this format to teximage-colors without teaching it about ES. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15211>	2022-03-03 04:37:12 +00:00
Mike Blumenkrantz	af5f49f663	zink: remove loop from generated tcs this is already using per-vertex io, no need to add conditionals to verify Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15225>	2022-03-03 02:58:43 +00:00
Rob Clark	7e63fa2bb1	freedreno/registers: Add a couple regs we need for kernel Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15221>	2022-03-03 02:19:47 +00:00
Dave Airlie	34379a937f	gallivm/llvmpipe: add support for NIR to the linear/aos paths. When the AOS/linear code was added it only worked with TGSI which meant nothing in mesa upstream was really using it. This adds support to analyse NIR shaders, and adds aos support to the backend. AOS support is limited to mov,vec,fmul,tex sampling in order to accelerate mostly compositing operations. I've tested weston uses the fast path. gnome-shell can't use it yet as we can't optimise the depth test paths. Acked-by: Jose Fonseca <jfonseca@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15140>	2022-03-03 01:39:39 +00:00
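
The frame_num wraparound described in commit c8f644ec44 can be illustrated with a small model. This is only a sketch under stated assumptions: the function names are hypothetical and are not taken from the Mesa or VA-API sources. It shows the documented maximum of the VA field, 2^(log2_max_frame_num_minus4 + 4)-1, and an internal counter that advances only after reference frames, as the commit message describes.

```python
from typing import List


def va_frame_num_max(log2_max_frame_num_minus4: int) -> int:
    """Documented maximum of the VA frame_num field (hypothetical helper)."""
    return 2 ** (log2_max_frame_num_minus4 + 4) - 1


def wrapped_frame_num(ref_frames_since_idr: int,
                      log2_max_frame_num_minus4: int) -> int:
    """Value a spec-following client would send: it wraps back to zero."""
    modulus = va_frame_num_max(log2_max_frame_num_minus4) + 1
    return ref_frames_since_idr % modulus


def internal_frame_nums(frame_is_reference: List[bool]) -> List[int]:
    """Simplified model of an internal counter that never wraps.

    Each frame is tagged with the number of reference frames seen since
    the last IDR; the counter advances only after a reference frame.
    """
    frame_nums = []
    counter = 0
    for is_reference in frame_is_reference:
        frame_nums.append(counter)
        if is_reference:
            counter += 1
    return frame_nums
```

In this model, a long run of P-frames without an IDR keeps increasing the internal counter, while the wrapped value returns to zero once the modulus is reached. That mismatch is the reason the commit gives for ignoring the incoming field.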
