KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Rhys Perry	0f5d90c2a7	ac/nir: fix store_buffer_amd write_masks Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>	2022-01-10 19:01:04 +00:00
Rhys Perry	b00138090e	nir/lower_shader_calls: fix store_scratch write_mask Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>	2022-01-10 19:01:04 +00:00
Lucas Stach	d799a4be27	etnaviv: drm: defer destruction of softpin BOs When destroying a BO with a userspace managed address and thus freeing the VMA space, we need to make sure that the BO isn't in use by any active submit anymore, as the kernel will rightfully reject the next submit that re-uses the still active VMA. Keep the BO alive as long as it isn't fully idle to prevent the VMA being reused prematurely. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>	2022-01-10 16:49:00 +00:00
Lucas Stach	98a2049c08	etnaviv: drm: rename _etna_bo_del Rename it to a somewhat more descriptive name, which makes it easier to distinguish between the etna_bo_del function in the public interface and the internal function. Also remove the duplicated forward declaration and move it to the common internal header. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>	2022-01-10 16:49:00 +00:00
Lucas Stach	77ebbcbf9a	etnaviv: drm: export BO idle check function The ability to check if a BO is idle is not only useful in the buffer cache, but also in other parts of the winsys and even the pipe driver. Make this functionality available in the interface. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>	2022-01-10 16:49:00 +00:00
Lucas Stach	1b1f8592c0	etnaviv: drm: properly handle reviving BOs via a lookup If a BO is removed from a cache bucket list via a lookup, we must handle it in the same way as if an allocation from the cache happened: tell valgrind that the buffer is active again and take a reference to the etna_device, which the BO had given up while being in the cache. Cc: mesa-stable Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>	2022-01-10 16:49:00 +00:00
Lucas Stach	ccfd5054a4	etnaviv: drm: fix size limit in etna_cmd_stream_realloc The intended limit for command stream size is 64KB, as this is what old kernels can reliably do and what allows for maximum number of queued streams on newer kernels. However, due to unit confusion with the size member, which is in dwords, the submitted streams could grow up to ~128KB. Fix this by using the proper limit in dwords. Flushing due to some limits being exceeded is not an issue, but is expected with certain workloads, so lower the severity of the message being emitted in this case to debug level. Cc: mesa-stable Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14425>	2022-01-10 15:38:27 +00:00
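The unit mismatch described in the commit above comes down to simple arithmetic, sketched here in Python. Only the 64KB limit, the dword-sized `size` member, and the ~128KB figure come from the message; the constant and function names are hypothetical and are not etnaviv identifiers, and the sketch does not reproduce how the old code arrived at the larger size.

```python
DWORD_BYTES = 4

# Intended command stream limit from the commit message: 64KB.
MAX_STREAM_BYTES = 64 * 1024

# The same limit expressed in dwords, the unit of the stream's size member.
MAX_STREAM_DWORDS = MAX_STREAM_BYTES // DWORD_BYTES


def stream_bytes(size_dwords):
    """Convert a stream size in dwords to bytes."""
    return size_dwords * DWORD_BYTES


def within_limit(size_dwords):
    """Check a dword-sized stream against the limit, in matching units."""
    return size_dwords <= MAX_STREAM_DWORDS
```

Under these assumptions, a stream of 32768 dwords passes any check that uses a limit twice too large, yet occupies 128KB, which matches the overshoot the message reports.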
Lucas Stach	22d796feb8	egl/wayland: break double/triple buffering feedback loops Currently we dispose any unneeded color buffers immediately if we detect that there are more unlocked buffers than we need. This can lead to feedback loops between the compositor and the application causing rapid toggling between double and triple buffering. Scenario: 2 buffers already queued to the compositor, egl/wayland allocates a new back buffer to avoid throttling, slowing down the frame. This allows the compositor to catch up and unlock both buffers. EGL detects that there are more buffers than currently needed, freeing the buffer, restarting the loop shortly after. To avoid wasting CPU time on rapidly freeing and reallocating color buffers break those feedback loops by letting the unneeded buffers sit around for a short while before disposing them. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Simon Ser <contact@emersion.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14451>	2022-01-10 15:11:44 +00:00
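The idea of letting unneeded buffers "sit around for a short while" can be sketched as an age counter with a threshold. This is a minimal Python illustration, not the egl/wayland code: the class, the function names, and the threshold value are all hypothetical, and the commit message does not state how long buffers are actually kept.

```python
TRIM_AGE_THRESHOLD = 3  # hypothetical: idle passes a buffer survives


class ColorBuffer:
    def __init__(self, name):
        self.name = name
        self.locked = False  # True while the compositor holds the buffer
        self.age = 0         # trim passes seen since the buffer was last used


def mark_used(buf):
    """Hand the buffer to the compositor and reset its idle age."""
    buf.locked = True
    buf.age = 0


def release(buf):
    """The compositor is done with the buffer; it may now start aging."""
    buf.locked = False


def trim_idle_buffers(buffers, threshold=TRIM_AGE_THRESHOLD):
    """Age every unlocked buffer by one pass and drop the ones that are too old.

    Immediate disposal corresponds to threshold=0; a larger threshold
    absorbs short oscillations in the number of buffers needed.
    """
    kept = []
    for buf in buffers:
        if buf.locked:
            kept.append(buf)
            continue
        buf.age += 1
        if buf.age <= threshold:
            kept.append(buf)
    return kept
```

Calling `trim_idle_buffers` once per swap keeps a spare buffer alive across a brief switch between double and triple buffering, so it can be reused instead of being freed and reallocated.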
Danylo Piliaiev	d77bfc117c	tu,ir3: Implement VK_KHR_shader_integer_dot_product - gen4 - has dp4acc and dp2acc, dp4acc is used to implement 4x8 dot product. - gen3 - has dp2acc, in OpenCL blob uses dp2acc for dot product on both gen3 and gen4. - gen2 - unknown, lower everything. - gen1 - no dp2acc, lower everything. OpenCL blob doesn't advertise cl_qcom_dot_product8 but still generates code for it. The assembly is more verbose and uses yet to be documented mad32.u16 instruction. Passes: dEQP-VK.spirv_assembly.instruction.compute.opsdotkhr.* dEQP-VK.spirv_assembly.instruction.compute.opudotkhr.* dEQP-VK.spirv_assembly.instruction.compute.opsudotkhr.* dEQP-VK.spirv_assembly.instruction.compute.opsdotaccsatkhr.* dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.* dEQP-VK.spirv_assembly.instruction.compute.opsudotaccsatkhr.* Only packed 4x8 unsigned and mixed versions are accelerated. However in theory we should be able to do better for signed version than current NIR lowering. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>	2022-01-10 13:21:24 +02:00
Danylo Piliaiev	e1f89a1da2	ir3: Make nir compiler options a part of ir3_compiler This would allow for sub-gens to have different options. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>	2022-01-10 13:20:39 +02:00
Danylo Piliaiev	b8d486f298	nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8 Adreno GPUs have native instructions for unsigned and mixed dot_4x8 but not signed dot product. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>	2022-01-10 13:20:39 +02:00
Danylo Piliaiev	c1d5c318bc	ir3: New cat3 instructions * shrm - (src2 >> src1) & src3 * shlm - (src2 << src1) & src3 * shrg - (src2 >> src1) \| src3 * shlg - (src2 << src1) \| src3 * andg - (src2 & src1) \| src3 * dp2acc - dot product of two {i,u}8vec2 packed into SRC1 and SRC2, added to 32b SRC3 * dp4acc - dot product of two {i,u}8vec4 packed into SRC1 and SRC2, added to 32b SRC3 * wmm - vec4(x_1, x_2, x_3, x_4) * (y_1 + y_2 + y_3 + y_4), which is duplicated (1 << (SRC3 / 32)) times starting from DST register * wmm.accu - same as wmm but result is added to DST registers, however the first reg in each vec4 result is overwritten instead of accumulating. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>	2022-01-10 13:20:39 +02:00
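The scalar formulas listed in the commit above can be emulated directly, which may help when reading disassembly. The following Python sketch follows the formulas as written for `shrm`, `shlm`, `shrg`, `shlg`, `andg`, `dp2acc` and `dp4acc`. Several details are assumptions and not taken from the message: operands are treated as unsigned 32-bit values, shift amounts are assumed to be below 32, results wrap to 32 bits, and the `signed1`/`signed2` flags are a hypothetical way to select the `{i,u}8` interpretation of each packed source. `wmm` is left out because its register layout is not fully specified in the message.

```python
MASK32 = 0xFFFFFFFF


def shrm(src1, src2, src3):
    return ((src2 >> src1) & src3) & MASK32


def shlm(src1, src2, src3):
    return ((src2 << src1) & src3) & MASK32


def shrg(src1, src2, src3):
    return ((src2 >> src1) | src3) & MASK32


def shlg(src1, src2, src3):
    return ((src2 << src1) | src3) & MASK32


def andg(src1, src2, src3):
    return ((src2 & src1) | src3) & MASK32


def _unpack8(word, count, signed):
    """Split the low `count` bytes of word into 8-bit lanes."""
    lanes = [(word >> (8 * i)) & 0xFF for i in range(count)]
    if signed:
        lanes = [b - 256 if b & 0x80 else b for b in lanes]
    return lanes


def _dpacc(src1, src2, src3, count, signed1, signed2):
    a = _unpack8(src1, count, signed1)
    b = _unpack8(src2, count, signed2)
    total = sum(x * y for x, y in zip(a, b)) + src3
    return total & MASK32  # assumed: plain 32-bit wraparound, no saturation


def dp2acc(src1, src2, src3, signed1=False, signed2=False):
    return _dpacc(src1, src2, src3, 2, signed1, signed2)


def dp4acc(src1, src2, src3, signed1=False, signed2=False):
    return _dpacc(src1, src2, src3, 4, signed1, signed2)
```

For example, `dp4acc(0x01020304, 0x01010101, 10)` multiplies the lanes 4, 3, 2, 1 by 1 each and adds the accumulator, giving 20.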
Connor Abbott	c45c6e36eb	tu: Implement VK_EXT_subgroup_size_control Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>	2022-01-10 10:58:28 +00:00
Connor Abbott	1a1e25dcce	tu, ir3: Support runtime gl_SubgroupSize in FS We already supported it in the CS for computing the subgroup ID, but soon we'll need it in the FS too. Vertex stages will always have it lowered. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>	2022-01-10 10:58:28 +00:00
Connor Abbott	e6e34883a9	ir3: Add wavesize control This allows the wavesize to be controlled per-shader. This will be used by VK_EXT_subgroup_size_control, and freedreno will also need it if legacy ARB_shader_ballot is to be supported (since it forces a wavesize of 64 or less). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>	2022-01-10 10:58:28 +00:00
Connor Abbott	30237b3d9c	ir3: Pass shader to ir3_nir_post_finalize() We'll need to add shader-specific lowering for gl_SubgroupSize. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>	2022-01-10 10:58:28 +00:00
Connor Abbott	9ebc48005c	ir3, freedreno: Add options struct for ir3_shader_from_nir() We'll expand this in a moment. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>	2022-01-10 10:58:28 +00:00
Danylo Piliaiev	fe9c9ec83f	tu: fix workaround for depth bounds test without depth test Fixes: `bb4db22ff4` ("turnip: apply workaround for depth bounds test without depth test") Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14390>	2022-01-10 09:36:59 +00:00
Lionel Landwerlin	07bc6b7ed9	anv: limit compiler valid color outputs using NIR variables This fixes a test from the vkd3d-proton test_dual_source_blending_dxbc test which asserts in the backend with : brw_fs_visitor.cpp:716: void fs_visitor::emit_fb_writes(): Assertion `!prog_data->dual_src_blend \|\| key->nr_color_regions == 1' failed. This is because there are 2 color attachments provided by the renderpass so we initially set nr_color_regions = 2. But once we've parsed the shader, we can see it's only using one output (with dual source color blending). This change looks at the output variables to update the valid output variables. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14417>	2022-01-10 09:38:32 +02:00
Tapani Pälli	b8f0459d6f	iris: unref syncobjs and free r/w dependencies array for slab entries Fixes memory leak with dependencies array: ==5224== 104 (96 direct, 8 indirect) bytes in 3 blocks are definitely lost in loss record 1,954 of 2,035 ==5224== at 0x484178A: malloc (vg_replace_malloc.c:380) ==5224== by 0x484670B: realloc (vg_replace_malloc.c:1437) ==5224== by 0x14DBAB9B: update_bo_syncobjs (iris_batch.c:819) ==5224== by 0x14DBADB8: update_batch_syncobjs (iris_batch.c:898) ==5224== by 0x14DBB3D5: _iris_batch_flush (iris_batch.c:1031) ==5224== by 0x14DB77D0: iris_transfer_map (iris_resource.c:2348) ==5224== by 0x157786FD: u_transfer_helper_transfer_map (u_transfer_helper.c:243) ==5224== by 0x14C479E7: tc_buffer_map (u_threaded_context.c:2252) ==5224== by 0x1434F3F8: pipe_buffer_map_range (u_inlines.h:393) ==5224== by 0x1435094A: _mesa_bufferobj_map_range (bufferobj.c:491) ==5224== by 0x143586D9: map_buffer_range (bufferobj.c:3737) ==5224== by 0x14358DA3: _mesa_MapBuffer (bufferobj.c:3947) ==5224== 240 (192 direct, 48 indirect) bytes in 6 blocks are definitely lost in loss record 1,984 of 2,035 ==5224== at 0x484178A: malloc (vg_replace_malloc.c:380) ==5224== by 0x484670B: realloc (vg_replace_malloc.c:1437) ==5224== by 0x14DBAB9B: update_bo_syncobjs (iris_batch.c:819) ==5224== by 0x14DBADB8: update_batch_syncobjs (iris_batch.c:898) ==5224== by 0x14DBB3D5: _iris_batch_flush (iris_batch.c:1031) ==5224== by 0x14FF72CC: iris_get_query_result (iris_query.c:631) ==5224== by 0x14C4396A: tc_get_query_result (u_threaded_context.c:880) ==5224== by 0x1458F4F7: get_query_result (st_cb_queryobj.c:273) ==5224== by 0x1458F7EB: st_WaitQuery (st_cb_queryobj.c:352) ==5224== by 0x144EFF66: get_query_object (queryobj.c:742) ==5224== by 0x144F01AE: _mesa_GetQueryObjectuiv (queryobj.c:811) And leak with syncobjs: ==13644== 8 bytes in 1 blocks are definitely lost in loss record 1 of 1,846 ==13644== at 0x484186F: malloc (vg_replace_malloc.c:381) ==13644== by 0x639789B: 
iris_create_syncobj (iris_fence.c:69) ==13644== by 0x63B213A: iris_batch_reset (iris_batch.c:512) ==13644== by 0x63B3637: _iris_batch_flush (iris_batch.c:1056) ==13644== by 0x65EF2BC: iris_get_query_result (iris_query.c:631) ==13644== by 0x623B970: tc_get_query_result (u_threaded_context.c:880) ==13644== by 0x5B874F7: get_query_result (st_cb_queryobj.c:273) ==13644== by 0x5B877EB: st_WaitQuery (st_cb_queryobj.c:352) ==13644== by 0x5AE7F66: get_query_object (queryobj.c:742) ==13644== by 0x5AE8150: _mesa_GetQueryObjectiv (queryobj.c:801) Fixes: `ce2e2296ab` ("iris: Suballocate BO using the Gallium pb_slab mechanism") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14387>	2022-01-09 13:43:45 +00:00
Christian Gmeiner	9cb91010ab	iris/ci: update piglit fails Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14442>	2022-01-07 23:12:37 +00:00
Christian Gmeiner	4d624f189e	i915g/ci: update piglit fails Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14439>	2022-01-07 23:00:16 +00:00
Emma Anholt	a2dbdc645f	ci: Shrink container/rootfs sizes. Cutting the extra VK mustpass files is 315MB out of 1.5GB of the amd64 rootfs. pip was 10MB. The rustup toolchains were massive (over a GB IIRC) on the x86 container images. Hopefully helps with #5837 Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14460>	2022-01-07 22:32:31 +00:00
Yiwei Zhang	48712b8cc5	venus: subtract appended header size in vn_CreatePipelineCache Use header->header_size to offset cache data as well in case the header struct extends on a newer driver but the cache data was appended with an old header. Fixes: `723f0bf74a` ("venus: initial support for module and pipelines") Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org> Reviewed-by: Chia-I Wu <olvaffe@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14463>	2022-01-07 22:10:53 +00:00
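Why the stored `header_size` matters more than the current struct size can be shown with a small Python sketch. It assumes a little-endian blob whose first 32-bit field is the header size, as in Vulkan's version-one pipeline cache header; the function name and the example payload are hypothetical and this is not the venus implementation.

```python
import struct


def split_cache_blob(blob):
    """Split a cache blob into (header, payload) using the size stored in it.

    Trusting the stored size, not the size of the header struct known to
    the running driver, keeps the payload offset correct when the blob was
    written with a differently sized header.
    """
    if len(blob) < 4:
        raise ValueError("blob too small to contain a header size")
    (header_size,) = struct.unpack_from("<I", blob, 0)
    if header_size < 4 or header_size > len(blob):
        raise ValueError("invalid header size")
    return blob[:header_size], blob[header_size:]


# A 32-byte header: size, version, vendor ID, device ID, 16-byte UUID.
old_header = struct.pack("<IIII16s", 32, 1, 0x1234, 0x5678, bytes(16))
header, payload = split_cache_blob(old_header + b"renderer-data")
```

Here `payload` starts right after the 32 bytes recorded in the blob, regardless of how large the header struct is in the driver that reads it.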
Danylo Piliaiev	3792fbfcf6	ir3: Assert that we cannot have enough concurrent waves for CS with barrier If we have a compute shader that has a big workgroup, a barrier, and a branchstack which limits max_waves - this may result in a situation when we cannot run concurrently all waves of the workgroup, which would lead to a hang. Blob just explodes in such case. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14110>	2022-01-07 18:40:15 +00:00
Danylo Piliaiev	9ed4d49c97	ir3: Be able to reduce register limit for RA when CS has barriers If barriers are used, it must be possible for all waves in the workgroup to execute concurrently. Thus we may have to reduce the registers limit. Fixes a hang in "Digital Combat Simulator". Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14110>	2022-01-07 18:40:15 +00:00
Hoe Hao Cheng	9323d2ea6d	zink/codegen: remove bogus print statement Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14434>	2022-01-07 17:51:45 +00:00
Hoe Hao Cheng	37f01832eb	zink/codegen: remove core_since in constructor the variable is now automatically filled in according to registry values Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14434>	2022-01-07 17:51:45 +00:00
Hoe Hao Cheng	029e871239	zink/codegen: support platform tags Some extensions are locked behind certain platforms, don't include them if the extension is unsupported. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14434>	2022-01-07 17:51:45 +00:00
Lionel Landwerlin	1d40d53e03	anv: don't leave anv_batch fields undefined Because the extend_cb vfunc is not initialized, there is a risk that the emission code calls into a random pointer. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: mesa-stable Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14418>	2022-01-07 17:28:11 +00:00
Gert Wollny	8685a505e7	ntt: Set the output invariant flag according to the semantics This is used by virglrenderer to create the correct shaders on the host. Fixes: dEQP-GLES31.functional.primitive_bounding_box.triangles.tessellation_set_per_primitive.vertex_tessellation_fragment.fbo when using ntt with virgl. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>	2022-01-07 16:35:43 +00:00
Gert Wollny	6f348d9c99	nir_lower_io: propagate the "invariant" flag to outputs Ultimately this is consumed by nir-to-tgsi and needed by virglrenderer to correctly declare output variables. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>	2022-01-07 16:35:43 +00:00
Gert Wollny	5bfe292708	util/primconvert: map only index buffer part that is needed By putting vertex store and indices all in one buffer the larger part of the shared buffer might actually only be vertex data we are not interested in. Hence only map the part of the buffer that contains the index data for the currently active draw command. This helps drivers where a mapping operation is expensive, like e.g. virgl. v2: - add comment about ranged buffer mapping (Pierre-Eric) - keep passing direct_draws[i].start to direct_draw_func, it looks like the "start" parameter is properly set in util_prim_restart_convert_to_direct v3: Fix ws error (Mike) Related: #5825 Fixes: `f9d12bf50e` vbo/dlist: use a single buffer object Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14423>	2022-01-07 16:35:43 +00:00
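The range computation behind "map only the index buffer part that is needed" is short enough to sketch. This Python snippet is illustrative only: the function name is hypothetical, and it assumes indices are tightly packed with a size of 1, 2, or 4 bytes.

```python
def index_map_range(start, count, index_size):
    """Return (byte offset, byte length) of the indices used by one draw.

    start and count are measured in indices; index_size is in bytes.
    """
    if index_size not in (1, 2, 4):
        raise ValueError("unsupported index size")
    if start < 0 or count < 0:
        raise ValueError("start and count must be non-negative")
    return start * index_size, count * index_size


# A draw that uses 6 16-bit indices starting at index 100.
offset, length = index_map_range(100, 6, 2)
```

In this example only 12 bytes at offset 200 need to be mapped, however much vertex data shares the same buffer.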
Christian Gmeiner	86b19db459	etnaviv/ci: update piglit fails Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14441>	2022-01-07 16:24:12 +00:00
Rhys Perry	1756930a79	radv: increase maxTaskOutputCount to 65535 This is the minimum required by the spec. Fixes dEQP-VK.api.info.vulkan1p2_limits_validation.nv_mesh_shader Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14446>	2022-01-07 15:04:11 +00:00
Connor Abbott	cb45120556	ir3: Use (ss) for instructions writing shared regs The blob uses both nops and (ss). It turns out that in some rare cases the hardware does take more than 6 cycles, at least for movmsk, but adding nops is unnecessary. I believe the extra nops are only there due to the immaturity of the blob's implementation of subgroup ops, so we don't have to copy them - just handle shared reg producers the same as SFU instructions. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>	2022-01-07 14:26:08 +00:00
Connor Abbott	d45678cac4	ir3/postsched: Rename tex/sfu to sy/ss Analogous to the previous commit. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>	2022-01-07 14:26:08 +00:00
Connor Abbott	e6b35d606d	ir3/sched: Rename tex/sfu to sy/ss This now covers e.g. cat6 instructions as well, and ss will cover instructions writing shared regs as well. This is split out from the previous change to avoid too much churn and shouldn't cause any functional changes. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>	2022-01-07 14:26:08 +00:00
Connor Abbott	0cc4aca345	ir3: Use new (sy)/(ss) stall helpers in the compiler This fixes a few bad assumptions in the pre-RA and post-RA scheduler, for example that (sy) is only for texture instructions and (ss) is only for SFU instructions and (sy) and (ss) producers will always take the same number of cycles. This means we now start doing latency hiding for cat6 instructions like ldib and ldc. It also should make us hide latency more aggressively, since the number used for (sy) stall cycles was way lower than the real numbers for everything except ldc. Finally it unifies the various places (ss) soft nops were calculated. selected shader-db results: total nops in shared programs: 345278 -> 358959 (3.96%) nops in affected programs: 215622 -> 229303 (6.34%) helped: 690 HURT: 2430 helped stats (abs) min: 1 max: 125 x̄: 11.40 x̃: 5 helped stats (rel) min: 0.53% max: 100.00% x̄: 24.19% x̃: 18.52% HURT stats (abs) min: 1 max: 501 x̄: 8.87 x̃: 5 HURT stats (rel) min: 0.00% max: 9900.00% x̄: 52.36% x̃: 14.29% 95% mean confidence interval for nops value: 3.78 4.99 95% mean confidence interval for nops %-change: 28.21% 42.66% Nops are HURT. total mov in shared programs: 75049 -> 74110 (-1.25%) mov in affected programs: 15754 -> 14815 (-5.96%) helped: 566 HURT: 455 helped stats (abs) min: 1 max: 36 x̄: 4.52 x̃: 3 helped stats (rel) min: 0.83% max: 100.00% x̄: 35.85% x̃: 30.00% HURT stats (abs) min: 1 max: 35 x̄: 3.55 x̃: 3 HURT stats (rel) min: 0.00% max: 1100.00% x̄: 63.60% x̃: 25.00% 95% mean confidence interval for mov value: -1.25 -0.58 95% mean confidence interval for mov %-change: 2.92% 14.02% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). 
total last-baryf in shared programs: 80468 -> 67670 (-15.90%) last-baryf in affected programs: 63676 -> 50878 (-20.10%) helped: 309 HURT: 147 helped stats (abs) min: 1 max: 260 x̄: 49.20 x̃: 24 helped stats (rel) min: 0.60% max: 98.81% x̄: 37.92% x̃: 40.91% HURT stats (abs) min: 1 max: 115 x̄: 16.35 x̃: 12 HURT stats (rel) min: 0.96% max: 1933.33% x̄: 45.55% x̃: 7.89% 95% mean confidence interval for last-baryf value: -33.03 -23.10 95% mean confidence interval for last-baryf %-change: -21.52% -0.50% Last-baryf are helped. total sstall in shared programs: 133997 -> 126398 (-5.67%) sstall in affected programs: 86866 -> 79267 (-8.75%) helped: 1893 HURT: 598 helped stats (abs) min: 1 max: 77 x̄: 6.06 x̃: 4 helped stats (rel) min: 0.71% max: 100.00% x̄: 32.82% x̃: 16.67% HURT stats (abs) min: 1 max: 65 x̄: 6.47 x̃: 6 HURT stats (rel) min: 0.00% max: 900.00% x̄: 65.51% x̃: 25.00% 95% mean confidence interval for sstall value: -3.39 -2.71 95% mean confidence interval for sstall %-change: -12.19% -6.24% Sstall are helped. total systall in shared programs: 350304 -> 288234 (-17.72%) systall in affected programs: 234855 -> 172785 (-26.43%) helped: 1456 HURT: 260 helped stats (abs) min: 1 max: 574 x̄: 46.42 x̃: 27 helped stats (rel) min: 0.19% max: 100.00% x̄: 39.43% x̃: 36.06% HURT stats (abs) min: 1 max: 757 x̄: 21.20 x̃: 8 HURT stats (rel) min: 0.00% max: 180.95% x̄: 24.82% x̃: 12.50% 95% mean confidence interval for systall value: -39.31 -33.03 95% mean confidence interval for systall %-change: -31.49% -27.90% Systall are helped. 
total waves in shared programs: 236732 -> 235142 (-0.67%) waves in affected programs: 6142 -> 4552 (-25.89%) helped: 535 HURT: 17 helped stats (abs) min: 2 max: 8 x̄: 3.08 x̃: 2 helped stats (rel) min: 12.50% max: 75.00% x̄: 28.78% x̃: 25.00% HURT stats (abs) min: 2 max: 6 x̄: 3.53 x̃: 4 HURT stats (rel) min: 16.67% max: 75.00% x̄: 37.35% x̃: 33.33% 95% mean confidence interval for waves value: -3.04 -2.72 95% mean confidence interval for waves %-change: -28.10% -25.39% Waves are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>	2022-01-07 14:26:08 +00:00
Connor Abbott	7e60978d30	ir3: Introduce systall metric and new helper functions Add new centralized functions which will replace the various places we hardcode 10 for the number of (ss) nops, add numbers for soft (sy) nops based on similar computerator experiments with ldc, sam, and ldib (the most common (sy) producers), and add a "systall" metric which is analogous to sstall. This also fixes some cases where we'd erroneously count ldl* as (sy) producers instead of (ss) producers when calculating sstall. This only switches over the metric reporting to the new functions, so there is no behavior change. The following commit will switch over the rest of the compiler. While we're at it, remove max_sun as it's never set. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>	2022-01-07 14:26:08 +00:00
Connor Abbott	603791bdeb	ir3: Bump type mismatch penalty to 3 After some experimentation with computerator, it seems on a618 that writing a full register and then reading half of it as a half register requires a delay of 6, the same as the delay for cat5/cat6 sources. The other direction only has a delay of 5, but just bump it unconditionally out of an abundance of caution. Fixes: `890de1a436` ("ir3/delay: Fix full->half and half->full delay") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>	2022-01-07 14:26:08 +00:00
Connor Abbott	d371d807eb	ir3/ra: Fix logic bug in compress_regs_left If we're allocating a source then we force is_killed to false, not to true. Fixes a regression in dEQP-GLES31.functional.synchronization.in_invocation.image_atomic_write_read later. Fixes: `0ffcb19b9d` ("ir3: Rewrite register allocation") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14246>	2022-01-07 14:26:08 +00:00
Tomeu Vizoso	c9adcb6051	anv/tests: Free BO cache and device mutex Was getting ASAN errors in CI when trying to add ANV to the debian-testing job: ==10993==ERROR: LeakSanitizer: detected memory leaks Direct leak of 4194304 byte(s) in 64 object(s) allocated from: #0 0x7f763c1bda3c in __interceptor_posix_memalign ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:226 #1 0x55f43d28627f in os_malloc_aligned ../src/util/os_memory_aligned.h:58 #2 0x55f43d28627f in _util_sparse_array_node_alloc ../src/util/sparse_array.c:107 #3 0x55f43d28627f in util_sparse_array_get ../src/util/sparse_array.c:143 #4 0x55f43d1fdaba in anv_device_lookup_bo ../src/intel/vulkan/anv_private.h:1335 #5 0x55f43d1fdaba in anv_device_import_bo_from_host_ptr ../src/intel/vulkan/anv_allocator.c:1843 #6 0x55f43d1ff571 in anv_block_pool_expand_range ../src/intel/vulkan/anv_allocator.c:534 #7 0x55f43d1ffcb5 in anv_block_pool_init ../src/intel/vulkan/anv_allocator.c:417 #8 0x55f43d18f082 in run_test ../src/intel/vulkan/tests/block_pool_no_free.c:123 #9 0x55f43d1862b6 in main ../src/intel/vulkan/tests/block_pool_no_free.c:152 #10 0x7f763b942d09 in __libc_start_main ../csu/libc-start.c:308 Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14121>	2022-01-07 13:33:32 +00:00
Tomeu Vizoso	8a7659a7a2	anv/ci: Test with deqp-vk on Tiger Lake Run half of the CTS in 10 Volteer Chromebook devices. Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14121>	2022-01-07 13:33:32 +00:00
Jesse Natalie	ef27036b4c	shader_info: tess.spacing needs to be unsigned Otherwise MSVC will treat the bit-packed enum values as signed. Reviewed-by: Marek Olák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14402>	2022-01-07 12:16:41 +00:00
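The effect of a signed bit-field on enum values can be emulated in a few lines. This Python sketch only models two's-complement reinterpretation of a narrow field; the function name is hypothetical, and the 2-bit width in the example is an assumption chosen for illustration, not a statement about the actual width of `tess.spacing`.

```python
def read_bitfield(raw, bits, signed):
    """Interpret the low `bits` bits of raw as a bit-field value."""
    mask = (1 << bits) - 1
    value = raw & mask
    if signed and value & (1 << (bits - 1)):
        value -= 1 << bits  # the top bit of the field acts as a sign bit
    return value


# An enum value of 3 stored in a 2-bit field.
as_unsigned = read_bitfield(3, 2, signed=False)
as_signed = read_bitfield(3, 2, signed=True)
```

With the field treated as signed, the stored 3 reads back as -1 and no longer compares equal to the enum constant, which is the kind of mismatch an unsigned field avoids.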
Philipp Zabel	1b808f1dea	etnaviv: fix emit_if in case the else block ends in a jump Fixes piglit test shaders@ssa@fs-if-def-else-break. Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de> Reviewed-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12892>	2022-01-07 12:02:39 +00:00
Rohan Garg	af13119993	intel/fs: OpImageQueryLod does not support arrayed images as an operand When we lower SPIR-V to NIR for textures in vtn_handle_texture, we only bump the number of coordinate components when the op is not a lod query. Update the assert to take this into account. This fixes: - dEQP-VK.robustness.robustness2.bind.template.r32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.r32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rg32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - 
dEQP-VK.robustness.robustness2.bind.template.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.bind.template.rgba32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.r32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - 
dEQP-VK.robustness.robustness2.push.notemplate.rg32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rg32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32f.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32i.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32ui.dontunroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag - dEQP-VK.robustness.robustness2.push.notemplate.rgba32ui.unroll.nonvolatile.sampled_image.no_fmt_qual.null_descriptor.samples_1.cube_array.frag Fixes: `231337a1` ("intel/fs/xehp: Assert that the compiler is sending all 3 coords for cubemaps.") Signed-off-by: Rohan Garg <rohan.garg@intel.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13925>	2022-01-07 10:53:35 +00:00
Emma Anholt	558a600629	nir_to_tgsi: Enable fdot_replicates flag. That's how the TGSI math opcodes work. This lets lower_vec_to_regs coalesce the DP output into the .yzw channels, giving an impressive shader-db win on softpipe: total instructions in shared programs: 2929840 -> 2794036 (-4.64%) instructions in affected programs: 1651438 -> 1515634 (-8.22%) total temps in shared programs: 372730 -> 332744 (-10.73%) temps in affected programs: 118151 -> 78165 (-33.84%) and a minor one on r300: total instructions in shared programs: 51238 -> 51149 (-0.17%) instructions in affected programs: 2621 -> 2532 (-3.40%) total vinst in shared programs: 15655 -> 15618 (-0.24%) vinst in affected programs: 468 -> 431 (-7.91%) total temps in shared programs: 9838 -> 9828 (-0.10%) temps in affected programs: 59 -> 49 (-16.95%) and a bigger one on i915g: total instructions in shared programs: 398064 -> 395901 (-0.54%) instructions in affected programs: 29271 -> 27108 (-7.39%) total tex_indirect in shared programs: 12261 -> 12233 (-0.23%) tex_indirect in affected programs: 98 -> 70 (-28.57%) LOST: 0 GAINED: 5 The r300 change is less impressive because it does some backend copy-prop, but also because intermediate storage of DPs now takes a vec4 instead of a scalar. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14200>	2022-01-07 09:58:24 +00:00
Christian Gmeiner	85d7d520b9	panfrost/ci: update piglit fails Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14428>	2022-01-07 09:06:20 +00:00
Francisco Jerez	054eb9f346	intel/dev: Implement DG2 restrictions requiring additional DSSes to be disabled. Note that this causes a geometry slice to be disabled if any DSS is fused off within that slice, which may seem stricter than the BSpec quotation implies, but testing shows that pixel pipes with any faulted DSS don't work at all, and that using a slice with any faulted pixel pipe leads to serious graphics corruption. It would be better to query this geometry topology information from the hardware instead of trying to reconstruct it here, but the kernel interface for that is not available yet. Reviewed-by: Caio Oliveira <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14436>	2022-01-07 07:58:27 +00:00


148843 Commits All Branches Search
