KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Danylo Piliaiev	1a2f1e3f47	turnip: fill VkMemoryDedicatedRequirements We support VK_KHR_dedicated_allocation so we must fill VkMemoryDedicatedRequirements. Vulkan spec states: "[...] requiresDedicatedAllocation may be VK_TRUE under one of the following conditions: The pNext chain of VkImageCreateInfo for the call to vkCreateImage used to create the image being queried included a VkExternalMemoryImageCreateInfo structure, and any of the handle types specified in VkExternalMemoryImageCreateInfo::handleTypes requires dedicated allocation, as reported by vkGetPhysicalDeviceImageFormatProperties2 in VkExternalImageFormatProperties::externalMemoryProperties.externalMemoryFeatures, the requiresDedicatedAllocation field will be set to VK_TRUE." All handle types require dedicated allocation at the moment. Fixes: dEQP-VK.api.external.memory.opaque_fd.dedicated.image.info dEQP-VK.memory.requirements.dedicated_allocation.buffer.regular dEQP-VK.memory.requirements.dedicated_allocation.image.transient_tiling_optimal Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9086>	2021-03-12 11:56:47 +02:00
Tapani Pälli	d7b3454af3	anv: fix compilation due to missing vk_format_from_android Fixes: `4fb6c051c9` ("anv: Move vk_format helpers to common code") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4428 Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9549>	2021-03-12 10:35:01 +02:00
Tapani Pälli	0759822f64	anv/android: fix compilation failure Fixes: `3e6d3bca1d` ("anv/android: Fix size check for imported gralloc bo") Signed-off-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9549>	2021-03-12 10:34:53 +02:00
Dave Airlie	49bb53ba43	lavapipe: add EXT_sampler_filter_minmax support Hook up the extension Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9423>	2021-03-12 16:02:30 +10:00
Dave Airlie	6adbf6c86c	llvmpipe: add reduction mode support Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9423>	2021-03-12 16:02:25 +10:00
Dave Airlie	1fb43ae9bf	lavapipe: enable KHR_multiview Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	cbd01045bc	lavapipe: add render pass support for multiview Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	3c08eee1bd	lavapipe: add input attachment support for multiview Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	8c6d4d470e	lavapipe: add draw support for multiview Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	4d72515e32	lavapipe: add clear support for multiview Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	e81cd37363	llvmpipe: add view index support to rasterizer Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	b76242b9c8	llvmpipe: add the view index callback from draw This just stores the view index into setup Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	b5f686c93b	draw: add tess/gs support for multiview index Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	a2bee6df5f	draw/vs: pass the view index to the vertex shader Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	a417843a3c	draw: pass the view index to the render driver Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	03cbb7b104	draw: add view_mask rendering support This loops the draws per-view above the instance rendering Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	b10b55f3d3	draw: refactor out the instances drawing code This can be reused nicely for multiview if refactored out Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	267d216bcb	draw: add interface to notify renderer of the current view index Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	9f0fd85474	gallivm: add support for load_view_index intrinsic This just adds the system value Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	974f2e6c6a	gallivm: mark subpass input attachments as 2d arrays This matters when multiview is enabled. Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Dave Airlie	e3b8f449e1	gallium: add a view mask to the draw command This allows the caller to specify the view mask for this draw in a multiview draw environment This has been packed into the upper nibble and 2 bits of the index size to retain the struct size as small as possible for tc. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9399>	2021-03-12 05:05:51 +00:00
Timothy Arceri	684f97de80	glsl: fix declarations of gl_MaxVaryingFloats gl_MaxVaryingFloats was not removed from core until 4.20 and is still available in compat shaders. Found while writing some new CTS to test the correct declarations of this constant. Fixes: 0ebf4257a385i ("glsl: define some GLES3 constants in GLSL 4.1") Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9514>	2021-03-12 04:30:32 +00:00
Jason Ekstrand	6d16d929f3	iris: Add an iris_write_reg macro Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9537>	2021-03-12 04:17:39 +00:00
Jason Ekstrand	5b792d79a4	anv: Add an anv_batch_write_reg macro Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9537>	2021-03-12 04:17:39 +00:00
Jason Ekstrand	5f192b190f	anv,genxml: Handle L3SQCREG1_SQGHPCI in GenXML Technically, this is only one field on IVB but it's two on BYT and so it makes things easier if we split it for all Gen7. While we're here, make some of the other fields in L3SQCREG1 Booleans. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9537>	2021-03-12 04:17:39 +00:00
Ilia Mirkin	987fef5f0e	nvc0: enable minmax reductions on gm200+ Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9487>	2021-03-12 00:05:24 +00:00
Ilia Mirkin	41aad1c120	st/mesa: add EXT_texture_filter_minmax support This also trivially adds ARB_texture_filter_minmax, since the EXT variant is a strict superset. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9487>	2021-03-12 00:05:24 +00:00
Ilia Mirkin	6384dcaf7c	mesa: add tracking of reduction mode This is used to expose ARB/EXT_texture_filter_minmax. Note that only the EXT_* enable is provided since the ARB one would require proper handling of some formats not being supported. For now this is force-enabled for everything. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9487>	2021-03-12 00:05:24 +00:00
Dave Airlie	7c999249ef	gallium: add a sampler reduction cap + settings This is to allow for VK_EXT_sampler_filter_minmax GL_EXT_texture_filter_minmax support Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9487>	2021-03-12 00:05:24 +00:00
Michael Tang	8016a098fc	microsoft/spirv_to_dxil: Fix spirv2dxil I/O to use binary mode Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9513>	2021-03-11 23:43:47 +00:00
Michael Tang	d4a51160ad	util: Make os_read_file use O_BINARY on Windows Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9513>	2021-03-11 23:43:47 +00:00
Eric Anholt	5785fdac63	u_format: Mark the generated pack/unpack src/dst args as restrict. Calling code to pack/unpack with overlap would already be undefined. Cuts 50k of text on x86_64 release builds from the compiler having more freedom in the src/dst loads knowing that they don't interfere with each other. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Adam Jackson <ajax@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9500>	2021-03-11 23:26:34 +00:00
Anuj Phogat	d4b231bb80	intel/isl: Drop intel_ prefix in function names This change is in line with naming convention used in isl. We want to keep intel_ prefix reserved for common code. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9532>	2021-03-11 23:01:56 +00:00
Ian Romanick	da7389eced	nir/range_analysis: Simplify analysis of bcsel union_ranges was previously guarded by 'ifndef NDEBUG'. After removing that, I noticed that the two tables were identical. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Ian Romanick	7019cd84c0	nir/search: Use range analysis for is_finite There are only a couple patterns that use is_finite, so the changes aren't huge. Mostly shaders from Batman Arkham City and a few shaders from Shadow of the Tomb Raider were affected. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake Instructions in all programs: 160902591 -> 160902489 (-0.0%) SENDs in all programs: 6812270 -> 6812270 (+0.0%) Loops in all programs: 38225 -> 38225 (+0.0%) Cycles in all programs: 7429003266 -> 7428992369 (-0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304539 -> 304539 (+0.0%) Ice Lake Instructions in all programs: 145301634 -> 145301460 (-0.0%) SENDs in all programs: 6863890 -> 6863890 (+0.0%) Loops in all programs: 38219 -> 38219 (+0.0%) Cycles in all programs: 8798589772 -> 8798575869 (-0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334250 -> 334250 (+0.0%) Skylake Instructions in all programs: 135892010 -> 135891836 (-0.0%) SENDs in all programs: 6802916 -> 6802916 (+0.0%) Loops in all programs: 38216 -> 38216 (+0.0%) Cycles in all programs: 8442597324 -> 8442583202 (-0.0%) Spills in all programs: 194839 -> 194839 (+0.0%) Fills in all programs: 301116 -> 301116 (+0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Ian Romanick	f4a7dbc58f	nir/range_analysis: Fix analysis of fmin, fmax, or fsat with NaN source Recall that when either value is NaN, fmax will pick the other value. This means the result range of the fmax will either be the "ideal" result range (calculated above) or the range of the non-NaN value. Previously, something like fmax({gt_zero}, {lt_zero, is_a_number}) would return a range of gt_zero. However, if the "gt_zero" parameter is NaN, the actual result will be the "lt_zero" parameter. This analysis depends on the is_a_number analysis also added in this MR. Assuming this doesn't cause any unforeseen problems, I believe we should wait a bit, then nominate a subset of the series for the stable branches. This fixes the piglit tests tests/spec/glsl-1.30/execution/range_analysis_fmax_of_nan.shader_test tests/spec/glsl-1.30/execution/range_analysis_fmin_of_nan.shader_test from https://gitlab.freedesktop.org/mesa/piglit/-/merge_requests/463. Even with the added fsat fixes, range_analysis_fsat_of_nan.shader_test still fails. There are some other issues there that will be addressed in later commits (in another MR). v2: Add fsat fixes. Suggested by Rhys. Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Shader-db results: All Intel platforms had similar results. (Tiger Lake shown) total instructions in shared programs: 21049290 -> 21049314 (<.01%) instructions in affected programs: 3175 -> 3199 (0.76%) helped: 0 HURT: 17 HURT stats (abs) min: 1 max: 3 x̄: 1.41 x̃: 1 HURT stats (rel) min: 0.20% max: 1.89% x̄: 0.97% x̃: 0.92% 95% mean confidence interval for instructions value: 1.09 1.73 95% mean confidence interval for instructions %-change: 0.75% 1.19% Instructions are HURT. 
total cycles in shared programs: 855136176 -> 855136406 (<.01%) cycles in affected programs: 37579 -> 37809 (0.61%) helped: 0 HURT: 17 HURT stats (abs) min: 12 max: 20 x̄: 13.53 x̃: 14 HURT stats (rel) min: 0.17% max: 1.13% x̄: 0.79% x̃: 0.91% 95% mean confidence interval for cycles value: 12.53 14.53 95% mean confidence interval for cycles %-change: 0.63% 0.94% Cycles are HURT. Fossil-db results: Tiger Lake Instructions in all programs: 160901033 -> 160902591 (+0.0%) SENDs in all programs: 6812270 -> 6812270 (+0.0%) Loops in all programs: 38225 -> 38225 (+0.0%) Cycles in all programs: 7430016795 -> 7429003266 (-0.0%) Spills in all programs: 192582 -> 192582 (+0.0%) Fills in all programs: 304539 -> 304539 (+0.0%) Ice Lake Instructions in all programs: 145299102 -> 145301634 (+0.0%) SENDs in all programs: 6863890 -> 6863890 (+0.0%) Loops in all programs: 38219 -> 38219 (+0.0%) Cycles in all programs: 8798390846 -> 8798589772 (+0.0%) Spills in all programs: 216880 -> 216880 (+0.0%) Fills in all programs: 334250 -> 334250 (+0.0%) Skylake Instructions in all programs: 135889478 -> 135892010 (+0.0%) SENDs in all programs: 6802916 -> 6802916 (+0.0%) Loops in all programs: 38216 -> 38216 (+0.0%) Cycles in all programs: 8442624166 -> 8442597324 (-0.0%) Spills in all programs: 194839 -> 194839 (+0.0%) Fills in all programs: 301116 -> 301116 (+0.0%) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Ian Romanick	aa5d38decd	nir/range_analysis: Add "is a number" range analysis tracking This commit is necessary to support "nir/range_analysis: Fix analysis of fmin and fmax with NaN". No shader-db or fossil-db changes on any Intel platform. v2: Pack and unpack is_a_number. v3: Don't set is_a_number of integer constants. The bit pattern might be NaN. v4: Update handling of b2i32. intBitsToFloat(int(true)) is 1.401298464324817e-45. Return a value consistent with that. Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Ian Romanick	d4f21b53f2	nir/range_analysis: Add "is finite" range analysis tracking The obvious changes to nir_search_helpers.h are in a separate commit to limit the scope of this change. These additions are really only needed to support the next commit "nir/range_analysis: Add "is a number" range analysis tracking". This reduction in scope is intended to increase the suitability for stable branches. No shader-db or fossil-db changes on any Intel platform. v2: Pack and unpack is_finite. v3: Split nir_search_helpers.h changes into a separate commit. v4: Remove assertion intended for the next commit. Update is_finite comment for fsign. Both noticed by Rhys. Fix is_finite handling for load_const vectors. If any element is not finite, set the flag to false. This is the same way is_integral is already handled. v5: Update handling of b2i32. intBitsToFloat(int(true)) is 1.401298464324817e-45. Return a value consistent with that. Fixes: `405de7ccb6` ("nir/range-analysis: Rudimentary value range analysis pass") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Ian Romanick	86fb53b1be	nir/range_analysis: Refactor fsat handling This will greatly simplify a later commit. The assert(r.is_integral) in the eq_zero case is dropped because I don't think it's useful anymore. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9108>	2021-03-11 22:00:30 +00:00
Axel Davy	767270e809	st/nine: Check memfd_create support glibc introduced memfd_create only in its 2.27 release. Check memfd_create support by verifying HAVE_MEMFD_CREATE is defined. Fixes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377 Reported by Roman Elshin in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9451 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9483>	2021-03-11 21:29:51 +00:00
Danylo Piliaiev	ae3b95daa7	turnip: lower device index to zero Vulkan 1.1 has VK_KHR_device_group and VK_KHR_device_group_creation promoted to core, thus we should handle DeviceIndex built-in. While we are here, also add these extensions to the extensions list, even though they are not doing anything useful. Fixes test: dEQP-VK.compute.device_group.device_index Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9516>	2021-03-11 21:12:52 +00:00
Connor Abbott	ee1f140fd9	freedreno/a6xx: Cleanup SP_XS_CTRL_REG0 definitions The registers were actually different per-stage even though we used the same type, which resulted in a bunch of incorrectly programmed fields and confusion. Move the stage-specific values to the registers themselves, which makes things much less confusing and makes it possible to set "mergedregs" correctly. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9493>	2021-03-11 20:58:39 +00:00
Connor Abbott	9a5596d679	freedreno/registers: Handle typed registers with fields When a bitset is "inline" it should act as if its fields were inserted into the register itself. However when initializing the register's bitfield we weren't doing a deep copy of the inline bitfield, so if the register defined additional fields then they would get added to the original inline bitfield and any further registers with the same type would get them. Fix this. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9493>	2021-03-11 20:58:39 +00:00
Connor Abbott	8d55a1e112	freedreno/a6xx: Fix compute threadsize type And use the variable for the other threadsize field. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9493>	2021-03-11 20:58:39 +00:00
Connor Abbott	1d8bf2d0bf	freedreno/computerator: Fix thrsz type And use it for the other thread size field, too Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9493>	2021-03-11 20:58:39 +00:00
Lionel Landwerlin	f3cf70dc8d	intel/tools: fix meson warning Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4434 Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9524>	2021-03-11 20:52:20 +00:00
Pierre Moreau	4a408ff7ea	spirv: Ignore WorkgroupSize in non-compute stages If a SPIR-V module contains for example both a geometry and a compute shader, when processing the geometry shader its vertices out, input primitive and output primitive attributes would get overwritten by the value of the WorkgroupSize. ``` ; SPIR-V ; Version: 1.5 ; Generator: Khronos; 17 ; Bound: 12 ; Schema: 0 OpCapability Geometry OpCapability Shader %1 = OpExtInstImport "GLSL.std.450" OpMemoryModel Logical GLSL450 OpEntryPoint Geometry %main "main" OpEntryPoint GLCompute %main_0 "main" OpExecutionMode %main InputPoints OpExecutionMode %main Invocations 1 OpExecutionMode %main OutputTriangleStrip OpExecutionMode %main OutputVertices 4 OpExecutionMode %main_0 LocalSize 1 1 1 OpSource GLSL 460 OpSource GLSL 460 OpName %main "main" OpName %main_0 "main" OpModuleProcessed "Linked by SPIR-V Tools Linker" OpDecorate %gl_WorkGroupSize BuiltIn WorkgroupSize %void = OpTypeVoid %6 = OpTypeFunction %void %uint = OpTypeInt 32 0 %v3uint = OpTypeVector %uint 3 %uint_1 = OpConstant %uint 1 %gl_WorkGroupSize = OpConstantComposite %v3uint %uint_1 %uint_1 %uint_1 %main = OpFunction %void None %6 %10 = OpLabel OpReturn OpFunctionEnd %main_0 = OpFunction %void None %6 %11 = OpLabel OpReturn OpFunctionEnd ``` Running spirv_to_nir on the SPIR-V sample above and for the geometry entry point would say that (among others): * vertices out: 1 * input primitive: LINES * output primitive: LINES By removing any reference to `%gl_WorkGroupSize`, the output would change to (among others): * vertices out: 4 * input primitive: POINTS * output primitive: TRIANGLE_STRIP Fixes: `7d862ef530` ("spirv: Rework handling of spec constant workgroup size built-ins") v2: * Move the check from inside `handle_workgroup_size_decoration_cb()` to its caller (Caio Marcelo de Oliveira Filho) * Add an assert on the shader stage before using `workgroup_size_builtin` (Caio Marcelo de Oliveira Filho) Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Signed-off-by: Pierre Moreau <dev@pmoreau.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9418>	2021-03-11 20:30:38 +00:00
Anuj Phogat	9d95e1bd79	i965: Rename files with "intel_" prefix to "brw_" v2: Rename intel_batchbuffer.c to intel_batch.c and intel_batchbuffer.h to intel_batch.h Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9510>	2021-03-11 10:14:33 -08:00
Anuj Phogat	3096788e5c	i965: Remove blank line at EOF Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9510>	2021-03-11 09:43:03 -08:00
Rhys Perry	38b2e13766	aco: remove vmem/smem score statistics Replaced by the Latency statistic. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	a0243f5c47	aco: add ACO_DEBUG=perfinfo This prints the program with each instruction's contribution to its latency and various factors for the calculation of the Inverse Throughput statistic. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	5d6a1095bf	aco: add print option to print program without temporary IDs Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	23ecceb160	aco: add latency and inverse throughput statistics Latency is the estimated duration of a single wave, ignoring others in the CU. It is similar to the old cycles statistic except it's more accurate and considers memory operations. The InvThroughput statistic is a combination of MaxWaves, Latency and the portion of the wave's execution which does not use various resources. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	83ce9407f2	aco: add instruction classes These should mostly match LLVM. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	0af7ff49fd	aco: lower p_constaddr into separate instructions earlier This allows them to be scheduled properly and simplifies the assembler a little. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 16:31:19 +00:00
Rhys Perry	ab957bb899	aco: move wait_imm to aco_ir.h Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 15:35:34 +00:00
Rhys Perry	7d5643c0fe	aco: track divergent and uniform branch depth Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 15:35:30 +00:00
Rhys Perry	8f71be0a7b	aco: simplify loop_nest_depth tracking in isel Keep track of the current loop depth in Program and set the depth inside Program::insert_block() instead of repeating it every time we insert one. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8994>	2021-03-11 15:35:24 +00:00
Boris Brezillon	442fbcdb47	panfrost: Expose panfrost_modifier_to_layout() Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9517>	2021-03-11 15:10:58 +00:00
Boris Brezillon	825b1f9446	panfrost: Split the sampler and texture count The texture and sampler descriptors are well separated in Vulkan, let's add a new field to allow mixing sampler and texture descs. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9517>	2021-03-11 15:10:58 +00:00
Boris Brezillon	b0f968cf5c	panfrost: Don't count the special vertex/instance ID attributes on Bifrost On Bifrost the vertex/instance ID are preloaded in special registers, no need to add special attribute entries. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9517>	2021-03-11 15:10:58 +00:00
Boris Brezillon	7b9dfc502a	panfrost: Print the correct UBO size when dumping UBO information There's a minus(1) modifier on the entries field. Take it into account. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9517>	2021-03-11 15:10:58 +00:00
Boris Brezillon	3559efb9bf	panfrost: Allow passing an explicit UBO index for the sysval UBO UBO index assignment is a bit special in Vulkan, it's based on the descriptor set layout, which doesn't know about shaders' internal UBOs (our sysval UBOs). Extend the backend compilers so we can place sysval UBOs where we want: after all explicit UBOs. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9517>	2021-03-11 15:10:58 +00:00
Boris Brezillon	92d9f090d9	panfrost: Add a knob to disable the UBO -> push constants optimization I'm just too lazy to implement the logic to prepare push constant buffers in the Vulkan driver. Besides, Vulkan has explicit push constants, which AFAIK is not handled in the compiler backends yet, and that will probably conflict with the UBO -> push constant promotion. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9517>	2021-03-11 15:10:57 +00:00
Lucas Stach	2229328cf9	renderonly: close the gpu fd when destroying renderonly Currently the screen destruction closes the dup'ed fd, but not the original renderonly gpu fd, which is kept around for the lifetime of the renderonly. Squashed revert of "vc4: Don't leak the GPU fd for renderonly usage." (commit `99ef66c325`) as requested by Eric. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6983>	2021-03-11 14:41:48 +00:00
Lucas Stach	187218395d	renderonly: remove layering violations The renderonly object is something the winsys creates, so the pipe driver has no business in memcpying or freeing it. Move those bits to the winsys. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6983>	2021-03-11 14:41:48 +00:00
Alyssa Rosenzweig	5487847d8c	pan/bi: Implement u{add, sub}_sat Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9520>	2021-03-11 14:30:19 +00:00
Boris Brezillon	3c7634f7d2	pan/bi: Extend the bi_builder to support type variants correctly Some opcodes come with both type and size variants. Right now, only the size is taken into account. Extend the builder to provide wrappers that take a nir_type in addition to the bitsize. While at it, fix wrappers taking a compare operator to use the proper .{i,s,u} variant based on the comparison (equal and non-equal should use .i, other comparisons should use .{u,s}). Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9520>	2021-03-11 14:30:19 +00:00
Boris Brezillon	0113a0a1ee	panfrost: Move pan_special_varying definition to pan_encoder.h Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9520>	2021-03-11 14:30:19 +00:00
Boris Brezillon	1f99bba06e	panfrost: Add a pan_section_offset() helper Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9520>	2021-03-11 14:30:19 +00:00
Boris Brezillon	1758da0a7e	panfrost: Allow passing an explicit global dependency when queuing a job We will have 2 compute jobs per indexed indirect draw, one doing the min-max index search and one patching the cmdstream. The second compute job needs to depend on the first one, as well as the previous indirect draw job to avoid corrupting the indirect draw context which is shared at the batch level (global dependency). Instead of handling that case in panfrost_add_job(), extend panfrost_add_job() to accept an explicit global dependency. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9520>	2021-03-11 14:30:19 +00:00
Boris Brezillon	0bb091fd7c	panfrost: Add a parameter to suppress next job prefetching This is needed for indirect draws so the compute job can patch the vertex/tiler jobs which are following in the chain. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9520>	2021-03-11 14:30:19 +00:00
Boris Brezillon	00b85a0aaf	panfrost: Split the direct and indirect draw logic Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9520>	2021-03-11 14:30:19 +00:00
Boris Brezillon	691c47dd6c	pan/bi: Move int64 lowering before idiv lowering Otherwise all 64-bit divisions will be skipped. Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9520>	2021-03-11 14:30:19 +00:00
Boris Brezillon	f7bbfbaeb5	Revert "pan/bi: Optimize out redundant jumps to #0x0" A block that has all its successors empty is not necessarily a leaf block in the CFG, and removing the JUMP in that case causes the shader to continue executing code from another block instead of exiting. This reverts commit `a496b41d50`. Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9520>	2021-03-11 14:30:19 +00:00
Rhys Perry	35fe62dad1	radv/llvm: fix enabled_channels for compressed exports The old values seemed to work fine, but the ISA docs recommend 0x0,0x3,0xc and 0xf: COMPR==1: export half-dword enable. Valid values are: 0x0,3,c,f [0] enables VSRC0 : R,G from one VGPR (R in low bits, G high) [2] enables VSRC1 : B,A from one VGPR (B in low bits, A high) Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9459>	2021-03-11 13:54:18 +00:00
Rhys Perry	341dd9d834	aco: set compr for fp16 exports Obviously this didn't affect correctness. Not sure about performance. It also changes enabled_channels to match radeonsi. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `f29c81f863` ("aco: use VOP2 for v_cvt_pkrtz_f16_f32 if possible") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9459>	2021-03-11 13:54:18 +00:00
Marek Olšák	e6a0f243ea	radeonsi: update pipe_screen::num_contexts This allows skipping mutex locking. Don't take the aux context into account. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9356>	2021-03-11 05:05:39 +00:00
Marek Olšák	981e55d530	gallium: add pipe_screen::num_contexts for skipping mutex locking in util_range Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9356>	2021-03-11 05:05:39 +00:00
Marek Olšák	728aa749ea	gallium/u_threaded: don't sync in create_stream_output_target Manhattan needs this. radeonsi can handle it since https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9028/diffs?commit_id=33ac9dec91d07ef353e110ac376842d84ec539b4. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9356>	2021-03-11 05:05:39 +00:00
Rob Clark	c4e5beef07	freedreno: threaded_context async flush support Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	9dbe2405a3	freedreno: threaded_context support Currently only initialized for a6xx, mostly because that is the easiest setup for me to test and debug at the moment. But the couple a6xx changes should not require counterparts in older gens. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	1a12d682a6	freedreno: Check cb0 in rebind_resource() Previously we were expecting cb0 to be user_buffer. (We did in some cases upload it to a gpu buffer, but this was an internally allocated buffer and not something subject to rebind.) But with TC it becomes a gpu buffer. (Technically, with pctx->const_uploader, we shouldn't hit the rebind path for cb0, but better to not try to be overly clever.. sooner or later that would bite us.) Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	00eb60ee59	freedreno/a6xx: Move UBWC demotion to first sampler view bind With threaded_context, CSO creation happens in the frontend thread, which means it is no longer safe to do blits (if needed, for sampler views with format that cannot be UBWC). So move this to the first time that the sampler view is bound. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	acc2c015b3	freedreno: Add transfer_pool_unsync With threaded_context, in the TC_TRANSFER_MAP_UNSYNC case, we are getting called from the frontend thread, rather than driver thread. So we need a different slab_child_pool for that. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	0c163e0a45	freedreno: Add fd_replace_buffer_storage() This will be used by threaded_context to avoid stalls in the DISCARD_WHOLE_RESOURCE case (and DISCARD_RANGE cases that can be promoted to DISCARD_WHOLE_RESOURCE). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	63649e4101	freedreno: Extract out helper for transfer-map flag munging Split out the usage simplification from main part of transfer_map and handle the threaded-context specific TC_TRANSFER_x flags. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	4f07a24e41	freedreno: Extend threaded_transfer Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	1017dc9f6e	freedreno: Extend threaded_resource No functional change, just big churny Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	5fbaa8033b	freedreno: Restructure transfer_map() Separate the parts that, with threaded_context, can be called from either driver or frontend thread. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	39d6343a3e	freedreno: Split out batch/resource tracking For threaded_context, to properly handle replace_buffer_storage, we'll need to handle multiple "iterations" of a resource using the same tracking in order to implement transfer_map() correctly. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:16 +00:00
Rob Clark	f74ccde2c7	freedreno: Factor out common fd_resource init Before adding new things that would need initialization in both paths, refactor out a shared helper. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:15 +00:00
Rob Clark	bcf4562528	freedreno: Fix u_blitter constant-buffer leak We didn't see this before without threaded_context because we (normally) wouldn't upload cb0 (the slot u_blitter uses). But with cb0 getting uploaded we could hit a leak due to constant state only being restored in the fd_blitter_clear() path. Move cb0 save to the one path that uses it. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:15 +00:00
Rob Clark	9425b1343e	gallium/u_threaded: use mesa_log for debug msgs On android, this will show up in logcat, rather than being lost into the ether. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:15 +00:00
Rob Clark	f2f72ec3fe	gallium/u_threaded: Add helper to assert driver thread Useful for drivers to add some sanity checks to avoid/detect threading issues caused by things that might be called (indirectly) from frontend thread. Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:15 +00:00
Rob Clark	d2a920ee6e	util: Extract thread-id helpers from u_current Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9323>	2021-03-11 04:42:15 +00:00
Timothy Arceri	1772569449	Revert "glsl: default to compat shaders in compat profile" This reverts commit `6c8cc9be12`. A spec bug was resolved confirming the original behaviour. Also it seems the game Foundation no longer depends on the incorrect behaviour. Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9486>	2021-03-11 04:09:49 +00:00
Douglas Anderson	217d6594de	gallium/indices: Use "__restrict" to help the compiler In a perf trace translate_quads_uint2uint_last2last_prdisable() was showing up as a huge hot spot. Digging through the assembly on arm64 found that the compiler wasn't doing any read caching. Specifically, the generated code looked roughly like this: out[j+0] = in[i+0]; out[j+1] = in[i+1]; out[j+2] = in[i+3]; out[j+3] = in[i+1]; out[j+4] = in[i+2]; out[j+5] = in[i+3]; ...and the compiler was loading "i+1" and "i+3" from memory twice for no reason (instead of caching it). If we sprinkle generous amounts of the `__restrict` keyword then the compiler is able to be much smarter. Not only does it avoid double-loading but it also generates better instructions. It uses two LDRD instructions instead of 6 LDR instructions and uses some STRD too. In one example test this increased FPS from ~25.7 to ~34.5. Change-Id: I88bf8bd9ac421fe48a7d6961e224425c3ae7beee Reported-by: Rob Clark <robdclark@chromium.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9485>	2021-03-11 03:14:31 +00:00
Jason Ekstrand	e7e297732e	vulkan/alloc: Use char * for pointer arithmetic MSVC doesn't like arithmetic on void *. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9511>	2021-03-10 20:59:59 -06:00
Jason Ekstrand	492b5577f0	vulkan/util: Add a type parameter to vk_multialloc_add We also switch from using __alignof__ to alignof() in util/macros.h which works on MSVC with the one unfortunate downside of requiring an actual type and not a value. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9511>	2021-03-10 20:59:56 -06:00
Jason Ekstrand	c120edd8e8	vulkan/alloc: Add VK_MULTIALLOC_DECL macros These both declare the variable and add it to the allocator in one go. Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9511>	2021-03-10 20:59:55 -06:00
Jason Ekstrand	5afdbfe0c8	vk/alloc: Handle zero sizes better in vk_multialloc_add Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9511>	2021-03-10 20:59:53 -06:00
Jason Ekstrand	c22267262e	vulkan: Use ALWAYS_INLINE for multialloc This way it properly compiles on Visual Studio. Fixes: `145444d265` "anv: Move multialloc to common code" Acked-by: Daniel Stone <daniels@collabora.com> Acked-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9506>	2021-03-10 23:15:17 +00:00
Anuj Phogat	96e251bde7	intel: Rename "GEN_" prefix used in common code to "INTEL_" This patch renames all macros with "GEN_" prefix defined in common code. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>	2021-03-10 22:23:51 +00:00
Anuj Phogat	65d7f52098	intel: Fix broken alignment due to gen_ prefix renaming Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>	2021-03-10 22:23:51 +00:00
Anuj Phogat	692472a376	intel: Rename "gen_" prefix used in common code to "intel_" This patch renames functions, structures, enums etc. with "gen_" prefix defined in common code. Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>	2021-03-10 22:23:51 +00:00
Anuj Phogat	733b0ee8cb	intel: Rename files with gen_ prefix in common code to intel_ Changes in this patch include: - Rename all files in src/intel/common path - Update the filenames used in source and build files Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>	2021-03-10 22:23:51 +00:00
Jason Ekstrand	b9e9f92f73	intel/fs: Handle payload node interference in destinations Starting with `d0d039a4d3`, we emit writes to the push constant chunk of the payload to stomp out-of-bounds data to zero for Vulkan. Then, in `369eab9420`, we started emitting shader preamble code for emulated push constants on Gen12.5 parts. In either of these cases, we can run into issues if we don't have a proper live range for some of the payload registers where they get used for something and then smashed by our push handling code. We've not seen many issues with this yet because it only happens when you have dead push constants. Fixes: `d0d039a4d3` "anv: Emit pushed UBO bounds checking code..." Fixes: `369eab9420` "intel/fs: Emit code for Gen12-HP indirect..." Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9501>	2021-03-10 22:17:41 +00:00
Jason Ekstrand	8b7c2f1800	intel/fs: Use INTEL_MASK for pushish constant address masking It's easier to compare with the HW docs than a pile of hex. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9501>	2021-03-10 22:17:41 +00:00
Yannik Marek	369f9d225d	turnip: fix alpha to coverage in no color and unused attachment cases In cases where the alpha coverage is enabled but the color attachment is either unused or absent there should be a dummy mrt to make the draw behave correctly. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Matt Turner <mattst88@gmail.com> Signed-off-by: Yannik Marek <yannik@marek.ca> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8952>	2021-03-10 22:02:43 +00:00
Adam Jackson	ea27f2bf09	zink: Fix a thinko in instance setup It really does help to size these arrays correctly. Fixes: `2b4fcf0a06` zink: generate instance creation code with a python script Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9499>	2021-03-10 20:19:00 +00:00
Matt Turner	6ceb6b509e	turnip: Remove unused TU_DEBUG_IR3 flag Replaced by IR3_SHADER_DEBUG=disasm,{vs,...,cs} and unused since the commit referenced below. Fixes: `808992fc50` ("tu: Use the ir3 shader API") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8249>	2021-03-10 18:59:22 +00:00
Eric Anholt	eba1b2a1ba	ci/freedreno: Mark another a5xx TF flake. Showed up with an iommu fault preceding it each time it failed. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9488>	2021-03-10 18:44:16 +00:00
Marek Olšák	e39336a21e	radeonsi: enable RGP on gfx10.3 It seems to work on VanGogh. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9492>	2021-03-10 18:31:04 +00:00
Jason Ekstrand	5d8fa880d6	radv: Drop CreateRenderPass We can use the generic fall-back which calls CreateRenderPass2 instead. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Jason Ekstrand	8304b4eef7	radv/meta: Use CreateRenderPass2 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Jason Ekstrand	24414e7ec4	anv: Drop CreateRenderPass Fall back to the common implementation instead. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Jason Ekstrand	b302159b1c	vulkan: Preserve preserve attachments in CreateRenderPass This is trivial so I really don't know why it wasn't handled in the initial turnip code. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Jason Ekstrand	147187f754	vulkan: Add some asserts and checks for multiview in CreateRenderPass Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Jason Ekstrand	5de355b0f9	vulkan: Use correct aspectMask in CreateRenderPass If a VkRenderPassInputAttachmentAspectCreateInfo is provided, we use the aspects specified there. Otherwise, we default to every aspect in the format. For attachments which are not input attachments, aspectMask is left zero. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Jason Ekstrand	4fb6c051c9	anv: Move vk_format helpers to common code The Android ones we put in anv_android.c. Maybe one day we'll want a vk_android.h to put some common Android stuff but, for now, let's keep it contained to ANV's android code. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Jason Ekstrand	c7345bd1fb	vulkan: Use VK_MULTIALLOC in CreateRenderPass The variable-length stack allocations are causing issues with ubsan when the array size is zero. Also, a heap allocation is probably safer. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Jason Ekstrand	145444d265	anv: Move multialloc to common code Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Jason Ekstrand	2523c47720	turnip: Move the CreateRenderPass wrapper to common code Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8857>	2021-03-10 18:17:31 +00:00
Marek Olšák	3b7b2df509	ac: remove switch cases for pc_lines for compute-only chips Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>	2021-03-10 18:02:28 +00:00
Marek Olšák	975e5e262b	ac,radeonsi: use correct VGPR granularity on Aldebaran Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>	2021-03-10 18:02:28 +00:00
Marek Olšák	a9da3fc0d1	ac: handle bigger instruction prefetch for Aldebaran Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>	2021-03-10 18:02:27 +00:00
Marek Olšák	9fdf69e611	ac/llvm: unpack thread IDs on Aldebaran Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>	2021-03-10 18:02:27 +00:00
Marek Olšák	6edf1978d3	ac: set the TCC line size for Aldebaran Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>	2021-03-10 18:02:27 +00:00
Marek Olšák	230a6dc55d	ac,radeonsi: add sampler changes for Aldebaran - no 3D and cube textures - no mipmapping - no border color - image_sample is the only supported opcode with a sampler (behaves like _lz) Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>	2021-03-10 18:02:27 +00:00
James Zhu	381d3a5a38	amd: add Aldebaran chip enum Signed-off-by: James Zhu <James.Zhu@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>	2021-03-10 18:02:27 +00:00
Danylo Piliaiev	2764cf8d32	ir3: use OPC_GETBUF to get size of sampler buffers The maximum value which OPC_GETSIZE could return for one dimension is 0x007ff0, however sampler buffer could be much bigger. Blob uses OPC_GETBUF for them. Fixes tests: dEQP-VK.memory.pipeline_barrier.transfer_dst_uniform_texel_buffer.1048576 Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9391>	2021-03-10 17:10:45 +00:00
Danylo Piliaiev	8e6ed9948e	freedreno/a5xx: port handling of PIPE_BUFFER textures from a6xx Otherwise, we won't be able to use OPC_GETBUF to get their size. After this change we also could get rid of the hack for OPC_GETSIZE which scaled the size for texture buffers. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9391>	2021-03-10 17:10:44 +00:00
Danylo Piliaiev	d968995c67	turnip: fix SP_HS_WAVE_INPUT_SIZE value It appears that storage for varyings in a wave has an upper limit of wavesize * max_a831 where max_a831 is 64. Exceeding the limit seems to force the GPU to reduce primitives processed per wave, at least calculations make sense with such interpretation. With blob SP_HS_WAVE_INPUT_SIZE never exceeds 64 and setting it to 65 in freedreno leads to a hang. Copied from the commit to freedreno `e5499ca2` Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8187>	2021-03-10 16:50:11 +00:00
Connor Abbott	7b7532b806	freedreno/computerator: Add branching example Mainly to be able to test label resolution without having to replace a shader. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>	2021-03-10 16:23:04 +00:00
Connor Abbott	19c7b6f9d6	ir3/parser: Add ability to specify branchstack This lets you test branching with computerator. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>	2021-03-10 16:23:04 +00:00
Connor Abbott	a820eb537c	ir3/parser: Support labels This fixes the assembly for many scenarios where you want to use shader replacement. Note: unfortunately this leaks the identifier string created while lexing, but I couldn't find a way to avoid leaking it except for bringing in ralloc or something (which would be way more complicated). The only other place doing something similar in mesa is the glsl parser, which is using ralloc (actually a linear context). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>	2021-03-10 16:23:04 +00:00
Connor Abbott	534658f79b	freedreno/computerator: Fix example assembly Use the new bindless cat6 syntax for a6xx. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>	2021-03-10 16:23:04 +00:00
Connor Abbott	cd772d5687	ir3/parser: Fix parsing of "0.0" in @const line Trying to specify a floating-point value in a @const line would result in it getting interpreted as a FLUT value and failing parsing. Fix this by making the various FLUT tokens include the surrounding parentheses. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9463>	2021-03-10 16:23:04 +00:00
Marek Vasut	f7dc0520d9	etnaviv: Fix point sprite Z,W coordinate replacement Mesa fixed pipeline texture loading on programmable pipeline hardware emits a generic fragment shader program which contains gl_TexCoord.xyzw as a vec4 and then expects to configure the varying assignments to the shader in the pipeline command stream, to select what is wired to the XYZW fragment shader inputs. This gl_TexCoord.xyzw is turned into texture load with projection (TGSI TXP opcode, similar for NIR). Texture load with projection does not exist in the Vivante GPU as a dedicated opcode and is emulated. The shader program first divides texture coordinates XYZ by projector W and then applies regular TEX opcode to load the texture (i.e. TEX(gl_TexCoord.xyzw/gl_TexCoord.wwww)). For point sprites, XY are the point coordinates from VS, Z=0 and W=1, always. The Vivante GPU can only configure varying to be either of -- point coord X, point coord Y, used, unused -- which covers XYZ, but not W. Z is fine because unused means 0. W used to be 0 too before this patch and that led to division by 0 in shader. The only known way to solve this is to set Z=0, W=1 in the shader program itself if the point sprites are enabled. This means we have to generate a special shader variant which does extra SET to set the W=1 in case the point sprites are enabled. In case of TGSI, emitting the SET.TRUE opcode permits setting W=1 without allocating additional constants. With NIR, use nir_lower_texcoord_replace() to lower TEXn to PNTC, which sets Z=0, W=1, and let NIR optimize the shader. Note that nir_lower_texcoord_replace() must be called before input linking is set up, as it might add new FS input. Also note that it should be possible to simply drop PIPE_CAP_POINT_SPRITE in the long run, ST would then apply the same optimization pass, but that option is so far misbehaving. And for etnaviv TGSI this is not applicable yet. This fixes neverball point sprites (exit cylinder stars) and eglretrace of gl4es pointsprite test: https://github.com/ptitSeb/gl4es/blob/master/traces/pointsprite.tgz Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8618>	2021-03-10 11:48:21 +00:00
Iago Toral Quiroga	8525cb1c53	v3dv: call util_cpu_detect() when initializing the instance Fixes this assert in debug builds: in __GI___assert_fail (assertion=0x7ffff731f66b "util_cpu_caps.nr_cpus >= 1", file=0x7ffff731f650 "../src/util/u_cpu_detect.h", line=116, function=0x7ffff7323280 <__PRETTY_FUNCTION__.11654> "util_get_cpu_caps") at assert.c:101 in util_get_cpu_caps () at ../src/util/u_cpu_detect.h:116 in _mesa_float_to_float16_rtz (val=0) at ../src/util/half_float.h:93 in util_format_r16g16b16a16_float_pack_rgba_float (dst_row=0x7fffffffbdc0 "", dst_stride=0, src_row=0x7fffffffbf90, src_stride=0, width=1, height=1) at src/util/format/u_format_table.c:13459 in util_format_pack_rgba (format=PIPE_FORMAT_R16G16B16A16_FLOAT, dst=0x7fffffffbdc0, src=0x7fffffffbf90, w=1) at ../src/util/format/u_format.h:1525 in util_pack_color (rgba=0x7fffffffbf90, format=PIPE_FORMAT_R16G16B16A16_FLOAT, uc=0x7fffffffbdc0) at ../src/gallium/auxiliary/util/u_pack_color.h:432 in v3dv_get_hw_clear_color (color=0x7fffffffbf90, internal_type=6, internal_size=8, hw_color=0x7fffffffbf10) at ../src/broadcom/vulkan/v3dv_cmd_buffer.c:1241 v2: move call from physical device to instance init. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9408>	2021-03-10 11:44:01 +01:00
Iago Toral Quiroga	c057a1211b	broadcom/compiler: disallow ldunif during ldvary sequences if possible This restores many of the hurt shaders from the previous patch at the expense of re-adding ldvary tracking in the scheduler. total instructions in shared programs: 13760415 -> 13755738 (-0.03%) instructions in affected programs: 1207560 -> 1202883 (-0.39%) helped: 5080 HURT: 1731 Instructions are helped. total max-temps in shared programs: 2322991 -> 2322828 (<.01%) max-temps in affected programs: 5063 -> 4900 (-3.22%) helped: 229 HURT: 108 Max-temps are helped. total sfu-stalls in shared programs: 31827 -> 31545 (-0.89%) sfu-stalls in affected programs: 478 -> 196 (-59.00%) helped: 304 HURT: 21 Sfu-stalls are helped. total inst-and-stalls in shared programs: 13792242 -> 13787283 (-0.04%) inst-and-stalls in affected programs: 1220856 -> 1215897 (-0.41%) helped: 5162 HURT: 1697 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>	2021-03-10 07:52:22 +00:00
Iago Toral Quiroga	947e9e42cc	broadcom/compiler: simplify ldvary pipelining We get optimal ldvary pipelining by doing the following: 1) Carefully merge a paired ldvary into the previous instruction when possible. 2) When the above succeeds, flag the ldvary as scheduled immediately so we can merge one of its children into the current instruction. 3) When scheduling ldvary sequences, only pick up instructions that are part of the sequence to avoid picking up something that prevents successful pipelining. This patch skips 3) assuming some hurt shaders in exchange for better scheduling flexibility during ldvary sequences. Besides eliminating most of the code dedicated to special handling ldvary sequences, this also usually allows us to produce better code by merging instructions that are unrelated to ldvary sequences into the ldvary sequences, which is particularly effective to fill up the gaps produced when scheduling the first and last ldvary sequences as well as the gaps produced by flat and noperspective varyings sequences that don't have both mul and add instructions. Notice that there are some hurt shaders, because sometimes the extra scheduler flexibility can lead to picking up instructions that will break a sequence without compensating for that, typically an ldunif that prevents us from doing the fixup for a follow-up ldvary. We will try to correct some of these cases with the next patch. total instructions in shared programs: 13786037 -> 13760415 (-0.19%) instructions in affected programs: 3201387 -> 3175765 (-0.80%) helped: 16155 HURT: 4146 Instructions are helped. total max-temps in shared programs: 2324834 -> 2322991 (-0.08%) max-temps in affected programs: 22160 -> 20317 (-8.32%) helped: 1340 HURT: 103 Max-temps are helped. total sfu-stalls in shared programs: 30685 -> 31827 (3.72%) sfu-stalls in affected programs: 782 -> 1924 (146.04%) helped: 253 HURT: 1416 Inconclusive result. total inst-and-stalls in shared programs: 13816722 -> 13792242 (-0.18%) inst-and-stalls in affected programs: 3171642 -> 3147162 (-0.77%) helped: 15331 HURT: 4179 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>	2021-03-10 07:52:22 +00:00
Iago Toral Quiroga	d37241bdc4	broadcom/compiler: move code block around These checks depend on prev_inst being set, so move them down below with all the other checks with the same requirement. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>	2021-03-10 07:52:22 +00:00
Iago Toral Quiroga	8bcda472a0	broadcom/compiler: add an additional sanity check assert to the ldvary fixup Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9471>	2021-03-10 07:52:22 +00:00
Samuel Pitoiset	077775f3ce	radv: check if dynamic line stipple state changed Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9458>	2021-03-10 07:21:46 +00:00
Samuel Pitoiset	892987e3a0	radv: check if dynamic VRS state changed Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9458>	2021-03-10 07:21:46 +00:00
Samuel Pitoiset	ed391a62f6	radv: do not declare push constants for DCC decompress on compute We don't use push constants at all. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9475>	2021-03-10 07:50:31 +01:00
Sagar Ghuge	0314c7503f	intel/blorp: Fix condition to figure out aux_address Fixes: `4dfabac4` ("blorp/gen12: Don't use aux address if implicit CCS") Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Mark Janes <markjanes@swizzler.org> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9491>	2021-03-09 22:39:43 -08:00
Sagar Ghuge	e3d221838a	Revert "Revert "blorp/gen12: Don't use aux address if implicit CCS"" This reverts commit `cbd5d82bae`. Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9491>	2021-03-09 22:39:20 -08:00
Mark Janes	cbd5d82bae	Revert "blorp/gen12: Don't use aux address if implicit CCS" This reverts commit `4dfabac493`. The offending commit broke tens of thousands of tests in Intel's Mesa CI. Iris asserted in iris_use_pinned_bo at: assert(bo->kflags & EXEC_OBJECT_PINNED); Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9489>	2021-03-09 18:06:50 -08:00
Eric Anholt	dfb0e0d246	freedreno/a5xx: Flush depth at the end of sysmem, like a6xx does. On a6xx, this flush fixed some force-bypass tests. Doesn't affect anything in our current a5xx test set. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9481>	2021-03-09 21:46:58 +00:00
Eric Anholt	3c96880e13	freedreno/a5xx: Introduce an event write helper like a6xx has. This should help the next person trying to diff a5xx to a6xx behavior. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9481>	2021-03-09 21:46:57 +00:00
Marek Vasut	b19f1dc7d6	compiler/nir: Increment shader input count and mark as used when adding new gl_PointCoord In case a new gl_PointCoord shader input is created, increment shader input count and set valid driver_location to the new input variable, otherwise the input gets aliased to input 0 and shows up in NIR_PRINT output as whatever shader input 0 is instead of gl_PointCoord. Also set the input as used, otherwise it might get removed. Signed-off-by: Marek Vasut <marex@denx.de> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9214>	2021-03-09 21:24:35 +00:00
Dave Airlie	8027a7ba8a	shader_info: convert textures_used to a bitset. For now keep it a bitset of 1 32-bit dword. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9456>	2021-03-10 06:16:09 +10:00
Dave Airlie	c55bd4b68d	util/bitset: add a new last bit api This is to be used where the bitset is a predefined array size. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9456>	2021-03-10 06:16:05 +10:00
Dave Airlie	0e1afe7c70	util/panfrost/glsl: rename BITSET_LAST_BIT to BITSET_LAST_BIT_SIZED The current users all pass in the number of dwords, but I'd like to provide an interface that does ARRAY_SIZE implicitly. Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9456>	2021-03-10 06:15:50 +10:00
Chad Versace	d978383966	anv/image: Make memory layout more explicit Future patches for VK_EXT_image_drm_format_modifier will, in some cases, place the aux surface and fast clear state into a driver-private bo. This increases the complexity of image memory layout to such a degree that, to maintain sanity, we must improve how we track the layout. Define new types: - anv_image_memory_range - anv_image_memory_binding - anv_image_binding Delete many fields in anv_image (and its children), and replace them with the new types. This patch does not change how anv_image tracks (or, rather, does not track) the memory of gen12 implicit ccs. We should probably do that, but that's left as a future exercise. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	4dfabac493	blorp/gen12: Don't use aux address if implicit CCS Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	bb7d627865	anv/image: Add anv_image_address() It calculates the address to a surface or to metadata in the image. Refactor only. No intended change in behavior. This patch prepares for, and reduces much noise in, the upcoming patch that rewrites image memory tracking. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	1ef0fd3b70	anv: Refactor anv_image_get_compression_state_addr Reduces noise in the path that introduces anv_image_mem_range. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	22ac3d74e0	anv/image: Clean up anv_GetImageMemoryRequirements2 If the image is disjoint, there is no reason to calculate image-global memory requirements. Instead, only per-plane memory requirements are needed. Also, delete a large duplicate comment. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	ffc08351e1	anv: Add anv_surface_is_valid() Current code checks for surface validity with `surface.isl.size_B > 0`. Replace the checks with anv_surface_is_valid(). This prepares for adding new members to anv_surface that may be accidentally used as a validity-indicator. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	3e6d3bca1d	anv/android: Fix size check for imported gralloc bo 1. Don't compare bo->size to image->size. An upcoming patch replaces anv_image::size with complicated stuff. Instead, properly query the required size with anv_GetImageMemoryRequirements. 2. Require the bo to fit the aligned image size. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	449df3808f	anv/image: Fix interpretation of 'disjoint' The calculation of the subsurfaces' memory requirements assumed that the image was disjoint if the image was created with VK_IMAGE_CREATE_DISJOINT_BIT. But the Vulkan spec also requires that the VkFormat be multi-planar. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	6fa56273be	anv/image: Drop duplicate 'format' in anv_image_create() Reduces the chance of misusing uninitialized 'n_planes' and 'format' during image creation. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	2328edbb62	anv/image: Move vkGetImageMemoryRequirements Move from anv_device.c to anv_image.c, to live alongside vkBindImageMemory* and related code. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	5065faca00	anv/image: Rename anv_image_plane::surface -> primary_surface This disambiguates code that accesses `image->planes[*].surface`. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Chad Versace	e7844c552c	anv/image: Replace bo_is_owned with from_gralloc (v2) The name anv_image_plane::bo_is_owned will be made ambiguous by the implementation of VK_EXT_image_drm_format_modifier, which may bind the plane to multiple bo's. Also, bo_is_owned was set if and only if the image was imported from gralloc, and it was set only on the first plane. Therefore, let's rename the field to from_gralloc, and move it to the toplevel of anv_image. v2: Fix build in anv_android.c. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8097>	2021-03-09 18:42:20 +00:00
Mike Blumenkrantz	5945d7d2e9	zink: fix instance/device versioning (for real this time) the maximum allowable runtime version of vk can be computed by MIN(instance_version, device_version) despite this, instances and devices can be created using the maximum version available for each respective type. the restriction is applied only at the point of enabling/applying features and extensions, meaning that to correctly handle this, zink must: 1. create an instance using the maximum allowable version 2. select a physical device using the instance 3. compute MIN(instance_version, device_version) 4. only now begin to enable/use features requiring vk 1.1+ ref #4392 Reviewed-by: Adam Jackson <ajax@redhat.com> Acked-by: Hoe Hao Cheng <haochengho12907@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9479>	2021-03-09 18:33:15 +00:00
Danylo Piliaiev	1d70863c12	freedreno/hw: fix populating branch targets in isa_decode pre-pass pre-pass ran with branch_labels being false which made it no-op. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9476>	2021-03-09 18:17:48 +00:00
Simon Ser	71e8141503	egl: use render node for wl_drm if available This causes clients to use the render node and skip DRM authentication if a DRM render node is available. Signed-off-by: Simon Ser <contact@emersion.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9334>	2021-03-09 15:43:51 +00:00
Georg Lehmann	fb1100d718	vulkan/device_select: Only call vkGetPhysicalDeviceProperties2 if the device supports it. vkGetPhysicalDeviceProperties2 is not allowed to be used with a 1.0 device because it's a vulkan 1.1 function. Closes: #4396 Fixes: `38ce8d4d` ("vulkan/device_select: Stop using device properties 2.") Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9462>	2021-03-09 15:00:57 +00:00
Gert Wollny	8bc9ae1bc6	virgl: implement support for PIPE_CAP_STRING_MARKER With this command implemented messages emitted by applications via glDebugMessageInsert will be forwarded to the host. v2: - remove check for feature in encode function, this is covered in the state tracker (Rohan) - reorder parameters in the encode function to the order of the emit callback Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Reviewed-by: Rohan Garg <rohan.garg@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9433>	2021-03-09 13:57:05 +00:00
Jason Ekstrand	1399ee5cf9	anv: Drop anv_extensions.py This should have been dropped in `27d496706e`. Acked-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9469>	2021-03-09 10:36:19 +00:00
Fan Yugang	6905122999	intel/tools: Show unknown instructions in decoded state. Signed-off-by: Fan Yugang <yugang.fan@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9455>	2021-03-09 09:36:08 +00:00
Christian Gmeiner	f532202f2d	etnaviv: use nir_lower_idiv(..) before opt loop nir_lower_idiv(..) creates isub instructions during its lowering. Move nir_lower_idiv(..) before the opt loop to have a chance to optimize/lower isub away. Also drop the halti dependency to make it easier to follow. This fixes the following assert on GC3000: Unhandled ALU op: isub Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9447>	2021-03-09 06:45:31 +00:00
Mike Blumenkrantz	279ef45db5	zink: unref ctx->framebuffer on context destroy we aren't guaranteed to get a final set_framebuffer_state(NULL) to do this for us Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9434>	2021-03-09 03:11:40 +00:00
Mike Blumenkrantz	8937b5f268	zink: don't pass so_info to ntv at all unless it's necessary this is only needed for explicit xfb outputs Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	7ed57e60fc	zink: only export necessary xfb outputs to ntv the full-variable outputs can be skipped, leaving only the varyings which actually need explicit emission due to packed layouts or whatever Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	1f42ff77df	zink: use slightly stricter check for update_so_info() callsite Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	0fb7680b26	zink: pass so_info directly to update_so_info() Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	0d741b8dfe	zink: use info.has_transform_feedback_varyings to determine xfb enablement Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	eebd00329f	zink: rename variable in update_so_info() be more consistent Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	5c5e1abea2	zink: evaluate existing slot map during program init and force new map as needed if the number of explicit xfb outputs or new varyings added to the existing size of the slot map would cause an overflow, we have to force a new slot map to ensure that everything fits this means iterating all the stages which can produce new varyings and calculating all the slots required in order to compare against the max size available Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	6d40db84c9	zink: handle direct xfb output from output variables if an entire variable is being dumped into an xfb buffer, there's no need to create an explicit xfb variable to copy the value into, and instead the xfb attributes can just be set normally on the variable this doesn't work for geometry shaders because outputs are per-vertex fixes all KHR-GL46.enhanced_layouts xfb tests Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	7cef91dd43	zink: stop allocating xfb slot map this can just be inlined since it's a small static size Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	086262fc53	zink: run more nir passes for tess shaders running nir_lower_io_arrays_to_elements_no_indirects for only some stages breaks location-setting for the stages which don't run it when e.g., dmat2x3 variables are sometimes split across locations and sometimes jammed into a single location (TCS I'm looking at you) Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	6d8b5e7f09	zink: fix location usage for explicit xfb outputs ensure that this accurately handles multi-slot emission Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	96024a8dc9	zink: fix slot mapping for fat io variables big types like dmat2x3 need multiple slots, and trying to jam them into single slots breaks everything Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	1b25e3a701	zink: fix streamout emission for super-enhanced layouts if we get some crazy matrix types in here then we need to ensure that we accurately unwrap them and copy the components fixes KHR-GL46.enhanced_layouts.xfb_stride Fixes: `1b130c42b8` ("zink: implement streamout and xfb handling in ntv") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Mike Blumenkrantz	9ff01d724a	zink: remove ntv streamout assert this was added during review, but it was never correct and just crashes valid cases like streamout from a mat3x4 type Fixes: `b6f8f3a3ba` ("zink: fix streamout for clipdistance") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>	2021-03-09 02:52:20 +00:00
Jesse Natalie	fe90bcf11a	microsoft/compiler: Don't separate phis while inserting upcasts Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4414 Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9464>	2021-03-09 01:41:32 +00:00
Jesse Natalie	ef0d2a5b4b	nir: Add a nir_after_instr_and_phis helper Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9464>	2021-03-09 01:41:32 +00:00
Jason Ekstrand	25020c125a	intel/mi_builder: Fix a couple of #ifs All this does is remove a field on Gen7 and stop asserting on it. No actual functional change. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9467>	2021-03-08 16:14:13 -06:00
Jason Ekstrand	62c64e7b9d	intel/mi_builder: Fix some indentation This got lost in the rebase on top of the s/gen_mi_/mi_/ change Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9467>	2021-03-08 16:13:37 -06:00
Jordan Justen	45e5c6b641	anv: Add mem heap/type support for local-mem This will take effect in future patches when we are able to query the kernel to set device->vram.size to a non-zero size. Builds on Sagar's ("anv: Query memory region info") patch, and re-organizes things as recommended by Lionel (and Jason). Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9324>	2021-03-08 12:47:06 -08:00
Jordan Justen	7c41ae0a81	anv: Put cache memory type first on non-llc platforms Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9324>	2021-03-08 12:47:06 -08:00
Jordan Justen	fd98721cba	anv: Restructure mem heap/type init code Just treat the llc and non-llc paths as separate cases. This will also help when adding the local memory setup. Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9324>	2021-03-08 12:47:06 -08:00
Sagar Ghuge	835c257f64	anv: Add anv_memregion structure Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9324>	2021-03-08 12:47:06 -08:00
Caio Marcelo de Oliveira Filho	a41c3ed384	spirv: Update a couple of comments in variable handling Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9440>	2021-03-08 20:23:28 +00:00
Caio Marcelo de Oliveira Filho	3a7bb38b70	spirv: Explicitly break when finished handling SpvDecorationBuiltIn When tidying up this section in `1e5b09f42f` ("spirv: Tidy some repeated if checks by using a switch statement.") the break got lost. It is not a real problem because the next case just breaks, but better to have it explicitly here instead of a FALLTHROUGH. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9440>	2021-03-08 20:23:28 +00:00
Caio Marcelo de Oliveira Filho	94d2a51453	spirv: Reuse nir_is_per_vertex_io() Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9440>	2021-03-08 20:23:28 +00:00
Eric Anholt	f301eec9a3	nir-to-tgsi: Fix handling of partial writemasks on SSA/REG decls. In nouveau's PBO path with GS support and no VS layer export, we got: intrinsic store_output (ssa_1, ssa_0) (0, 15, 0, 160, 128) /* base=0 */ /* wrmask=xyzw */ /* component=0 */ /* src_type=float32 */ /* location=0 slots=1 */ /* out_pos */ [...] vec3 32 ssa_4 = mov ssa_3.xxx intrinsic store_output (ssa_4, ssa_0) (0, 4, 0, 160, 128) /* base=0 */ /* wrmask=z */ /* component=0 */ /* src_type=float32 */ /* location=0 slots=1 */ /* out_pos */ The mov's SSA value we would decide we could store directly to the output, since nothing else used it. However, the store has a writemask, and the ALU op was stomping over it instead of ANDing with the output decl's existing writemask. Fixes: `f79f382c81` ("nir_to_tgsi: Store directly to TGSI outputs when possible.") Closes: #4380 Tested-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9376>	2021-03-08 19:01:40 +00:00
Jason Ekstrand	e20e85f01e	nir: Make nir_ssa_def_rewrite_uses_after take an SSA value This replaces the new_src parameter of nir_ssa_def_rewrite_uses_after() with an SSA def, and rewrites all the users as needed. Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>	2021-03-08 16:59:55 +00:00
Jason Ekstrand	117668b811	nir: Make nir_ssa_def_rewrite_uses take an SSA value This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses() with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites all the users as needed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>	2021-03-08 16:59:55 +00:00
Jason Ekstrand	13a0ee8a51	nir: Add and use a new nir_ssa_def_rewrite_uses_src helper This is currently an alias for nir_ssa_def_rewrite_uses but we move all the instances which used it to write a non-SSA source to the newly named helper. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Acked-by: Alyssa Rosenzweig <alyssa@collabora.com> Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>	2021-03-08 16:59:55 +00:00
Jason Ekstrand	98a5b9b454	intel/mi_builder: Add control-flow support Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>	2021-03-08 10:47:19 -06:00
Jason Ekstrand	8525ebe6e3	intel/mi_builder: Return an address from __gen_get_batch_address While we're here, add __gen_get_batch_address declarations to more files because we're about to start requiring it on all GFX 12.5+. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>	2021-03-08 10:47:19 -06:00
Jason Ekstrand	322fba216b	intel/mi_builder: Use softpin for tests on gen8+ Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>	2021-03-08 10:47:19 -06:00
Jason Ekstrand	c23f7f1154	intel/batch_decoder: Don't follow predicated MI_BATCH_BUFFER_START The stuff after these may be executed so we want to decode it too. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>	2021-03-08 10:47:19 -06:00
Jason Ekstrand	6721925220	genxml: Clean up MI_SET_PREDICATE Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>	2021-03-08 10:47:19 -06:00
Jason Ekstrand	c7c524337a	intel/mi_builder: Add load/store_offset on GFX 12.5+ Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>	2021-03-08 10:47:18 -06:00
Jason Ekstrand	6323a8522b	intel/mi_builder: Support inverted values in mi_store Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>	2021-03-08 10:45:45 -06:00
Sagar Ghuge	04d0d4e849	intel/mi_builder: Added support for command streamer shift operations Add logical shift left and right operations support to mi_builder. v1: - Add GEN_GEN > 12 check (Jordan Justen) - Add gen_mi_has_shift function (Jordan Justen) - Fix commit title (Jordan Justen) v2 (Jason Ekstrand): - Add _imm versions of all of them - Better handle corner-cases in _imm helpers - Handle the power-of-two limitation for _imm versions - Add tests Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>	2021-03-08 10:45:42 -06:00
Jason Ekstrand	62b9e30cc7	intel/mi_builder: Add ieq/ine helpers Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>	2021-03-08 10:45:24 -06:00
Jason Ekstrand	2c02740a8c	intel/mi_builder: Use AddCSMMIOStartOffset for LRI In `06cf838cbd` we started using the AddCSMMIOStartOffset feature on Gen11+ but we missed one place. Fixes: `06cf838cbd` "intel/mi_builder: Support gen11 command-streamer..." Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>	2021-03-08 09:54:45 -06:00
Connor Abbott	ccd7986f59	freedreno/cffdec: Use rb trees for tracking buffers Gets rid of the arbitrary size limitation, and should make decoding faster with many buffers. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8838>	2021-03-08 15:18:47 +00:00
Marek Olšák	b43f40166c	ac/surface: select best swizzle mode for 3D sampler performance Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9448>	2021-03-08 11:41:23 +00:00
Marek Olšák	08ece5d6b3	driconf: add performance tweaks for viewperf Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9449>	2021-03-08 10:33:33 +00:00
Tony Wasserka	97c97781f6	aco: Fix vector::reserve() being called with the wrong size The container is moved from before and hence returns size 0. To get the correct value, the new instruction container must be used instead. This was flagged by clang-tidy. The fixed call still triggers the corresponding diagnostic, hence this change silences it by adding a redundant clear() after move. Fixes: `7f1b537304` ("aco: add new NOP insertion pass for GFX6-9") Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9432>	2021-03-08 10:44:20 +01:00
Alyssa Rosenzweig	e30994a471	nir/lower_viewport_transform: Allow geom/tess This pass needs to run on the last shader in a pipeline writing gl_Position. In GLES2, that's always the vertex shader, but in ES3.2, it can be a geometry or tessellation shader. The shared code works the same in this case, just make the assert more generous. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Erico Nunes <nunes.erico@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9444>	2021-03-07 17:57:04 +00:00
Alyssa Rosenzweig	3436e5295b	pan/bi: Treat +DISCARD.f32 as message-passing Likely errata, matches blob's handling. Closes #4387 total nops in shared programs: 86266 -> 86272 (<.01%) nops in affected programs: 347 -> 353 (1.73%) helped: 1 HURT: 2 total clauses in shared programs: 20813 -> 20833 (0.10%) clauses in affected programs: 343 -> 363 (5.83%) helped: 0 HURT: 20 Clauses are HURT. total quadwords in shared programs: 91572 -> 91588 (0.02%) quadwords in affected programs: 1322 -> 1338 (1.21%) helped: 1 HURT: 14 Quadwords are HURT. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Tested-by: Icecream95 <ixn@disroot.org> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9446>	2021-03-07 15:10:28 +00:00
Alyssa Rosenzweig	6cb1a9b754	pan/bi: Set clause_state.message conservatively Accidentally prevented scheduling message-passing instructions to anywhere but the last ADD of a clause. total nops in shared programs: 86280 -> 86266 (-0.02%) nops in affected programs: 1609 -> 1595 (-0.87%) helped: 9 HURT: 4 Inconclusive result (value mean confidence interval includes 0). total clauses in shared programs: 20993 -> 20813 (-0.86%) clauses in affected programs: 3488 -> 3308 (-5.16%) helped: 116 HURT: 0 Clauses are helped. total quadwords in shared programs: 91697 -> 91572 (-0.14%) quadwords in affected programs: 12257 -> 12132 (-1.02%) helped: 53 HURT: 2 Quadwords are helped. Fixes: `f0c0082ab0` ("pan/bi: Schedule blocks") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Tested-by: Icecream95 <ixn@disroot.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9446>	2021-03-07 15:10:21 +00:00
Alyssa Rosenzweig	6322bc544e	pan/bi: Mark message-passing sources/dests live More general, same data race. Fixes: `44726101d1` ("pan/bi: Don't fill garbage") Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Tested-by: Icecream95 <ixn@disroot.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9446>	2021-03-07 15:10:12 +00:00
Axel Davy	91755300ec	st/nine: Set default dynamic_texture_workaround to true Now that texture virtual memory usage is less of a problem, we can use this workaround permanently. In the spirit of the API it's certainly not the proper way of implementing DYNAMIC textures (it seems they are ok to have hidden copies in driver managed memory, but not have virtual addressing space reduced), but it makes sense for us, both performance wise, and to avoid bugs. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377>	2021-03-07 13:13:53 +00:00
Axel Davy	0beb77751e	st/nine: Add driconf option to limit texture memory Signed-off-by: Axel Davy <davyaxel0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377>	2021-03-07 13:13:53 +00:00
Axel Davy	24eb1f21d0	st/nine: Control the memfd virtual limit Signed-off-by: Axel Davy <davyaxel0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377>	2021-03-07 13:13:53 +00:00
Axel Davy	a179ea2e6d	st/nine: Use the texture memory helper Switch to the new texture RAM memory API. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377>	2021-03-07 13:13:53 +00:00
Axel Davy	90a7573a65	st/nine: Add RAM memory manager for textures On 32 bits, virtual memory is sometimes too short for apps. Textures can hold virtual memory 3 ways: 1) MANAGED textures have a RAM copy of any texture 2) SYSTEMMEM is used to have RAM copy of DEFAULT textures (to upload them for example) 3) Textures being mapped. Nine cannot do much for 3). It's up to the driver to really unmap textures when possible on 32 bits to reduce virtual memory usage. It's not clear whether on Windows anything special is done for 1) and 2). However there is clear indication some efforts have been made on 3) to really unmap when it makes sense. My understanding is that other implementations reduce the usage of 1) by deleting the RAM copy once the texture is uploaded (Dxvk's behaviour is controlled by evictManagedOnUnlock). The obvious issue with that approach is whether the texture is read by the application after some time. In that case, we have to recreate the RAM backing from the GPU buffer. And apps DO that. Indeed I found that for example in Mass Effect 2 with High Texture mods (one of the crash cases fixed by this patch series), when the character gets close to an object, a high res texture replaces the low res one. The high res one simply has more levels, and the game seems to optimize reading the high res texture by retrieving the small-resolution levels from the original low res texture. In other words during gameplay, the game will randomly read MANAGED textures. This is expected to be fast as the data is supposed to be in RAM... Instead of taking that RAM copy eviction approach, this patchset proposes a different approach: storing the data in memfd and releasing the virtual memory until needed. Basically instead of using malloc(), we create a memfd file and map it. When the data doesn't seem to be accessed anymore, we can unmap the memfd file. If the data is needed, the memfd file is mapped again. This trick makes it possible to allocate more than 4GB in 32 bits apps.
The advantage of this approach over the RAM eviction one is that the load is much faster and doesn't block the GPU. Of course we have problems if there's not enough memory to map the memfd file. But the problem is the same for the RAM eviction approach. Naturally on 64 bits, we do not use memfd. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377>	2021-03-07 13:13:53 +00:00
Axel Davy	6087ff44ae	st/nine: Add new function to know if we are the worker This will be useful in a later patch Signed-off-by: Axel Davy <davyaxel0@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9377>	2021-03-07 13:13:53 +00:00
Ilia Mirkin	fd017458bc	mesa: fix fbo attachment size check for RBs, make it trigger in ES2 Makes dEQP-GLES2.functional.fbo.completeness.size.distinct pass. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9441>	2021-03-06 20:29:41 +00:00
Ilia Mirkin	a8044e87e7	mesa: fix conditions for fp16 render format eligibility GLES3 adds all of these, but they're also available in GLES2 with an ext. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4400 Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9441>	2021-03-06 20:29:41 +00:00
Karol Herbst	12f1e42ed3	tegra/context: unwrap indirect_draw_count as well Fixes: `22f6624ed3` "gallium: separate indirect stuff from pipe_draw_info - 80 -> 56 bytes" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9425>	2021-03-06 11:48:57 +00:00
Karol Herbst	a84c8ddb19	tegra/context: fix regression in tegra_draw_vbo We should only pass in a new indirect_info object if we actually set valid values in it. Fixes: `abe8ef862f` "gallium: make pipe_draw_indirect_info * a draw_vbo parameter" Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9425>	2021-03-06 11:48:57 +00:00
Icecream95	efd7711e0e	st/mesa: Update constants on alpha test change if it's lowered nir_lower_alpha_test creates a uniform for the alpha reference value; this needs to be updated when changing alpha test state. Fixes: `b1c4c4c7f5` ("mesa/gallium: automatically lower alpha-testing") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4390 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9439>	2021-03-06 00:32:51 +00:00
Dave Airlie	24ce0862fe	zink/ci: update results after layer extensions enabled in lavapipe Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9401>	2021-03-05 21:43:59 +00:00
Dave Airlie	d061e21b7e	lavapipe: enable EXT_shader_viewport_index_layer This is already implemented afaik Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9401>	2021-03-05 21:43:59 +00:00
Dave Airlie	dad5d5099a	llvmpipe: add support for shader viewport layer This should already be implemented; the CAP was just never enabled Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9401>	2021-03-05 21:43:59 +00:00
Dave Airlie	4cf898b988	draw/prim_assembler: write correct decomposed primitive lengths In order for shader viewport index to be calculated correctly, the cliptest code needs proper primitive lengths to work out the provoking vertex. I half fixed this before for GL4 but looks like I didn't make it all the way. This fixes: dEQP-VK.draw.shader_viewport* Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9401>	2021-03-05 21:43:59 +00:00
Dave Airlie	52dc22055f	draw: fix uses viewport index for tess eval shader Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9401>	2021-03-05 21:43:59 +00:00
Kenneth Graunke	cdffa3e114	vbo: Fix vbo_sw_primitive_restart for start > 0 Commit `e99e7aa4` began passing start > 0 to indexed draw calls rather than keeping start at 0 and manually advancing ib->ptr. This should work fine, however, there have been instances of software fallbacks not handling things right. vbo_sw_primitive_restart had a bug where it was ignoring "start" and always calling find_sub_primitives with start = 0 and end = ib->count. This meant that when start > 0, it was analyzing the wrong part of the index buffer when finding subprimitives. In theory, each _mesa_prim can have a different "start" value. But the code only calls find_sub_primitives once, because it wants to map, analyze, and unmap the index buffer before calling ctx->Draw, as some drivers don't support drawing with the index buffer mapped. To handle this, we break vbo_sw_primitive_restart calls into sections where "start" matches across all the primitives, similar to how I handled the issue in tnl in commit `bd6120f562`. In the common case, start matches and we handle it in one pass anyway. Fixes Piglit's primitive-restart VBO_COMBINED_VERTEX_AND_INDEX test and KHR-GL33.pipeline_statistics_query_tests_ARB.functional_primitives_vertices_submitted_and_clipping_input_output_primitives on Intel Ivybridge and older (which don't do arbitrary cut indices). Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4052 Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9417>	2021-03-05 21:16:32 +00:00
Adam Jackson	cf468b7ad8	zink: more and better debug printfs Use debug_printf more consistently, normalize formatting a bit, and trace a few more places you're likely to care about. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9436>	2021-03-05 15:03:09 -05:00
Gert Wollny	f3aa2f15c2	r600/sfn: eliminate loading unused component loads from shared memory LDS loads are quite expensive, so try to eliminate as many as possible Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9416>	2021-03-05 18:25:25 +00:00
Rhys Perry	9f8a0b797e	radv: cache pipeline statistics Applications rarely require them, but this improves fossil-db replay time. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9411>	2021-03-05 17:01:16 +00:00
Rhys Perry	7c7e8942f8	radv,aco: remove aco_compiler_statistics This removes a pointer from radv_shader_binary_legacy::data. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9411>	2021-03-05 17:01:16 +00:00
Lionel Landwerlin	8955d179d3	anv: fix MI_PREDICATE_RESULT write This register is only 32bits. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `1952fd8d2c` ("anv: Implement VK_EXT_conditional_rendering for gen 7.5+") Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9428>	2021-03-05 16:19:20 +00:00
Alyssa Rosenzweig	718bfdb3da	pan/bi: Implement fsin/fcos Instead of lowering it in NIR, use the lookup tables as inputs to a second-order Taylor expansion. shader-db results aren't amazing but keep in mind this is without backend CSE yet. total instructions in shared programs: 115913 -> 115707 (-0.18%) instructions in affected programs: 3151 -> 2945 (-6.54%) helped: 12 HURT: 0 Instructions are helped. total nops in shared programs: 84045 -> 84041 (<.01%) nops in affected programs: 1571 -> 1567 (-0.25%) helped: 1 HURT: 7 Inconclusive result (value mean confidence interval includes 0). total clauses in shared programs: 20498 -> 20489 (-0.04%) clauses in affected programs: 188 -> 179 (-4.79%) helped: 6 HURT: 0 Clauses are helped. total quadwords in shared programs: 90395 -> 90291 (-0.12%) quadwords in affected programs: 2287 -> 2183 (-4.55%) helped: 12 HURT: 0 Quadwords are helped. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9420>	2021-03-05 15:15:10 +00:00
Alyssa Rosenzweig	253b795451	pan/bi: Allow negating constants Useful for representing -0 in transcendental sequences matching the blob. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9420>	2021-03-05 15:15:10 +00:00
Alyssa Rosenzweig	362756ad09	pan/bi: Use replace_index in more places Needed to respect abs/neg. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9420>	2021-03-05 15:15:10 +00:00
Pierre-Eric Pelloux-Prayer	c276bde34a	radeonsi/sqtt: export shader code to RGP With these changes the shader code is visible in RGP. Vk pipeline feature is emulated using si_update_shaders: when shaders are updated we compute a sha1 of their code and use it as a pipeline hash. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	729d3eb0e0	radeonsi/sqtt: don't always use WGP 0 Because it may be disabled. Instead use the cu mask to pick the first active WGP. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	47eafb3f51	radeonsi/sqtt: remove duplicate token V_008D18_REG_INCLUDE_CONTEXT was set twice. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	a27ea38d2a	radeonsi/sqtt: keep a copy of the uploaded shader code Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	7f5a8db96d	ac/rgp: move radv/sqtt functions to ac pso_correlation and code_object_loader don't depend on drivers specific logic so move them to the shared code. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	b2ef94943f	ac/rtld: make ac_rtld_upload return the code size This will be useful to keep a copy of the uploaded code. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	e5b1e645e7	ac/rgp: make the max gap between shader code a warning For radeonsi the shaders don't live in the same BOs, so they're unlikely to be less than 0x1000 bytes apart. So this commit bumps the threshold to 0x10000 and warns once when hitting it. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Pierre-Eric Pelloux-Prayer	0e97d817f5	radeonsi: properly set SPI_SHADER_PGM_HI_ES When not using S_00B324_MEM_BASE the value isn't properly truncated. Cc: mesa-stable Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>	2021-03-05 13:10:11 +00:00
Iago Toral Quiroga	6e6e71ddf9	broadcom/compiler: fix flags check for ldvary merge We were checking that the previous instruction doesn't write flags, but we also need to check it doesn't read them. Fixes: `1784dd22a3` ('broadcom/compiler: pipeline smooth ldvary sequences') Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9431>	2021-03-05 12:55:47 +00:00
Iago Toral Quiroga	21c1853c55	broadcom/compiler: ldvary doesn't implicitly write to r3 since V3D 4.1 total instructions in shared programs: 13805979 -> 13786037 (-0.14%) instructions in affected programs: 2263244 -> 2243302 (-0.88%) helped: 10646 HURT: 1508 Instructions are helped. total threads in shared programs: 412220 -> 412242 (<.01%) threads in affected programs: 58 -> 80 (37.93%) helped: 17 HURT: 6 Threads are helped. total uniforms in shared programs: 3793200 -> 3790401 (-0.07%) uniforms in affected programs: 131281 -> 128482 (-2.13%) helped: 1547 HURT: 281 Uniforms are helped. total max-temps in shared programs: 2326309 -> 2324834 (-0.06%) max-temps in affected programs: 31836 -> 30361 (-4.63%) helped: 1139 HURT: 153 Max-temps are helped. total spills in shared programs: 5932 -> 5940 (0.13%) spills in affected programs: 80 -> 88 (10.00%) helped: 2 HURT: 3 total fills in shared programs: 13370 -> 13372 (0.01%) fills in affected programs: 480 -> 482 (0.42%) helped: 2 HURT: 3 total sfu-stalls in shared programs: 30829 -> 30685 (-0.47%) sfu-stalls in affected programs: 2190 -> 2046 (-6.58%) helped: 570 HURT: 533 Sfu-stalls are helped. total inst-and-stalls in shared programs: 13836808 -> 13816722 (-0.15%) inst-and-stalls in affected programs: 2276152 -> 2256066 (-0.88%) helped: 10643 HURT: 1525 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9430>	2021-03-05 13:37:39 +01:00
Rhys Perry	524848707b	radv: don't set sx_blend_opt_epsilon for V_028C70_COLOR_10_11_11 Matches radeonsi and PAL. From PAL: // 1 is recommended, but doesn't provide sufficient precision Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4394 Fixes: `ed94638156` ("radv: Enable RB+ where possible.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9427>	2021-03-05 11:16:40 +00:00
Iago Toral Quiroga	839007e490	broadcom/compiler: always restart ldvary pipelining when scheduling ldvary When we were only able to pipeline smooth varyings, if we had to disable ldvary pipelining in the middle of a sequence it would stay disabled for the rest of the program, to prevent us from prioritizing scheduling of ldvary instructions that we would not be able to pipeline effectively. Now that we can pipeline all ldvary sequences we can change this. This change re-enables ldvary pipelining upon finding the next ldvary in the program in the hopes that we can continue pipelining successfully. To do this, we track the number of ldvary instructions we emitted so far and compare that to the number of inputs in the fragment shader we are scheduling. This also allows us to simplify our ldvary tracking at nir to vir time, since that is all now handled in the QPU scheduler. total instructions in shared programs: 13817048 -> 13810783 (-0.05%) instructions in affected programs: 810114 -> 803849 (-0.77%) helped: 4843 HURT: 591 Instructions are helped. total max-temps in shared programs: 2326612 -> 2326300 (-0.01%) max-temps in affected programs: 4689 -> 4377 (-6.65%) helped: 285 HURT: 7 Max-temps are helped. total sfu-stalls in shared programs: 30942 -> 30865 (-0.25%) sfu-stalls in affected programs: 207 -> 130 (-37.20%) helped: 120 HURT: 42 Sfu-stalls are helped. total inst-and-stalls in shared programs: 13847990 -> 13841648 (-0.05%) inst-and-stalls in affected programs: 825378 -> 819036 (-0.77%) helped: 4899 HURT: 590 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9404>	2021-03-05 10:32:19 +01:00
Samuel Pitoiset	2169c4f763	radv: re-enable TC-compat HTILE for MSAA D32S8 images on GFX9+ Should help MSAA games. Note that it's broken on GFX8 because the tiling doesn't match. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3868 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9284>	2021-03-05 08:44:40 +00:00
Xin He	97b196b921	virgl: use atomic operations when increase sub_ctx_id Use atomic operations to avoid competition. In addition, since sub_ctx_id 0 has been used by default, sub_ctx_id should start from 1. Signed-off-by: Xin He <hexin.op@bytedance.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9406>	2021-03-05 08:35:29 +00:00
Samuel Pitoiset	367a93830b	radv: skip useless FCE when fast-clearing MSAA images with DCC enabled The clear code is 0xCC which means CMASK isn't fast-cleared. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9392>	2021-03-05 08:11:28 +00:00
Samuel Pitoiset	6102507a74	radv: remove useless check about mips+layers for TC-compat HTILE images radv_use_htile_for_image() prevents it. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9405>	2021-03-05 08:10:19 +01:00
Samuel Pitoiset	438f65fb1e	radv: cleanup enabling TC-compat HTILE for depth surfaces It makes more sense to try to enable TC-compat if the image has HTILE. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9405>	2021-03-05 08:09:42 +01:00
Mike Blumenkrantz	55b57db84d	zink: add vk/spirv caps/extension for shader LAYER variable this is required if gl_Layer is used outside of GEOMETRY stage Fixes: `c77df59c9e` ("zink: export PIPE_CAP_TGSI_VS_LAYER_VIEWPORT") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9410>	2021-03-05 03:45:51 +00:00
Dave Airlie	1186fbcdf1	lavapipe: fix dynamic viewport/scissor pipeline emission Just fixup the tests for when the pipeline vp/scissors are emitted. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9422>	2021-03-05 03:34:47 +00:00
Dave Airlie	6bcd304278	lavapipe: fix pipeline vp/scissor mixup. Not copying all the scissors caused dEQP-VK.pipeline.extended_dynamic_state.two_draws_dynamic.2_viewports to fail but that test pointlessly relies on KHR_multiview (cts issue filed). Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Fixes: `b38879f8c5` ("vallium: initial import of the vulkan frontend") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9422>	2021-03-05 03:34:47 +00:00
Iván Briano	194e477615	anv: don't advertise mipmaps for linear 3D surfaces on BDW Prior to SKL, the mipmaps for 3D surfaces are laid out in a way that makes it impossible to represent in the way that VkSubresourceLayout expects. Since we can't tell users how to make sense of them, don't report them as available. "Fixes" dEQP-VK.image.subresource_layout.3d.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9419>	2021-03-04 16:23:23 -08:00
Ian Romanick	2c4fd24c01	nir/algebraic: Apply addition property of equality to the other ordering too Inequality comparison operations are not commutative, so `foo < bar` and `bar < foo` both have to be explicitly listed. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> All Intel GPUs had similar results. (Ice Lake shown) total instructions in shared programs: 20027051 -> 20026899 (<.01%) instructions in affected programs: 37181 -> 37029 (-0.41%) helped: 85 HURT: 0 helped stats (abs) min: 1 max: 20 x̄: 1.79 x̃: 1 helped stats (rel) min: 0.05% max: 6.78% x̄: 0.92% x̃: 0.68% 95% mean confidence interval for instructions value: -2.42 -1.15 95% mean confidence interval for instructions %-change: -1.23% -0.61% Instructions are helped. total cycles in shared programs: 979762793 -> 979753527 (<.01%) cycles in affected programs: 2653905 -> 2644639 (-0.35%) helped: 104 HURT: 50 helped stats (abs) min: 1 max: 1048 x̄: 119.99 x̃: 11 helped stats (rel) min: <.01% max: 9.88% x̄: 0.77% x̃: 0.20% HURT stats (abs) min: 1 max: 734 x̄: 64.26 x̃: 8 HURT stats (rel) min: <.01% max: 3.06% x̄: 0.36% x̃: 0.10% 95% mean confidence interval for cycles value: -98.65 -21.68 95% mean confidence interval for cycles %-change: -0.66% -0.15% Cycles are helped. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9374>	2021-03-04 22:50:53 +00:00
Ian Romanick	33031bdab6	nir/algebraic: Apply addition property of equality more conservatively This allows a lot more CSE. Depending on where the addition and the comparison are scheduled, it may also reduce register pressure by reducing the live range of the addends. Across all the platforms, the shaders affected for spills or fills were all fragment shaders from Dirt Rally. Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Tiger Lake and Ice Lake had similar results. (Tiger Lake shown) total instructions in shared programs: 21043103 -> 21038804 (-0.02%) instructions in affected programs: 892878 -> 888579 (-0.48%) helped: 1549 HURT: 724 helped stats (abs) min: 1 max: 225 x̄: 4.14 x̃: 2 helped stats (rel) min: 0.05% max: 11.18% x̄: 1.04% x̃: 0.78% HURT stats (abs) min: 1 max: 71 x̄: 2.93 x̃: 1 HURT stats (rel) min: 0.07% max: 6.90% x̄: 0.80% x̃: 0.56% 95% mean confidence interval for instructions value: -2.33 -1.45 95% mean confidence interval for instructions %-change: -0.50% -0.40% Instructions are helped. total cycles in shared programs: 855054155 -> 855757566 (0.08%) cycles in affected programs: 58275918 -> 58979329 (1.21%) helped: 1213 HURT: 1680 helped stats (abs) min: 1 max: 107405 x̄: 1684.00 x̃: 10 helped stats (rel) min: <.01% max: 38.09% x̄: 1.51% x̃: 0.25% HURT stats (abs) min: 1 max: 126632 x̄: 1634.59 x̃: 12 HURT stats (rel) min: <.01% max: 85.91% x̄: 2.75% x̃: 0.49% 95% mean confidence interval for cycles value: -98.06 584.35 95% mean confidence interval for cycles %-change: 0.71% 1.22% Inconclusive result (value mean confidence interval includes 0). 
total spills in shared programs: 9843 -> 9771 (-0.73%) spills in affected programs: 72 -> 0 helped: 5 HURT: 0 total fills in shared programs: 9600 -> 9451 (-1.55%) fills in affected programs: 149 -> 0 helped: 5 HURT: 0 LOST: 14 GAINED: 9 Skylake total instructions in shared programs: 18185074 -> 18183866 (<.01%) instructions in affected programs: 575180 -> 573972 (-0.21%) helped: 1286 HURT: 468 helped stats (abs) min: 1 max: 15 x̄: 1.55 x̃: 1 helped stats (rel) min: 0.03% max: 4.08% x̄: 0.67% x̃: 0.65% HURT stats (abs) min: 1 max: 8 x̄: 1.69 x̃: 1 HURT stats (rel) min: 0.13% max: 7.69% x̄: 0.87% x̃: 0.45% 95% mean confidence interval for instructions value: -0.77 -0.60 95% mean confidence interval for instructions %-change: -0.30% -0.22% Instructions are helped. total cycles in shared programs: 960518105 -> 960608234 (<.01%) cycles in affected programs: 42536073 -> 42626202 (0.21%) helped: 1210 HURT: 1714 helped stats (abs) min: 1 max: 7015 x̄: 123.41 x̃: 10 helped stats (rel) min: <.01% max: 33.76% x̄: 1.32% x̃: 0.26% HURT stats (abs) min: 1 max: 14474 x̄: 139.71 x̃: 14 HURT stats (rel) min: <.01% max: 58.94% x̄: 2.00% x̃: 0.44% 95% mean confidence interval for cycles value: 4.02 57.63 95% mean confidence interval for cycles %-change: 0.43% 0.82% Cycles are HURT. LOST: 16 GAINED: 42 Broadwell total instructions in shared programs: 17856880 -> 17852158 (-0.03%) instructions in affected programs: 564836 -> 560114 (-0.84%) helped: 1243 HURT: 418 helped stats (abs) min: 1 max: 115 x̄: 4.36 x̃: 1 helped stats (rel) min: 0.03% max: 9.67% x̄: 0.90% x̃: 0.67% HURT stats (abs) min: 1 max: 8 x̄: 1.67 x̃: 1 HURT stats (rel) min: 0.14% max: 7.69% x̄: 0.89% x̃: 0.46% 95% mean confidence interval for instructions value: -3.45 -2.23 95% mean confidence interval for instructions %-change: -0.51% -0.38% Instructions are helped. 
total cycles in shared programs: 1031140321 -> 1029856892 (-0.12%) cycles in affected programs: 66986946 -> 65703517 (-1.92%) helped: 1084 HURT: 1653 helped stats (abs) min: 1 max: 415168 x̄: 1835.32 x̃: 10 helped stats (rel) min: <.01% max: 57.16% x̄: 1.19% x̃: 0.28% HURT stats (abs) min: 1 max: 43930 x̄: 427.14 x̃: 12 HURT stats (rel) min: <.01% max: 57.53% x̄: 1.32% x̃: 0.39% 95% mean confidence interval for cycles value: -915.76 -22.07 95% mean confidence interval for cycles %-change: 0.17% 0.47% Inconclusive result (value mean confidence interval and %-change mean confidence interval disagree). total spills in shared programs: 20891 -> 20335 (-2.66%) spills in affected programs: 1567 -> 1011 (-35.48%) helped: 70 HURT: 0 total fills in shared programs: 27307 -> 25905 (-5.13%) fills in affected programs: 5381 -> 3979 (-26.05%) helped: 71 HURT: 0 LOST: 17 GAINED: 20 Haswell total instructions in shared programs: 16411850 -> 16409414 (-0.01%) instructions in affected programs: 602666 -> 600230 (-0.40%) helped: 1152 HURT: 781 helped stats (abs) min: 1 max: 103 x̄: 3.59 x̃: 1 helped stats (rel) min: 0.03% max: 8.61% x̄: 0.85% x̃: 0.65% HURT stats (abs) min: 1 max: 41 x̄: 2.18 x̃: 1 HURT stats (rel) min: 0.12% max: 7.69% x̄: 0.88% x̃: 0.69% 95% mean confidence interval for instructions value: -1.74 -0.78 95% mean confidence interval for instructions %-change: -0.21% -0.10% Instructions are helped. total cycles in shared programs: 1035338781 -> 1036977801 (0.16%) cycles in affected programs: 68961096 -> 70600116 (2.38%) helped: 1246 HURT: 2206 helped stats (abs) min: 1 max: 392022 x̄: 1040.28 x̃: 14 helped stats (rel) min: <.01% max: 56.44% x̄: 2.32% x̃: 0.38% HURT stats (abs) min: 1 max: 68630 x̄: 1330.56 x̃: 18 HURT stats (rel) min: <.01% max: 69.97% x̄: 3.31% x̃: 0.61% 95% mean confidence interval for cycles value: 90.43 859.17 95% mean confidence interval for cycles %-change: 1.02% 1.54% Cycles are HURT. 
total spills in shared programs: 17805 -> 17457 (-1.95%) spills in affected programs: 1202 -> 854 (-28.95%) helped: 34 HURT: 31 total fills in shared programs: 20939 -> 20387 (-2.64%) fills in affected programs: 2702 -> 2150 (-20.43%) helped: 34 HURT: 31 LOST: 24 GAINED: 45 Ivy Bridge and earlier Intel GPUs had similar results. (Ivy Bridge shown) total instructions in shared programs: 15515912 -> 15516757 (<.01%) instructions in affected programs: 396569 -> 397414 (0.21%) helped: 578 HURT: 858 helped stats (abs) min: 1 max: 9 x̄: 1.32 x̃: 1 helped stats (rel) min: 0.04% max: 3.70% x̄: 0.65% x̃: 0.65% HURT stats (abs) min: 1 max: 11 x̄: 1.87 x̃: 1 HURT stats (rel) min: 0.08% max: 12.90% x̄: 0.95% x̃: 0.53% 95% mean confidence interval for instructions value: 0.47 0.70 95% mean confidence interval for instructions %-change: 0.24% 0.37% Instructions are HURT. total cycles in shared programs: 584395455 -> 584466352 (0.01%) cycles in affected programs: 20346570 -> 20417467 (0.35%) helped: 1192 HURT: 1896 helped stats (abs) min: 1 max: 4108 x̄: 123.27 x̃: 14 helped stats (rel) min: <.01% max: 37.20% x̄: 2.27% x̃: 0.46% HURT stats (abs) min: 1 max: 3698 x̄: 114.89 x̃: 19 HURT stats (rel) min: <.01% max: 70.28% x̄: 3.02% x̃: 0.71% 95% mean confidence interval for cycles value: 10.75 35.16 95% mean confidence interval for cycles %-change: 0.73% 1.23% Cycles are HURT. LOST: 20 GAINED: 12 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9374>	2021-03-04 22:50:53 +00:00
Kenneth Graunke	206495cac4	iris: Enable u_threaded_context This implements most of the remaining u_threaded_context support. Most of the heavy lifting was done in the previous patches which fixed things up for the new thread safety requirements. Only a few things remain. u_threaded_context support can be disabled via an environment variable: GALLIUM_THREAD=0 On Felix's Tigerlake with the GPU at fixed frequency, enabling u_threaded_context improves performance of several games: - Civilization VI: +17% - Shadow of Mordor: +6% - Bioshock Infinite +6% - Xonotic: +6% Various microbenchmarks improve substantially as well: - GfxBench5 gl_driver2: +58% - SynMark2 OglBatch6: +54% - Piglit drawoverhead: +25% Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	c133d0930f	iris: Use thread safe slab allocators in transfer_map handling pipe->transfer_map can be called from u_threaded_context's thread rather than the driver thread. We need to use two different slab allocators, one for each thread. transfer_unmap, on the other hand, is only ever called from the driver thread. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	1b1c857248	iris: Make various classes inherit from u_threaded_context base classes u_threaded_context requires various objects to inherit from a new threaded_foo base class rather than directly from pipe_foo. This patch does most of the mechanical changes required for that. It also initializes the new threaded_resource fields. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	3358c7125a	iris: Use different shader uploaders for precompile vs. draw time When we enable u_threaded_context, the pipe->create_*_state hooks (precompile variants) are going to be called from one thread, while iris_update_compiled_shaders (on-the-fly variants) are going to be called from a driver thread. BLORP shaders also happen from clear, blit, and so on in the driver thread. u_upload_mgr isn't thread-safe, so use an uploader for each purpose. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	ec0d61c14c	iris: Support rebinding of stream output targets This enables us to replace the backing storage of resources that have been used as stream output targets, in case we're invalidating their entire contents. This can avoid stalls. We simply hadn't supported it because it was going to be tricky to re-emit 3DSTATE_SO_BUFFER without screwing up "reset offset to zero" vs. "keep appending". But that should be working fine with the previous patch's refactor. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
Kenneth Graunke	08e04ddd2c	iris: Rework zeroing of stream output buffer offsets The previous mechanism was a bit fragile. We stored the zero offset in the pre-baked packet, and used a flag to override 0xFFFFFFFF (append) offsets until our first emit - then prohibited anyone from trying to re-emit the packet by flagging IRIS_DIRTY_SO_BUFFERS, because that would re-emit the version with the zeroing of the offset. Now, we always store 0xFFFFFFFF in the pre-baked packet, and use a flag to override it to zero on the first emit. That way, we can re-emit that packet at any time, and it'll just keep appending. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:21 -08:00
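The reworked offset handling in 08e04ddd2c can be modelled in a few lines. This is a minimal Python sketch, not the iris code: `StreamOutputTarget`, `zero_offset` and `emit()` are hypothetical names chosen for illustration. Only the 0xFFFFFFFF append value and the "override to zero on the first emit" behaviour are taken from the commit message.

```python
APPEND = 0xFFFFFFFF  # offset value meaning "keep appending" (from the commit message)


class StreamOutputTarget:
    """Hypothetical model of one stream output target."""

    def __init__(self):
        # The pre-baked packet always stores the append offset.
        self.prebaked_offset = APPEND
        # Hypothetical flag: the next emit should reset the offset to zero.
        self.zero_offset = True

    def emit(self):
        """Return the offset that would be written by this emit."""
        if self.zero_offset:
            # Override the pre-baked value once, then clear the flag.
            self.zero_offset = False
            return 0
        return self.prebaked_offset


target = StreamOutputTarget()
offsets = [target.emit() for _ in range(3)]
```

Because the pre-baked value is always the append offset, any emit after the first one is safe to repeat, which is the property the message relies on.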
Kenneth Graunke	e40fafa991	iris: Defer stream output target space allocation until set time In the future, Marek is planning to make u_threaded_context call create_stream_output_target() from a different thread than the main driver thread, which means that we can't safely use uploaders there. To prepare for this eventual future, just defer the allocation of the offset BO 'til later. It's a very small amount of overhead. Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:20 -08:00
Kenneth Graunke	5659460af4	iris: Defer uploading of surface states With u_threaded_context, create_surface and create_sampler_view will be called from a different thread than the driver thread. They aren't allowed to access the context, which means that they can't use the uploaders there to upload our SURFACE_STATE entries. Thanks to backing-storage replacement and iris_rebind_buffer, we already reworked things to maintain CPU-side copies of the SURFACE_STATE entries and added the ability to upload or re-upload them later. So we can skip the upload at object creation time, and add a simple resource-is-NULL check at binding table upload time to ensure that they get uploaded by the time we need them. (They might get uploaded earlier due to rebinds or clear color updates, but this is the last moment to do so.) Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>	2021-03-04 13:59:20 -08:00
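The deferred upload in 5659460af4 amounts to lazy initialisation guarded by a NULL check. A minimal Python sketch under stated assumptions: `SurfaceState`, `upload_binding_table()` and the `upload` callback are hypothetical stand-ins, not iris functions.

```python
class SurfaceState:
    """Hypothetical surface or sampler view with a CPU-side SURFACE_STATE copy."""

    def __init__(self, cpu_copy):
        self.cpu_copy = cpu_copy  # kept on the CPU at creation time
        self.resource = None      # GPU-side upload; stays None until needed


def upload_binding_table(surfaces, upload):
    """Upload any surface state that has not been uploaded yet.

    `upload` is a hypothetical callback standing in for the context's uploader,
    so it must only be called from the driver thread.
    """
    for surface in surfaces:
        if surface.resource is None:  # the "resource-is-NULL" check
            surface.resource = upload(surface.cpu_copy)
    return [surface.resource for surface in surfaces]


uploaded = []


def fake_upload(data):
    uploaded.append(data)
    return ("gpu", data)


surfaces = [SurfaceState("a"), SurfaceState("b")]
first = upload_binding_table(surfaces, fake_upload)
second = upload_binding_table(surfaces, fake_upload)
```

Creating the objects touches no uploader, so creation can run on another thread. The second call performs no uploads because every resource is already set.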
Eric Anholt	3bdd39f03c	lima: avoid stomping over bound shader state when creating new shaders It shouldn't affect bound program state, and the current context state shouldn't be relevant for shader creation precompiles anyway (level load isn't going to have the eventual set of sampler views bound when you go to draw with that shader). Closes: #4306 Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>	2021-03-04 18:34:35 +00:00
Eric Anholt	4ac3f85054	lima: upload the shader to a BO at shader creation No need to conditionally upload later. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>	2021-03-04 18:34:35 +00:00
Eric Anholt	5a550c8dc7	lima: don't look at dirty bits for setup of FS key You always have to populate the key with the right texture swizzles, even if textures haven't changed since binding a new shader. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>	2021-03-04 18:34:35 +00:00
Eric Anholt	d4f706389c	lima: stop encoding the texture format in the shader key We can compose the swizzles at sampler view creation time, saving recompiles on texture format changes. Signed-off-by: Erico Nunes <nunes.erico@gmail.com> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>	2021-03-04 18:34:34 +00:00
Lionel Landwerlin	8023d6de20	anv: implement INTEL_DEBUG=submit Name all the BOs! v2: Fix 32bit build issue (Thanks Marge!) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5736>	2021-03-04 19:46:24 +02:00
Rohan Garg	c6eb84ff30	virgl: Add support for querying detailed memory info This allows for virgl guests to expose GL_NVX_gpu_memory_info and GL_ATI_meminfo when the extensions are supported on the host. Signed-off-by: Rohan Garg <rohan.garg@collabora.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9337>	2021-03-04 17:14:14 +01:00
Jason Ekstrand	1e53e0d2c7	intel/mi_builder: Drop the gen_ prefix mi_ is already a unique prefix in Mesa so the gen_ isn't really gaining us anything except extra characters. It's possible that MI_ may conflict a tiny bit with GenXML but it doesn't seem to be a problem today and we can deal with that in the future if it's ever an issue. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9393>	2021-03-04 15:14:27 +00:00
Jason Ekstrand	6d522538b6	intel: Rename gen_mi_builder.h to mi_builder.h Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9393>	2021-03-04 15:14:27 +00:00
Danylo Piliaiev	7e25e5b56f	ir3: disallow moving memory writes over discard Writes to global memory should not be moved over discard, otherwise we could have unintended side-effects or lack of side-effects where they should be observed. Fixes tests: dEQP-VK.rasterization.frag_side_effects.color_at_beginning.kill dEQP-VK.rasterization.frag_side_effects.color_at_end.kill Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9365>	2021-03-04 11:40:58 +00:00
Juan A. Suarez Romero	7b3b8524ef	ci: Bump deqp to vk-gl-cts 1.2.5.2 Reviewed-by: Eric Anholt <eric@anholt.net> Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9369>	2021-03-04 11:09:35 +00:00
Danylo Piliaiev	72a9f315db	ir3: make mark_kill_path exit early if instr is already seen Would bring down its complexity in pathological cases. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9386>	2021-03-04 10:52:06 +00:00
Danylo Piliaiev	9dbb678f5a	ir3: prevent duplication of instruction's dependencies Otherwise mark_kill_path() is happy to take exponential time to finish. It was possible to have such chains: ... stib.base0 imm[0.000000,0,0x0], ssa_233, ssa_234, false-deps:ssa_231, ssa_231 stib.base0 imm[0.000000,0,0x0], ssa_237, ssa_238, false-deps:ssa_235, ssa_235 stib.base0 imm[0.000000,0,0x0], ssa_241, ssa_242, false-deps:ssa_239, ssa_239 stib.base0 imm[0.000000,0,0x0], ssa_245, ssa_246, false-deps:ssa_243, ssa_243 stib.base0 imm[0.000000,0,0x0], ssa_249, ssa_250, false-deps:ssa_247, ssa_247 stib.base0 imm[0.000000,0,0x0], ssa_105, ssa_253, false-deps:ssa_251, ssa_251 stib.base0 imm[0.000000,0,0x0], ssa_109, ssa_256, false-deps:ssa_254, ssa_254 stib.base0 imm[0.000000,0,0x0], ssa_113, ssa_259, false-deps:ssa_257, ssa_257 stib.base0 imm[0.000000,0,0x0], ssa_117, ssa_262, false-deps:ssa_260, ssa_260 stib.base0 imm[0.000000,0,0x0], ssa_265, ssa_266, false-deps:ssa_263, ssa_263 stib.base0 imm[0.000000,0,0x0], ssa_269, ssa_270, false-deps:ssa_267, ssa_267 stib.base0 imm[0.000000,0,0x0], ssa_273, ssa_274, false-deps:ssa_271, ssa_271 ... Fixes tests: dEQP-VK.geometry.layered.cube_array.36_36_12.secondary_cmd_buffer_inherit_framebuffer dEQP-VK.geometry.layered.3d.64_64_8.secondary_cmd_buffer_inherit_framebuffer dEQP-VK.geometry.layered.cube_array.64_64_12.secondary_cmd_buffer_inherit_framebuffer Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9386>	2021-03-04 10:52:06 +00:00
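The exponential behaviour described in 9dbb678f5a, and the early exit added in 72a9f315db, can be reproduced with a toy dependency walk. This is a minimal Python sketch and not the ir3 code: `count_visits()` and `duplicated_chain()` are hypothetical helpers, and the chain shape (each instruction listing its predecessor twice) is an assumption modelled on the duplicated entries in the listing above.

```python
def count_visits(deps, start, track_seen):
    """Walk dependencies from `start` and count how many nodes get processed."""
    seen = set()
    visits = 0
    stack = [start]
    while stack:
        node = stack.pop()
        if track_seen:
            if node in seen:
                continue  # early exit: this instruction was already handled
            seen.add(node)
        visits += 1
        stack.extend(deps[node])
    return visits


def duplicated_chain(length):
    """Instruction i depends on instruction i - 1 twice; instruction 0 has no deps."""
    return {i: [i - 1, i - 1] if i > 0 else [] for i in range(length)}


chain = duplicated_chain(16)
naive = count_visits(chain, 15, track_seen=False)
with_seen = count_visits(chain, 15, track_seen=True)
```

With 16 instructions the naive walk processes 2^16 - 1 nodes, while the walk that tracks seen instructions processes each one once. Removing the duplicated dependency entries has the same effect on this particular chain.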
Samuel Pitoiset	517600b4d5	Revert "radv: stop using VM_ALWAYS_VALID on APUs" Disabling VM_ALWAYS_VALID actually hurts more than it helps after doing more testing. Managing the global BO list in userspace is really costly and makes a bunch of games CPU bound. I think re-enabling VM_ALWAYS_VALID is a step in the right direction. This reverts commit `6ac6e2fbfb`. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9341>	2021-03-04 09:37:59 +00:00
Gert Wollny	e148d5ec99	r600/sfn: lower intrinsic_load_tess_coord to driver version Fixes KHR-GL45.tessellation_shader.tessellation_shader_tessellation.TCS_TES KHR-GL45.tessellation_shader.tessellation_shader_tessellation.TES Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9373>	2021-03-04 09:14:03 +00:00
Gert Wollny	81b41e0c76	nir: Add r600 specific intrinsic for loading the tessellation coords Only the XY pair is provided directly, the Z value has to be deduced from the primitive type. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9373>	2021-03-04 09:14:03 +00:00
cheyang	6f4c4df6c2	virgl: add astc 2d compressed formats Signed-off-by: cheyang <cheyang@bytedance.com> Signed-off-by: hexin <hexin.op@bytedance.com> Reviewed-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9306>	2021-03-04 09:03:47 +00:00
Iago Toral Quiroga	c3732ac0d0	broadcom/compiler: be more aggressive skipping unifa writes We had an optimization in place to skip a unifa write if the address happens to be right after the last ldunifa read address, but we can take this further and update the unifa address by emitting ldunifa instructions if needed to skip a unifa write that is close enough. This is because a unifa write involves 4 cycles: 1 for the write and 3 delay slots before we can emit the first ldunifa. So if we have code like this: unifa addr + 0 ldunifa.r0 unifa addr + 12 ldunifa.r1 In practice we end up with QPU like this: unifa addr + 0 nop nop nop ldunifa.r0 unifa addr + 12 nop nop nop ldunifa.r1 And with this patch we get: unifa addr + 0 nop nop nop ldunifa.r0 <--- reads offset 0 ldunifa.- <--- reads offset 4 ldunifa.- <--- reads offset 8 ldunifa.r1 <--- reads offset 12 Of course, QPU scheduling might find ways to fill the NOPs to some extent and remove some of the gains, but generally speaking, this is still usually a win. Going by shader-db results, allowing the next unifa address to be up to 12 bytes after the address resulting from the last ldunifa read shows the best results: total instructions in shared programs: 13817048 -> 13812202 (-0.04%) instructions in affected programs: 602701 -> 597855 (-0.80%) helped: 1750 HURT: 760 Instructions are helped. total uniforms in shared programs: 3795485 -> 3793200 (-0.06%) uniforms in affected programs: 43930 -> 41645 (-5.20%) helped: 898 HURT: 0 Uniforms are helped. total max-temps in shared programs: 2326612 -> 2326621 (<.01%) max-temps in affected programs: 651 -> 660 (1.38%) helped: 10 HURT: 21 Inconclusive result (value mean confidence interval includes 0). total sfu-stalls in shared programs: 30942 -> 30906 (-0.12%) sfu-stalls in affected programs: 627 -> 591 (-5.74%) helped: 186 HURT: 158 Inconclusive result (value mean confidence interval includes 0). 
total inst-and-stalls in shared programs: 13847990 -> 13843108 (-0.04%) inst-and-stalls in affected programs: 601404 -> 596522 (-0.81%) helped: 1747 HURT: 757 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9384>	2021-03-04 09:00:15 +01:00
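The decision described in c3732ac0d0 can be sketched as a small helper. This is a minimal Python model, not the broadcom compiler code: `dummy_loads_to_skip_write()` is a hypothetical name. The 4-byte step per ldunifa is inferred from the "reads offset 0/4/8/12" annotations, the 12-byte limit and the 4-cycle write cost come from the message, and the whole-word check is an added assumption.

```python
LDUNIFA_STRIDE = 4    # each ldunifa advances the address by 4 bytes (offsets 0, 4, 8, 12)
MAX_SKIP_BYTES = 12   # limit reported as giving the best shader-db results
UNIFA_WRITE_COST = 4  # 1 cycle for the write plus 3 delay slots


def dummy_loads_to_skip_write(current_addr, wanted_addr):
    """Return how many destination-less ldunifa to emit instead of a unifa
    write, or None when a real unifa write is needed.

    current_addr is the address the next ldunifa would read from, i.e. the
    address resulting from the last ldunifa read.
    """
    distance = wanted_addr - current_addr
    if distance < 0 or distance > MAX_SKIP_BYTES:
        return None
    if distance % LDUNIFA_STRIDE != 0:
        return None  # assumption: only whole-word gaps can be bridged
    return distance // LDUNIFA_STRIDE


# The example from the message: after "unifa addr + 0; ldunifa.r0" the next
# read would come from offset 4, and the shader wants offset 12.
skipped = dummy_loads_to_skip_write(4, 12)
```

At the 12-byte limit the helper returns 3 dummy loads, which is still cheaper than the 4 cycles a unifa write occupies when the scheduler cannot fill the delay slots.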
Iago Toral Quiroga	2897a83ff8	broadcom/compiler: drop the destination for unused ldunifa We can't remove unused ldunifa that are not the first or last in a sequence, but we can still ignore their destination to reduce register pressure. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9384>	2021-03-04 09:00:15 +01:00
Timothy Arceri	9d1ef1595c	util/disk_cache: make MESA_DISK_CACHE_READ_ONLY_FOZ_DBS a relative path Rather than passing in full paths this changes things so that we can just pass in filenames relative to the current cache directory. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9279>	2021-03-04 04:07:46 +00:00
Eric Anholt	a8423eb732	ci/turnip: Mark a flaky WSI test. This one has flaked many times at this point, and I've even seen it flake locally. No luck debugging it yet. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9397>	2021-03-03 23:03:48 +00:00
Rob Clark	f8714b2852	freedreno: Remove dead-cells MBR workaround With threaded-context we won't have a chance to apply the workaround in the backend driver. But the previous commit moves it to a driconf configured workaround in mesa/st, so we can drop this now. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9316>	2021-03-03 22:47:59 +00:00
Rob Clark	e6f2e8b3fc	driconf: Add ignore_map_unsynchronized option Add an option to workaround incorrect unsynchronized VBO updates in Dead-Cells. See: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4337 Signed-off-by: Rob Clark <robdclark@chromium.org> Reviewed-by: Eric Anholt <eric@anholt.net> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9316>	2021-03-03 22:47:59 +00:00
Mike Blumenkrantz	3c20b698e2	zink: rewrite macro for getting KHR device functions we have the technology. we can improve our lives with better macros. Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9398>	2021-03-03 17:27:22 -05:00
Rob Clark	910a2464cf	freedreno/a6xx: Fix compile warning Fixes: `79921b81bc` ("freedreno/a6xx: Document threadsize-related fields") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9394>	2021-03-03 22:09:22 +00:00
Rob Clark	8642456472	freedreno: Deduplicate fixup_shader_state() All the ir3 gens had the same thing, time to move it out into a shared helper. Keeping the storage in fdN_context is to avoid namespace clashes between ir3 and ir2. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9394>	2021-03-03 22:09:22 +00:00
Rob Clark	1611693977	freedreno/ir3: Add comments about shader key/gen I had forgotten which gens these were used on (which is important if you need to know which shader stages use these), so expand the comments a bit. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9394>	2021-03-03 22:09:22 +00:00
Dave Airlie	bc02fc4823	clover: fix array images view creation Found this on top of Karol's patches but it seems like it can just be applied to master. Helps with some cases of kernel_image_methods/test_kernel_image_methods 2Darray Reviewed-by: Karol Herbst <kherbst@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9381>	2021-03-03 21:59:22 +00:00
Eric Anholt	18be15ad16	ci/zink: Add another primitive restart flake. This one flaked all the way to a run failure in a recent MR of mine. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9396>	2021-03-03 21:49:41 +00:00
Eric Anholt	283a05ddc9	ci/a5xx: Update piglit expectations. The mesa/st shader variants change fixed some fails for us. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9314>	2021-03-03 21:05:39 +00:00
Eric Anholt	957132294f	ci/a5xx: Increase the gles3/31 coverage. Now that there's more time available in our budget per board, we can run all of gles31, and half of gles3, instead of 10%. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9314>	2021-03-03 21:05:39 +00:00
Eric Anholt	1087bf16af	ci/a3xx: Run all of GLES3 dEQP. We're not spending half our time booting any more, so run the other half. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9314>	2021-03-03 21:05:39 +00:00
Eric Anholt	bb82efa792	ci/a5xx: Run all of gles2 in one job. Now that we're not spending so much time on boot overhead, no need to parallelize. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9314>	2021-03-03 21:05:39 +00:00
Eric Anholt	bcdfee3bcd	ci/freedreno: Switch the fastboot boards to using nfsroot. This saves time in packing the rootfs, allows for larger rootfses, and avoids the need for webdav. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9314>	2021-03-03 21:05:39 +00:00
Eric Anholt	e2aff7425d	tgsi_exec: Jump over entirely non-taken THEN or ELSE branches. TGSI has these nice labels for us for where to jump in this case, let's use them. Improves piglit arb_shader_image_load_store-shader-mem-barrier runtime massively, though not enough to make the test really reasonable to run. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9347>	2021-03-03 20:47:08 +00:00
Eric Anholt	3429c83f87	tgsi_exec: Roll the loops for condmask handling. No need to hand-unroll this, the compiler will do it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9347>	2021-03-03 20:47:08 +00:00
Ilia Mirkin	ac6aad3d59	i965: support GL_EXT_color_buffer_half_float FP16 rendering is supported on all gen4 hardware. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9379>	2021-03-03 20:37:03 +00:00
Marek Olšák	a0cc0b3a15	ac/llvm: open code fpow on LLVM 12 using fmul.legacy A quick look at the asm shows that this enables source modifiers (neg, abs) for v_mul_legacy_f32. Totals from affected shaders: SGPRS: 110104 -> 110400 (0.27 %) VGPRS: 57632 -> 57636 (0.01 %) Spilled SGPRs: 66 -> 63 (-4.55 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 3290412 -> 3283068 (-0.22 %) bytes Max Waves: 32141 -> 32141 (0.00 %) Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395>	2021-03-03 20:06:09 +00:00
Marek Olšák	18c1c1404d	ac/llvm: add type parameter into ac_build_buffer_load to fix 16-bit TES inputs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395>	2021-03-03 20:06:09 +00:00
Marek Olšák	ed351b9a71	ac/llvm: fix visit_load_ubo_buffer to use SMEM for 16 bits instead of VMEM This has 3 advantages: - It's SMEM. - Multiple single component loads are merged into 1 multi-dword load by LLVM. - The result is always packed for packed instructions. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395>	2021-03-03 20:06:09 +00:00
Marek Olšák	46ce67a331	ac/llvm: implement 16-bit and 64-bit fpow correctly LLVM converts to 32 bits and back for llvm.pow, so we can't use it. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395>	2021-03-03 20:06:09 +00:00
Marek Olšák	3475c79328	ac/llvm: add support for 16-bit source operands for samplers Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9395>	2021-03-03 20:06:09 +00:00
Ian Romanick	c393ae9d84	nir/search: Constify instruction parameter to search helpers The search helpers must never modify the instruction passed in, so let the compiler enforce this. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9378>	2021-03-03 18:32:14 +00:00
Lionel Landwerlin	0f437e49c6	anv: fix missing general state pool in validation list Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Fixes: `83fee30e85` ("anv: allow multiple command buffers in anv_queue_submit") Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9388>	2021-03-03 18:24:16 +00:00
Eric Anholt	f3f4a24549	ci/lava: Move the driver expectation files to the per-driver CI dir. This will cause less retesting of other drivers when changing the dEQP results for a driver. Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9353>	2021-03-03 18:08:11 +00:00
Eric Anholt	9f03ee7773	ci/lava: Move the per-driver gitlab-ci.yml to each driver. Follow-up to !9139, will cause less testing of other drivers when changing the CI configuration for a single driver. Reviewed-by: Michel Dänzer <mdaenzer@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9353>	2021-03-03 18:08:11 +00:00
Samuel Pitoiset	578fc7dbbc	radv: fix RGP barrier layout transition for TC-compatible CMASK images Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9390>	2021-03-03 16:49:29 +00:00
Adam Jackson	69f3d3a29f	zink: Enable GL_EXT_depth_bounds_test Available since Vulkan 1.0, and in fact already wired up, just not advertised. It looks like we could make this dynamic state but this works for now. Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9371>	2021-03-03 16:17:11 +00:00
Rhys Perry	21697082ec	radv: don't shrink image stores for The Surge 2 The game seems to declare the wrong format. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `e4d75c22` ("nir/opt_shrink_vectors: shrink image stores using the format") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4347 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9229>	2021-03-03 14:18:37 +00:00
Rhys Perry	cbb5ed476c	nir/opt_shrink_vectors: add option to skip shrinking image stores Some games declare the wrong format, so we might want to disable this optimization in that case. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Fixes: `e4d75c22` ("nir/opt_shrink_vectors: shrink image stores using the format") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9229>	2021-03-03 14:18:37 +00:00
Danylo Piliaiev	4600dbc6cc	turnip: fix leak of tu_shader object during compute pipeline creation tu_shader should be freed after pipeline is successfully created. Fixes tests: dEQP-VK.api.object_management.alloc_callback_fail.compute_pipeline dEQP-VK.api.object_management.alloc_callback_fail_multiple.compute_pipeline Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9364>	2021-03-03 10:41:29 +00:00
Samuel Pitoiset	b33792b794	radv: bump the initial SQTT buffer size to 32MB per SE Most of the games need 32MB or more, but rarely less. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9367>	2021-03-03 08:40:32 +01:00
Samuel Pitoiset	6813b52290	radv: trigger a new SQTT capture automatically after resizing the buffer It's way better. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9367>	2021-03-03 08:40:32 +01:00
Samuel Pitoiset	0a1e3cc1cb	radv: double the SQTT buffer size when it is resized Computing the expected buffer size isn't reliable on GFX10+ because DROPPED_CNTR returns weird results. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9367>	2021-03-03 08:40:32 +01:00
Samuel Pitoiset	c0608bb083	ac/sqtt: fix determining if the trace is complete on GFX10+ DROPPED_CNTR isn't reliable and might still report non-zero if the SQTT buffer isn't full. Checking if the number of written bytes by the hw is equal to the SQTT buffer size seems reliable. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9367>	2021-03-03 08:40:32 +01:00
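Taken together, the four SQTT commits from merge request 9367 describe a retry loop: start at 32MB per SE, treat a completely filled buffer as a truncated trace, double the buffer, and capture again. A minimal Python sketch of that reading, with hypothetical names throughout (`trace_is_complete()`, `capture()` and the `run_trace` callback are not radv functions):

```python
INITIAL_SIZE = 32 * 1024 * 1024  # 32MB per shader engine, per the message


def trace_is_complete(bytes_written, buffer_size):
    """Assume a trace that filled the whole buffer was truncated."""
    return bytes_written != buffer_size


def capture(run_trace, buffer_size=INITIAL_SIZE):
    """Retry the capture with a doubled buffer until the trace fits.

    run_trace is a hypothetical callback that takes the buffer size and
    returns the number of bytes the hardware wrote.
    """
    while True:
        written = run_trace(buffer_size)
        if trace_is_complete(written, buffer_size):
            return written, buffer_size
        buffer_size *= 2  # double instead of estimating from DROPPED_CNTR


def fake_hw(needed):
    """Pretend hardware that writes `needed` bytes, capped at the buffer size."""
    return lambda size: min(needed, size)


written, size = capture(fake_hw(100 * 1024 * 1024))
```

A workload that needs 100MB of trace data fails at 32MB and 64MB and succeeds at 128MB, without ever consulting a dropped-packet counter.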
Samuel Pitoiset	f4c4c0f207	radv: do not trace inactive shader engines with SQTT This fixes a GPU hang on my Sienna because the number of SE is less than the maximum, and SE #1 is disabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9370>	2021-03-03 08:16:42 +01:00
Mike Blumenkrantz	bc5dcf1527	zink: ci updates Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9291>	2021-03-03 01:37:02 +00:00
Mike Blumenkrantz	587d15ca6c	zink: use staging resource for write transfer_map in order to not stall we can just give the user a staging resource and then flush the data back later Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9291>	2021-03-03 01:37:02 +00:00
Marek Olšák	db67d9c0d1	radeonsi: don't crash on NULL images in si_check_needs_implicit_sync This fixes CTS test: KHR-GL46.arrays_of_arrays_gl.AtomicUsage Fixes: `bddc0e023c` "radeonsi: fix read from compute / write from draw sync" Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9361>	2021-03-03 01:19:24 +00:00
Marek Olšák	f9e6c7a220	ac/llvm: fix ac_build_atomic_rmw with LLVM 13 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4383 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9361>	2021-03-03 01:19:24 +00:00
Eric Anholt	8bd0cc1a5a	nir/vec_to_movs: Don't generate MOVs for undef channels. This appeared in softpipe's image operations, since NIR always uses 4-component values for the coords, while the GLSL IR only has 2 components for a 2D image (for example). arb_shader_image_load_store-shader-mem-barrier (which times out in CI and spends its time inside of tgsi_exec) was spending 4/51 of its instructions on moving these undefs around. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9345>	2021-03-03 00:51:44 +00:00
Eric Anholt	1e5ef4c60c	nir: Add a nir_src_is_undef() helper, like nir_src_is_const(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9345>	2021-03-03 00:51:44 +00:00
Mike Blumenkrantz	c77df59c9e	zink: export PIPE_CAP_TGSI_VS_LAYER_VIEWPORT Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9283>	2021-03-02 17:42:00 -05:00
Mike Blumenkrantz	ffd046cf32	zink: enable PIPE_CAP_CLEAR_SCISSORED Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9283>	2021-03-02 17:42:00 -05:00
Dave Airlie	abc724e440	lavapipe: sort bindings before creating descriptor set This ensures the dynamic offsets are correct Fixes: `b38879f8c5` ("vallium: initial import of the vulkan frontend") Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9359>	2021-03-03 08:06:02 +10:00
Dave Airlie	0a939e788f	lavapipe: reorder descriptor set stages to get correct binding The fragment stage was in the wrong place here. Fixes: `b38879f8c5` ("vallium: initial import of the vulkan frontend") Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9359>	2021-03-03 08:02:16 +10:00
Ian Romanick	7ca3e90c18	gallium/dri: Remove dri2_format_mapping::cpp I was suspicious that some entries in dri2_format_table (in dri_helpers.c) had this field set incorrectly. It seemed like DRM_FORMAT_ABGR16161616F and DRM_FORMAT_XBGR16161616F should have been 8 instead of 4. Upon digging I found that nothing uses the field. Fix code by removing it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9354>	2021-03-02 19:42:04 +00:00
Karol Herbst	f0dccd9578	clover: Add missing include for llvm-12 build fix Fixes: `d1eab2b1eb` ("clover: Fix build with llvm-12.") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9372>	2021-03-02 19:35:40 +00:00
Mike Blumenkrantz	1294aec650	zink: apply only the pending zs clear bits during deferred clears both bits will have been flagged at this point in order to indicate that the aspects will be cleared "at some point" during the loop, but when actually iterating through the pending clears, only the bits set in the clear call should be applied Fixes: `5c629e9ff2` ("zink: defer pipe_context::clear calls when not currently in a renderpass") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9366>	2021-03-02 19:24:52 +00:00
Axel Davy	e891f039da	st/nine: Simplify checks for driconf options Remove the useless driCheckOption calls. They always succeed. As a result the intended behaviour for thread_submit was not working (different default depending on the gpu used). Add a comment to fix that in the future. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:08 +01:00

126030 Commits