KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Jesse Natalie	cc8219d1b4	d3d12: Enable compute Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	f399378c52	d3d12: Run DXIL shared atomic lowering pass Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	9f67f432d7	d3d12: Handle indirect dispatch Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	9cc6b17c8a	d3d12: Implement num workgroups as a state var Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	65a16a568c	d3d12: Implement launch_grid Some more refactoring in d3d12_draw.cpp to re-use a bunch of state and descriptor management, and some refactoring of the dirty states. Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	570a042a94	d3d12: Hook up compute shader variations Currently only variable workgroup size is implemented Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	5f23b1d7cd	d3d12: Support compute root signatures Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	6d38a35afb	d3d12: Compile, bind, and cache compute PSOs Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	e350d1ab09	d3d12: Stop trying to set D3D12_DIRTY_SHADER during bindings We don't key off of it to try to figure out if we need to produce a new shader variant, so there's no need to set it when changing properties that feed into variants. If we do have a new shader or variant at draw time, we'll produce a new PSO without this. Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	944c72ae4d	d3d12: Remove draw_info from selection_context It's not needed, and having it there can be misleading since sometimes it's null Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	fbc1d90f19	d3d12: Keep state vars last in the per-stage root parameters Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	166cd05071	d3d12: Limit sampler view count to 32 Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	2837e67b9b	microsoft/compiler: Handle more GL memory barriers Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Jesse Natalie	fd50ef046b	microsoft/compiler: Move workgroup_size lowering from clc It doesn't depend on the clc data being provided externally, so no need to tie it there, we can re-use it for GL and Vulkan compute. Reviewed-by: Sil Vilerino <sivileri@microsoft.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14367>	2022-01-11 01:36:56 +00:00
Rob Clark	5e18aafd26	freedreno: Report system memory as video memory This seems to be the approach that other UMA drivers have settled on, when there aren't some other constraints. Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5675 Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14478>	2022-01-11 01:15:31 +00:00
Emma Anholt	3563ae4b2d	nir_to_tgsi: Fix a bug in TXP detection after backend lowering. TGSI reserves 2 components for the coord in the first operand vector, even for 1D. Fixes r600 failure with shadow1d. Fixes: `390a3fcdc4` ("nir_to_tgsi: Add support for TXP.") Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14322>	2022-01-11 00:53:39 +00:00
Francisco Jerez	8e21cad39b	intel/xehp: Implement XeHP workaround Wa_14014148106. Actually, no, there's no need to do anything, just update some comments for the record. An earlier revision of this change that implemented the workaround text to the letter required no less than 8 new PIPE_CONTROLs throughout the tree. However Felix Degrood noticed that the cost of some of the PIPE_CONTROLs was showing up in workloads like Shadow of the Tomb Raider. The Windows driver wasn't emitting many of those pipe controls, contrary to the W/A instructions, so we engaged in a back and forth with the hardware team, who concluded that the original suggested workaround was unnecessarily strict, and the Windows driver's behavior acceptable. It turns out that Wa_1408224581 we had already implemented for TGL is roughly equivalent to the Windows behavior, so no need to do anything new after all. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14278>	2022-01-11 00:17:32 +00:00
Francisco Jerez	eeb3f4594d	intel/xehp: Implement XeHP workaround Wa_14013910100. XeHP platforms require the invalidation of the instruction cache after a STATE_BASE_ADDRESS change due to a hardware bug potentially leading to instruction cache pollution. Note that the workaround text says it's applicable "DG2 128/256/512-A/B", however it's also marked as permanent and not confirmed to be fixed in any specific stepping, so we apply it to all Gfx12HP platforms. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Rohan Garg <rohan.garg@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14278>	2022-01-11 00:17:32 +00:00
Alyssa Rosenzweig	b550b3c89c	vc4: Use u_box_pixels_to_blocks helper Eliminates an ETC1 special case. In fact this unit conversion applies to all formats; the original code path works since ETC1 is the only format with blocks bigger than 1x1 supported by vc4 (I assume). Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>	2022-01-10 23:16:56 +00:00
Alyssa Rosenzweig	6f07159a1d	v3d: Use u_box_pixels_to_blocks helper Rather than open-coding. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Reviewed-by: Juan A. Suarez <jasuarez@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>	2022-01-10 23:16:56 +00:00
Alyssa Rosenzweig	b920ace4bc	lima,panfrost: Correct pixel vs block mismatches Different parts of our codebase disagree on whether spatial coordinates/dimensions are given in pixels or blocks, which differ by a constant factor for block-compressed formats. This disagreement manifests as incorrect results accessing block-compressed formats. To resolve this, define the public tiling routines to take their coordinates in pixels, and align the relevant code in Panfrost accordingly. Fixes rendering glitches in Factorio, as well as a pile of piglits on Panfrost. It should also fix glTexSubImage() with ETC1 on Lima, but there are no tests for this in dEQP/Piglit. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> Tested-by: Vasily Khoruzhick <anarsoul@gmail.com> [dEQP/Lima] Tested-by: Erico Nunes <nunes.erico@gmail.com> [Piglit/Lima] Reported-by: Icecream95 <ixn@disroot.org> Closes: #5560 Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>	2022-01-10 23:16:56 +00:00
Alyssa Rosenzweig	26c533f167	gallium/util: Add pixel->blocks box helper There is a lot of unit confusion in Gallium due to pixels versus blocks matching only with uncompressed textures. Add a helper to do a common pixels->blocks unit conversion required in multiple drivers. v2: Rename dst->blocks, src->pixels to avoid confusion about the units to casual readers (Mike). Note to mesa-stable maintainers: this is marked as Cc: mesa-stable so the next patch (a set of bug fixes for Lima and Panfrost) can be backported. It's not a bug fix in its own right, of course. Signed-off-by: Alyssa Rosenzweig <alyssa@collabora.com> Reviewed-by: Emma Anholt <emma@anholt.net> [v1] Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14370>	2022-01-10 23:16:56 +00:00
Thomas H.P. Andersen	7daba1fe65	replace 0 with NULL for NULL pointers This updates many places where 0 is used as NULL pointer. There are a few warnings left when I build the default configuration but they either relate to code outside of mesa or where "None" is used instead. Found with static analysis (smatch) Reviewed-by: Matt Turner <mattst88@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12174>	2022-01-10 22:53:32 +00:00
Rhys Perry	60c711833f	aco: remove pack_half_2x16(a, 0) optimization This makes the compiler less predictable and should only have a very small effect on performance. fossil-db (Vega): Totals from 2410 (1.79% of 134756) affected shaders: CodeSize: 6911568 -> 6942840 (+0.45%) Fixes Horizon Zero Dawn artifacts. If a shader has: a = pack_half_2x16(a, 0) //rtne store(pack_half_2x16(0, b) | a) //rtne a = unpack_2x16(a).x It will become: store(pack_half_2x16(a, b)) //rtz a = unpack_2x16(pack_half_2x16(a, 0)).x //rtne So a later shader with "unpack_2x16(load()).x" will use "a" rounded to zero, while the previous shader will use "a" rounded to the nearest even. Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `2f125908b3` ("radv,aco: lower_pack_half_2x16") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14475>	2022-01-10 22:19:29 +00:00
Christian Gmeiner	6e08d8fc3d	ci: Uprev piglit to af1785f31 Brings in these changes: af1785f31 occlusion_query_conform: skip GetQueryCounterBits test if needed dad078717 occlusion_query_conform: convert to pilgit subtests b52c1c761 glsl-1.30: test nested preprocessor concat 6c4da153b texture-storage: Fix subtest result handling of skips. 4343f19db fbo-integer: Remove the invalid DrawPixels test. e3842f2fe arb_dsa: exclude stencil8 textures from test sets. ce8649be7 spec/ext_external_objects: Fix build on Debian systems 4e553838f glsl: add basic tests for desktop GLSL invariant qualifier linking 7e61e5199 Tests for variable in and out of loop scope f855ad1c8 fbo-mrt-alphatest: Only require GLSL 1.20 9be2fe999 glx: add glx-multi-display-single-pbuffer test bfe290725 glx: add glx-swap-pbuffer test efa64335e framework: Fix build on Windows when using waffle Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14468>	2022-01-10 21:52:42 +00:00
Jordan Justen	0fc93928f1	isl: Don't enable HDC:L1 caches on DG2 The MOCS entry used for this on Tigerlake doesn't exist on DG2. Ref: `aca31baafc` ("isl: Enable Tigerlake HDC:L1 caches via MOCS in various cases.") Suggested-by: Kenneth Graunke <kenneth@whitecape.org> Signed-off-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14467>	2022-01-10 21:20:03 +00:00
Rhys Perry	67fc7a1763	nir/uniform_atomics: fix is_atomic_already_optimized without workgroups dims_needed would have been zero, so this would always return true for non-compute stages. Also fix this for variable workgroup sizes. Improves Shadow of the Tomb Raider RX 6800 performance by 10.6%, 11.5% and 4.5% (day_of_dead, jungle and paititi scenes). radv_perf before and after: {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '62.913333333333334', 'min_fps': '62.81', 'max_fps': '62.98', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '64.02666666666666', 'min_fps': '63.93', 'max_fps': '64.11', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '74.81666666666666', 'min_fps': '74.72', 'max_fps': '74.88', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '69.57', 'min_fps': '69.52', 'max_fps': '69.63', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '71.41000000000001', 'min_fps': '71.31', 'max_fps': '71.5', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '78.16666666666667', 'min_fps': '78.07', 'max_fps': '78.23', 'interations': '3'} Performance now seems slightly better than AMDVLK 2021.Q4.3: {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'day_of_dead', 'avg_fps': '68.02666666666666', 'min_fps': '67.95', 'max_fps': '68.16', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'jungle', 'avg_fps': '70.24666666666667', 'min_fps': '69.83', 'max_fps': '70.51', 'interations': '3'} {'app': 'SotTR', 'resolution': '3840x2160', 'preset': 'VeryHigh', 'antialiasing': 'off', 'scene': 'paititi', 'avg_fps': '77.19', 'min_fps': '77.18', 'max_fps': '77.2', 'interations': '3'} fossil-db (Sienna Cichlid): Totals from 40 (0.03% of 134621) affected shaders: CodeSize: 62676 -> 65996 (+5.30%) Instrs: 11372 -> 12111 (+6.50%) Latency: 144122 -> 142848 (-0.88%); split: -1.09%, +0.21% InvThroughput: 19686 -> 19847 (+0.82%); split: -0.06%, +0.87% VClause: 304 -> 306 (+0.66%) SClause: 603 -> 604 (+0.17%); split: -0.83%, +1.00% Copies: 780 -> 858 (+10.00%) Branches: 235 -> 329 (+40.00%) PreSGPRs: 1072 -> 1083 (+1.03%); split: -0.37%, +1.40% Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14407>	2022-01-10 19:57:38 +00:00
Konstantin Seurer	aaa90c37e0	panvk: Fixed maxFragmentCombinedOutputResources Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>	2022-01-10 19:28:17 +00:00
Konstantin Seurer	651bec0971	turnip: Fixed maxFragmentCombinedOutputResources Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>	2022-01-10 19:28:17 +00:00
Konstantin Seurer	e0d590cafb	anv: Fixed maxFragmentCombinedOutputResources Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>	2022-01-10 19:28:17 +00:00
Konstantin Seurer	2b5cf84efd	lavapipe: Fixed maxFragmentCombinedOutputResources Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14320>	2022-01-10 19:28:17 +00:00
Rhys Perry	0f5d90c2a7	ac/nir: fix store_buffer_amd write_masks Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>	2022-01-10 19:01:04 +00:00
Rhys Perry	b00138090e	nir/lower_shader_calls: fix store_scratch write_mask Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14447>	2022-01-10 19:01:04 +00:00
Lucas Stach	d799a4be27	etnaviv: drm: defer destruction of softpin BOs When destroying a BO with a userspace managed address and thus freeing the VMA space, we need to make sure that the BO isn't in use by any active submit anymore, as the kernel will rightfully reject the next submit that re-uses the still active VMA. Keep the BO alive as long as it isn't fully idle to prevent the VMA being reused prematurely. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>	2022-01-10 16:49:00 +00:00
Lucas Stach	98a2049c08	etnaviv: drm: rename _etna_bo_del Rename it to a somewhat more descriptive name, which makes it easier to distinguish between the etna_bo_del function in the public interface and the internal function. Also remove the duplicated forward declaration and move it to the common internal header. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>	2022-01-10 16:49:00 +00:00
Lucas Stach	77ebbcbf9a	etnaviv: drm: export BO idle check function The ability to check if a BO is idle is not only useful in the buffer cache, but also in other parts of the winsys and even the pipe driver. Make this functionality available in the interface. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>	2022-01-10 16:49:00 +00:00
Lucas Stach	1b1f8592c0	etnaviv: drm: properly handle reviving BOs via a lookup If a BO is removed from a cache bucket list via a lookup, we must handle it in the same way as if an allocation from the cache happened: tell valgrind that the buffer is active again and take a reference to the etna_device, which the BO had given up while being in the cache. Cc: mesa-stable Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Tested-by: Guido Günther <agx@sigxcpu.org> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14159>	2022-01-10 16:49:00 +00:00
Lucas Stach	ccfd5054a4	etnaviv: drm: fix size limit in etna_cmd_stream_realloc The intended limit for command stream size is 64KB, as this is what old kernels can reliably do and what allows for maximum number of queued streams on newer kernels. However, due to unit confusion with the size member, which is in dwords, the submitted streams could grow up to ~128KB. Fix this by using the proper limit in dwords. Flushing due to some limits being exceeded is not an issue, but is expected with certain workloads, so lower the severity of the message being emitted in this case to debug level. Cc: mesa-stable Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14425>	2022-01-10 15:38:27 +00:00
Lucas Stach	22d796feb8	egl/wayland: break double/triple buffering feedback loops Currently we dispose any unneeded color buffers immediately if we detect that there are more unlocked buffers than we need. This can lead to feedback loops between the compositor and the application causing rapid toggling between double and triple buffering. Scenario: 2 buffers already queued to the compositor, egl/wayland allocates a new back buffer to avoid throttling, slowing down the frame. This allows the compositor to catch up and unlock both buffers. EGL detects that there are more buffers than currently needed, freeing the buffer, restarting the loop shortly after. To avoid wasting CPU time on rapidly freeing and reallocating color buffers break those feedback loops by letting the unneeded buffers sit around for a short while before disposing them. Signed-off-by: Lucas Stach <l.stach@pengutronix.de> Reviewed-by: Simon Ser <contact@emersion.fr> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14451>	2022-01-10 15:11:44 +00:00
Danylo Piliaiev	d77bfc117c	tu,ir3: Implement VK_KHR_shader_integer_dot_product - gen4 - has dp4acc and dp2acc, dp4acc is used to implement 4x8 dot product. - gen3 - has dp2acc, in OpenCL blob uses dp2acc for dot product on both gen3 and gen4. - gen2 - unknown, lower everything. - gen1 - no dp2acc, lower everything. OpenCL blob doesn't advertise cl_qcom_dot_product8 but still generates code for it. The assembly is more verbose and uses yet to be documented mad32.u16 instruction. Passes: dEQP-VK.spirv_assembly.instruction.compute.opsdotkhr.* dEQP-VK.spirv_assembly.instruction.compute.opudotkhr.* dEQP-VK.spirv_assembly.instruction.compute.opsudotkhr.* dEQP-VK.spirv_assembly.instruction.compute.opsdotaccsatkhr.* dEQP-VK.spirv_assembly.instruction.compute.opudotaccsatkhr.* dEQP-VK.spirv_assembly.instruction.compute.opsudotaccsatkhr.* Only packed 4x8 unsigned and mixed versions are accelerated. However in theory we should be able to do better for signed version than current NIR lowering. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>	2022-01-10 13:21:24 +02:00
Danylo Piliaiev	e1f89a1da2	ir3: Make nir compiler options a part of ir3_compiler This would allow for sub-gens to have different options. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>	2022-01-10 13:20:39 +02:00
Danylo Piliaiev	b8d486f298	nir/algebraic: Separate has_dot_4x8 into has_sdot_4x8 and has_udot_4x8 Adreno GPUs have native instructions for unsigned and mixed dot_4x8 but not for the signed dot product. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Emma Anholt <emma@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>	2022-01-10 13:20:39 +02:00
Danylo Piliaiev	c1d5c318bc	ir3: New cat3 instructions * shrm - (src2 >> src1) & src3 * shlm - (src2 << src1) & src3 * shrg - (src2 >> src1) | src3 * shlg - (src2 << src1) | src3 * andg - (src2 & src1) | src3 * dp2acc - dot product of two {i,u}8vec2 packed into SRC1 and SRC2, added to 32b SRC3 * dp4acc - dot product of two {i,u}8vec4 packed into SRC1 and SRC2, added to 32b SRC3 * wmm - vec4(x_1, x_2, x_3, x_4) * (y_1 + y_2 + y_3 + y_4), which is duplicated (1 << (SRC3 / 32)) times starting from DST register * wmm.accu - same as wmm but result is added to DST registers, however the first reg in each vec4 result is overwritten instead of accumulating. Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13986>	2022-01-10 13:20:39 +02:00
Connor Abbott	c45c6e36eb	tu: Implement VK_EXT_subgroup_size_control Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>	2022-01-10 10:58:28 +00:00
Connor Abbott	1a1e25dcce	tu, ir3: Support runtime gl_SubgroupSize in FS We already supported it in the CS for computing the subgroup ID, but soon we'll need it in the FS too. Vertex stages will always have it lowered. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>	2022-01-10 10:58:28 +00:00
Connor Abbott	e6e34883a9	ir3: Add wavesize control This allows the wavesize to be controlled per-shader. This will be used by VK_EXT_subgroup_size_control, and freedreno will also need it if legacy ARB_shader_ballot is to be supported (since it forces a wavesize of 64 or less). Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>	2022-01-10 10:58:28 +00:00
Connor Abbott	30237b3d9c	ir3: Pass shader to ir3_nir_post_finalize() We'll need to add shader-specific lowering for gl_SubgroupSize. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>	2022-01-10 10:58:28 +00:00
Connor Abbott	9ebc48005c	ir3, freedreno: Add options struct for ir3_shader_from_nir() We'll expand this in a moment. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13960>	2022-01-10 10:58:28 +00:00
Danylo Piliaiev	fe9c9ec83f	tu: fix workaround for depth bounds test without depth test Fixes: `bb4db22ff4` ("turnip: apply workaround for depth bounds test without depth test") Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14390>	2022-01-10 09:36:59 +00:00
Lionel Landwerlin	07bc6b7ed9	anv: limit compiler valid color outputs using NIR variables This fixes a test from the vkd3d-proton test_dual_source_blending_dxbc test which asserts in the backend with: brw_fs_visitor.cpp:716: void fs_visitor::emit_fb_writes(): Assertion `!prog_data->dual_src_blend || key->nr_color_regions == 1' failed. This is because there are 2 color attachments provided by the renderpass so we initially set nr_color_regions = 2. But once we've parsed the shader, we can see it's only using one output (with dual source color blending). This change looks at the output variables to update the valid output variables. Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Tapani Pälli <tapani.palli@intel.com> Cc: mesa-stable Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14417>	2022-01-10 09:38:32 +02:00
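
Commits 26c533f167, b920ace4bc, 6f07159a1d and b550b3c89c above all concern the pixels-to-blocks unit conversion for block-compressed formats. The following is only a minimal sketch of that conversion, not the actual u_box_pixels_to_blocks helper: the types `box2d` and `block_dims` and the function names are hypothetical stand-ins, and the box origin is assumed to be block-aligned.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 2D box; not Gallium's real struct. */
struct box2d {
   int32_t x, y;          /* origin */
   int32_t width, height; /* extent */
};

/* Hypothetical description of a format's block size in pixels,
 * e.g. 4x4 for ETC1, 1x1 for uncompressed formats. */
struct block_dims {
   int32_t width, height;
};

/* Integer division rounding up, for positive operands. */
static int32_t
div_round_up(int32_t n, int32_t d)
{
   return (n + d - 1) / d;
}

/* Convert a box expressed in pixels into one expressed in blocks.
 * The origin is divided (rounding down) and the extent is divided
 * rounding up, so a partially covered block still counts.
 * Assumes the origin is aligned to the block size. */
static struct box2d
pixels_to_blocks(struct box2d pixels, struct block_dims block)
{
   struct box2d blocks;

   blocks.x = pixels.x / block.width;
   blocks.y = pixels.y / block.height;
   blocks.width = div_round_up(pixels.width, block.width);
   blocks.height = div_round_up(pixels.height, block.height);
   return blocks;
}
```

With a 1x1 block size the conversion is the identity, which is why code that mixes up the two units appears to work until a block-compressed format is involved.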

148874 Commits