KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Mike Blumenkrantz	bc5dcf1527	zink: ci updates Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9291>	2021-03-03 01:37:02 +00:00
Mike Blumenkrantz	587d15ca6c	zink: use staging resource for write transfer_map in order to not stall we can just give the user a staging resource and then flush the data back later Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9291>	2021-03-03 01:37:02 +00:00
Marek Olšák	db67d9c0d1	radeonsi: don't crash on NULL images in si_check_needs_implicit_sync This fixes CTS test: KHR-GL46.arrays_of_arrays_gl.AtomicUsage Fixes: `bddc0e023c` "radeonsi: fix read from compute / write from draw sync" Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9361>	2021-03-03 01:19:24 +00:00
Marek Olšák	f9e6c7a220	ac/llvm: fix ac_build_atomic_rmw with LLVM 13 Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4383 Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9361>	2021-03-03 01:19:24 +00:00
Eric Anholt	8bd0cc1a5a	nir/vec_to_movs: Don't generate MOVs for undef channels. This appeared in softpipe's image operations, since NIR always uses 4-component values for the coords, while the GLSL IR only has 2 components for a 2D image (for example). arb_shader_image_load_store-shader-mem-barrier (which times out in CI and spends its time inside of tgsi_exec) was spending 4/51 of its instructions on moving these undefs around. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9345>	2021-03-03 00:51:44 +00:00
Eric Anholt	1e5ef4c60c	nir: Add a nir_src_is_undef() helper, like nir_src_is_const(). Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9345>	2021-03-03 00:51:44 +00:00
Mike Blumenkrantz	c77df59c9e	zink: export PIPE_CAP_TGSI_VS_LAYER_VIEWPORT Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9283>	2021-03-02 17:42:00 -05:00
Mike Blumenkrantz	ffd046cf32	zink: enable PIPE_CAP_CLEAR_SCISSORED Acked-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9283>	2021-03-02 17:42:00 -05:00
Dave Airlie	abc724e440	lavapipe: sort bindings before creating descriptor set This ensures the dynamic offsets are correct Fixes: `b38879f8c5` ("vallium: initial import of the vulkan frontend") Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9359>	2021-03-03 08:06:02 +10:00
Dave Airlie	0a939e788f	lavapipe: reorder descriptor set stages to get correct binding The fragment stage was in the wrong place here. Fixes: `b38879f8c5` ("vallium: initial import of the vulkan frontend") Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9359>	2021-03-03 08:02:16 +10:00
Ian Romanick	7ca3e90c18	gallium/dri: Remove dri2_format_mapping::cpp I was suspicious that some entries in dri2_format_table (in dri_helpers.c) had this field set incorrectly. It seemed like DRM_FORMAT_ABGR16161616F and DRM_FORMAT_XBGR16161616F should have been 8 instead of 4. Upon digging I found that nothing uses the field. Fix code by removing it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9354>	2021-03-02 19:42:04 +00:00
Karol Herbst	f0dccd9578	clover: Add missing include for llvm-12 build fix Fixes: `d1eab2b1eb` ("clover: Fix build with llvm-12.") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9372>	2021-03-02 19:35:40 +00:00
Mike Blumenkrantz	1294aec650	zink: apply only the pending zs clear bits during deferred clears both bits will have been flagged at this point in order to indicate that the aspects will be cleared "at some point" during the loop, but when actually iterating through the pending clears, only the bits set in the clear call should be applied Fixes: `5c629e9ff2` ("zink: defer pipe_context::clear calls when not currently in a renderpass") Reviewed-by: Dave Airlie <airlied@redhat.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9366>	2021-03-02 19:24:52 +00:00
Axel Davy	e891f039da	st/nine: Simplify checks for driconf options Remove the useless driCheckOption calls. They always succeed. As a result the intended behaviour for thread_submit was not working (different default depending on the gpu used). Add a comment to fix that in the future. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:08 +01:00
Axel Davy	642e19dc44	driconf: Rename csmt_int back to csmt_force Fixes regression introduced by <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6916> Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	7a1a1fc5d9	st/nine: Fix leak at device destruction At the release of the last object holding a reference on the device, the device dtor was executed and the object dtor was ignored. The proper way is to execute the object dtor, then the device dtor. The previous code was likely a workaround for something that has since been fixed. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	d730f8d7a9	st/nine: Protect PrivateData also for Volumes PrivateData functions were not protected by a mutex for Volumes whereas they definitely should. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	b383b1e01a	st/nine: Refactor ht_guid_delete Have ht_guid_delete take a hash_entry. As a result, we can use _mesa_hash_table_remove instead of _mesa_hash_table_remove_key. The previous code using the latter was incorrect as the key of the entry was read after it was freed. Fixes: https://github.com/iXit/wine-nine-standalone/issues/40 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	501ad0e134	st/nine: Add new debug and error checks Add new debug messages and error checks Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	1a53099909	st/nine: Enable DF24 support We can enable it, now that FETCH4 is implemented. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	1357d2a60a	st/nine: Implement experimental FETCH4 FETCH4 is a feature that needs to be implemented to advertise D3DFMT_DF24. It's basically a variant of Gather4. This first implementation will need to be completed to implement the feature fully, but the feature doesn't seem to be much used (other equivalent features are preferred by games). Note until DF24 is advertised, apps are not supposed to use FETCH4. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	d097bdcc78	st/nine: Track formats compatible with FETCH4 FETCH4 is a d3d9 extension not much used, as newer ones were preferred. However its support is required to advertise the DF24 format. Prepares support by tracking compatible formats. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	6a3451e170	st/nine: Unmap buffers after full unlock Do not unmap anything until all buffer unlocks were received. A buffer can be filled in several threads, and thus in the case of double locks, it's not possible to know which unlock is received first. Thus only unmap the buffers when the last unlock is received. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	3dd6b79215	st/nine: Clamp GetAvailableTextureMem Previously we used to clamp "available_texture_limit", which was incorrect. "available_texture_mem" should have been clamped instead. The resulting code was noop. The idea behind that code was that 32 bits executable would see maximum 4GB video memory. However it seems according to users that 32 bits apps should be able to allocate more than 4GB, thus the clamping is inappropriate. Instead clamp the return of GetAvailableTextureMem, to correctly report a high value when there is more than 4GB available. I do not know what should exactly be the clamp value, for now have a 64MB margin below UINT_MAX. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	f85f025a05	st/nine: Do not allow depth buffer render targets Without the proposed check, some apps will decide to use depth buffers as render targets. Bug found investigating: https://github.com/iXit/wine-nine-standalone/issues/82 Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Axel Davy	3dbc542f97	st/nine: Reduce system memory allocated by D3DUSAGE_AUTOGENMIPMAP For D3DUSAGE_AUTOGENMIPMAP, everything basically behaves for the application as if the texture had one level. However the pipe_resource has more levels, and those get generated automatically. Previously we did allocate all the Surfaces as if the texture had all the levels, instead of just one. The app could still just access the first level. This patch completely removes the useless inaccessible Surfaces. In addition removes redundant handling of D3DUSAGE_AUTOGENMIPMAP. Signed-off-by: Axel Davy <davyaxel0@gmail.com> Acked-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9177>	2021-03-02 20:07:07 +01:00
Gert Wollny	ec74a13618	r600/sfn: Update status Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	43816d20dd	r600: Enable GLSL 450 for nir shaders. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	4d91812d3c	r600: Don't optimize using source modifiers on literals The code improvement is limited and it interferes with using literals directly in LDS index ops, since here source modifiers are not supported, but the current assembler code might inject the modifiers. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	49b0e8657e	r600/sfn: Fix loading TES gl_PatchVerticesIn Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	bd57bf6d82	r600/sfn: handle querying the number of layers in cube arrays This has to be loaded from a constant buffer instead of the actual texture. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	935d9e6863	nir: disaallow reordering for r600 shared load and remove component field The original shared load op can't be reordered, so it might be better to also not allow this for the lowered variant. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Acked-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	d1ccf4a0ee	r600/sfn: encode component in address for local IO The backend code was actually assuming this, but the lowering still set the components and write masks like it would be honoured. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	c0c025c870	r600/sfn: remove some old debug output Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	b07992c4dc	r600/sfn: remove unused emit_alu_op2_split_src_mods Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	ddc5c99402	r600/sfn: remove code for nir_op_fsign since it is lowered Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	4fe0339941	r600: unify nir shader options evaluation Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	04d8d455b7	r600/sfn: Allow any channel for the helper invocation evaluation Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	911c6af2fd	r600/sfn: lower isign and iabs in nir Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	7d94d759fa	r600/sfn: set info about using helper_invocation to skip sb sb can't handle helper invocations, so skip sb when it is used. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Gert Wollny	c427ed7ffe	r600/sfn: Lower FS inputs to temps late and, and lower interpolate at This fixes FS shaders where a var is loaded with two different interpolators. Signed-off-by: Gert Wollny <gert.wollny@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>	2021-03-02 18:46:17 +01:00
Jose Fonseca	3ba7784b1e	util: Always use timespec_get on Windows. include/c11/threads_win32.h provides a fallback implementation of timespec_get when necessary. Fixes https://gitlab.freedesktop.org/mesa/mesa/-/issues/4109 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9280>	2021-03-02 14:37:46 +00:00
Rhys Perry	3a72044ece	aco: add missing usable_read2 check A Hitman 2 shader does: read64(local_invocation_index() * 4 - 4). This was likely emitting a ds_read2_b32 on GFX6. For local_invocation_index()=0, because the first dword was out-of-bounds, the second was likely also considered out-of-bounds (even though it's not, at offset 0). Likely fixes https://gitlab.freedesktop.org/mesa/mesa/-/issues/3882 Signed-off-by: Rhys Perry <pendingchaos02@gmail.com> Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Fixes: `57e6886f98` ("aco: refactor load_lds to use new helpers") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332>	2021-03-02 13:13:59 +00:00
Rhys Perry	941739619e	Revert "radv,aco: allow unaligned LDS access on GFX9+" This reverts commit `1a0b0e8460`. The bounds checking behaviour of ds_read_b64, ds_read_b96 and ds_read_b128 make this feature very difficult to use safely. This fixes a blocking artifact in Hitman 2. Previously, it contained: ds_read_b64(local_invocation_index() * 4 - 4) For local_invocation_index()=0, the second dword would be considered out-of-bounds, even though it's at offset 0. Reviewed-by: Daniel Schürmann <daniel@schuermann.dev> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9332>	2021-03-02 13:13:59 +00:00
Iago Toral Quiroga	acbd4881c2	broadcom/compiler: ldvary pipelining tracking and documentation clean-ups Now that we can pipeline all varyings we should not be referring specifically to smooth varyings anywhere. Also, rename the instruction field 'ldvary_pipelining' to 'is_ldvary_sequence', which is more appropriate, since we always set this for any instruction involved with varying setups, independently of whether they end up being pipelined or not. This also does some other minor edits which intend to slightly simplify the code and make it a bit more compact. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9363>	2021-03-02 13:54:14 +01:00
Kenneth Graunke	a48151ffad	glsl/float64: Bump #version to 400 An earlier commit tried to make this shader compatible with GLSL 3.30, but it requires GL_ARB_gpu_shader_int64, which requires GLSL 4.00 and GL 4.0 according to the extension spec. So we were failing to enable the required extension, breaking compilation of this shader. The original intention of that patch was to get this working on zink, which at the time only supported GL 3.3. But now it supports later OpenGL versions, so we don't need to do this any longer. Rather than revert the patch and raise the version all the way back to 430, just bump it to the required 400 at Ian Romanick's suggestion. Fixes: `4d47b22bf0` ("glsl/float64: make this compatible with glsl 330") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3991 Reviewed-by: Ian Romanick <ian.d.romanick@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9351>	2021-03-02 09:30:24 +00:00
Karol Herbst	d1eab2b1eb	clover: Fix build with llvm-12. Fix build error after LLVM commit c495dfe0268b ("[clang][cli] NFC: Decrease the scope of ParseLangArgs parameters"). ../src/gallium/frontends/clover/llvm/invocation.cpp: In function ‘std::unique_ptr<clang::CompilerInstance> {anonymous}::create_compiler_instance(const clover::device&, const string&, const std::vector<std::__cxx11::basic_string<char> >&, std::string&)’: ../src/gallium/frontends/clover/llvm/invocation.cpp:252:55: error: cannot convert ‘clang::PreprocessorOptions’ to ‘std::vector<std::__cxx11::basic_string<char> >&’ 252 \| c->getPreprocessorOpts(), \| ~~~~~~~~~~~~~~~~~~~~~~^~ \| \| \| clang::PreprocessorOptions Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4114 Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Francisco Jerez <currojerez@riseup.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8543>	2021-03-02 09:16:53 +00:00
Iago Toral Quiroga	05f8efbc2c	broadcom/compiler: allow pipelining of flat and noperspective varyings These end up having a NOP between the ldvary and the next instruction in the sequence (a MOV for flat and an FADD for noperspetive): nop ; nop ; ldvary.r0 nop ; nop fadd rf6, r0, r5 ; nop ; ldvary.r1 nop ; nop fadd rf5, r1, r5 ; nop ; ldvary.r2 nop ; nop fadd rf4, r2, r5 ; nop ; ldvary.r3 To pipeline these, we can reuse the same infrastructure we have in place for smooth varyings but we need to avoid breaking the sequence due to the NOP instruction. We do that by testing if dropping the sequence when we failed to pick up the next instruction also fails to choose an instruction. This is not perfect, because we may be able to choose an instruction outside the sequence such as an ldunif, and use that to break a sequence that we could otherwise continue after scheduling the NOP instruction, but it is still better than nothing. total instructions in shared programs: 13820690 -> 13819774 (<.01%) instructions in affected programs: 64026 -> 63110 (-1.43%) helped: 479 HURT: 62 Instructions are helped. total max-temps in shared programs: 2326435 -> 2326423 (<.01%) max-temps in affected programs: 102 -> 90 (-11.76%) helped: 7 HURT: 0 Max-temps are helped. total sfu-stalls in shared programs: 30683 -> 30710 (0.09%) sfu-stalls in affected programs: 13 -> 40 (207.69%) helped: 2 HURT: 24 Sfu-stalls are HURT. total inst-and-stalls in shared programs: 13851373 -> 13850484 (<.01%) inst-and-stalls in affected programs: 62818 -> 61929 (-1.42%) helped: 466 HURT: 65 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9304>	2021-03-02 07:56:00 +01:00
Iago Toral Quiroga	1784dd22a3	broadcom/compiler: pipeline smooth ldvary sequences Typically, we would schedule smooth varyings like this: nop ; nop ; ldvary.r4 nop ; fmul r0, r4, rf0 fadd rf13, r0, r5 ; nop ; ldvary.r1 nop ; fmul r2, r1, rf0 fadd rf12, r2, r5 ; nop ; ldvary.r3 nop ; fmul r4, r3, rf0 fadd rf11, r4, r5 ; nop ; ldvary.r0 where we pair up an ldvary with the fadd of the previous sequence instead of the previous fmul. This is because ldvary has an implicit write to r5 which is read by the fadd of the previous sequence, so our dependency tracking doesn't allow us to move the ldvary before the fadd, however, the r5 write of the ldvary instruction happens in the instruction after it is emitted so we can actually move it to the fmul and the r5 write would still happen in the same instruction as the fadd, which is fine. This patch allows us to pipeline these sequences optimally. For that, after merging an ldvary into a previous instruction in the middle of a pipelineable ldvary sequence, we check if we can manually move it to the last scheduled instruction instead (the one before the instruction we are currently scheduling). If we are successful at moving the ldvary to the previous instruction, then we flag the ldvary as scheduled immediately, which may promote its children (the follow-up fmul instruction for that ldvary) to DAG heads and continue the merge loop so that fmul can be picked and merged into the final fadd of the previous sequence (where we had originally merged the ldvary). This leads to a result that looks like this: nop ; nop ; ldvary.r4 nop ; fmul r0, r4, rf0 ; ldvary.r1 fadd rf13, r0, r5 ; fmul r2, r1, rf0 ; ldvary.r3 fadd rf12, r2, r5 ; fmul r4, r3, rf0 ; ldvary.r0 Shader-db results: total instructions in shared programs: 14071591 -> 13820690 (-1.78%) instructions in affected programs: 7809692 -> 7558791 (-3.21%) helped: 41209 HURT: 4528 Instructions are helped. total max-temps in shared programs: 2335784 -> 2326435 (-0.40%) max-temps in affected programs: 84302 -> 74953 (-11.09%) helped: 4561 HURT: 293 Max-temps are helped. total sfu-stalls in shared programs: 31537 -> 30683 (-2.71%) sfu-stalls in affected programs: 3551 -> 2697 (-24.05%) helped: 1713 HURT: 750 Sfu-stalls are helped. total inst-and-stalls in shared programs: 14103128 -> 13851373 (-1.79%) inst-and-stalls in affected programs: 7820726 -> 7568971 (-3.22%) helped: 41411 HURT: 4535 Inst-and-stalls are helped. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9304>	2021-03-02 07:56:00 +01:00
Iago Toral Quiroga	1d021539a2	broadcom/compiler: track pipelineable ldvary sequences If we have two (or more) smooth varyings like this: nop t3; ldvary.rf0 fmul t5, t3, t0 fadd t6, t5, r5 nop t7; ldvary.rf0 fmul t9, t7, t0 fadd t10, t9, r5 nop t11; ldvary.rf0 fmul t13, t11, t0 fadd t14, t13, r5 We may be able to pipeline them like this: nop ; nop ; ldvary.r4 nop ; fmul r0, r4, rf0 ; ldvary.r1 fadd rf13, r0, r5 ; fmul r2, r1, rf0 ; ldvary.r3 fadd rf12, r2, r5 ; fmul r4, r3, rf0 ; ldvary.r0 But in order to do this, we will need to manually tweak the QPU scheduling. This patch tracks information about ldvary sequences that are good candidates for pipelining, and a follow-up patch will use this information to pipeline them when we emit the QPU code. v2 (apinheiro): - Rename the v3d_compile fields to avoid confusion with the qinst fields. - Assert that a sequence's start instruction is not the same as the end. Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9304>	2021-03-02 07:56:00 +01:00
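
Commit 3dd6b79215 in the list above describes clamping the value returned by GetAvailableTextureMem so that it stays a 64MB margin below UINT_MAX when more than 4GB is available. The following is a minimal sketch of that arithmetic only, not the actual st/nine code: the helper name `clamp_available_texture_mem` and the macro `TEXTURE_MEM_MARGIN` are hypothetical, and the sketch assumes the available amount is tracked as a 64-bit byte count.

```c
#include <limits.h>
#include <stdint.h>

/* Assumption: 64MB margin below UINT_MAX, as mentioned in the commit message. */
#define TEXTURE_MEM_MARGIN (64u << 20)

/* Hypothetical helper: reduce a 64-bit byte count to the unsigned value
 * that would be reported to the application, capped at
 * UINT_MAX - TEXTURE_MEM_MARGIN. */
static unsigned clamp_available_texture_mem(uint64_t available_bytes)
{
    const uint64_t limit = (uint64_t)UINT_MAX - TEXTURE_MEM_MARGIN;

    return (unsigned)(available_bytes < limit ? available_bytes : limit);
}
```

Under these assumptions, a device with 1GB available reports 1GB unchanged, while a device with 5GB available reports the cap instead of a value that has wrapped around in 32 bits.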


136212 Commits
