KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Dave Airlie	2c0a078fdb	llvmpipe: fix multisample lines. This also needs another lines fix, but at least align the code with tri and points Cc: "20.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>	2020-11-24 06:50:34 +10:00
Dave Airlie	d932720ff7	llvmpipe: fix multisample point rendering. Fixes one case in dEQP-VK.rasterization.primitives_multisample_4_bit.no_stipple.points Cc: "20.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>	2020-11-24 06:50:31 +10:00
Dave Airlie	2ed54033de	llvmpipe/setup: move point stats collection earlier. You have to count the stats pre-culling here. Just like `dc261cdd42` did for lines. VK-GL-CTS dEQP-VK.query_pool.statistics_query.clipping_primitives*point_list Cc: <mesa-stable@lists.freedesktop.org> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>	2020-11-24 06:50:28 +10:00
Dave Airlie	f246456538	lavapipe: fix wsi acquire fences Fixes: dEQP-VK.wsi.xcb.swapchain.acquire.too_many Cc: "20.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>	2020-11-24 06:50:24 +10:00
Dave Airlie	0d90c7cbc4	lavapipe: fixup device allocate + enable private data I'd only half ported private memory support, finish the job. Cc: "20.3" <mesa-stable@lists.freedesktop.org> Reviewed-by: Adam Jackson <ajax@redhat.com> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>	2020-11-24 06:50:21 +10:00
Erik Faye-Lund	2ac396e2e5	zink: fix layered resolves Until recently, we ended up using u_blitter here, because info->render_condition_enable was always true here. But when we recently fixed that overly broad check, this broke. So let's fix layered-resolves, by actually checking if the resource has layers respect them in that case, similar to what we do in blit_native. Fixes: `19906022e2` ("zink: more accurately track supported blits") Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3843 Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7737>	2020-11-23 19:35:40 +00:00
Dylan Baker	989877365d	release-calender: Update 20.3 I've been forgetting to remove completed rc's Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7739>	2020-11-23 19:32:06 +00:00
Dylan Baker	f60fabc38f	docs: update calendar and link releases notes for 20.2.3 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7739>	2020-11-23 19:32:06 +00:00
Dylan Baker	9c2e8a8f90	docs: Add relnotes for 20.2.3 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7739>	2020-11-23 19:32:06 +00:00
Dylan Baker	ad2b120087	docs: add release notes for 20.2.3 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7739>	2020-11-23 19:32:06 +00:00
Samuel Pitoiset	8e961b91c3	aco: optimize v_add+v_lshlrev to v_mad_u32_u24 on GFX6-8 This optimizes v_add(c, v_lshlrev(a, b)) to v_mad_u32_u24(b, 1<<a, c) if 'a' is a constant (less than or equal to 6 to avoid creating literals) and 'b' known to be a 16-bit or a 24-bit value. On GFX9+, this is already optimized to v_lshl_add_u32. No fossils-db changes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>	2020-11-23 18:34:40 +00:00
Samuel Pitoiset	d9e4504b0d	aco: optimize v_add+s_lshl to v_mad_u32_u24 on GFX6-8 This optimizes v_add(c, s_lshl(a, b)) to v_mad_u32_u24(a, 1<<b, c) if 'b' is a constant (less than or equal to 6 to avoid creating literals) and 'a' known to be a 16-bit or a 24-bit value. On GFX9+, this is already optimized to v_lshl_add_u32. fossils-db (Polaris10): Totals from 1916 (1.36% of 140385) affected shaders: SGPRs: 88322 -> 87780 (-0.61%); split: -0.66%, +0.05% CodeSize: 7852668 -> 7851800 (-0.01%); split: -0.01%, +0.00% Instrs: 1533965 -> 1530459 (-0.23%); split: -0.23%, +0.00% Cycles: 57001852 -> 56983244 (-0.03%); split: -0.03%, +0.00% VMEM: 372561 -> 371733 (-0.22%); split: +0.03%, -0.25% SMEM: 108859 -> 103711 (-4.73%); split: +0.23%, -4.96% VClause: 37231 -> 37204 (-0.07%) SClause: 58116 -> 58086 (-0.05%); split: -0.06%, +0.01% Copies: 199953 -> 199931 (-0.01%); split: -0.03%, +0.02% Branches: 63478 -> 63477 (-0.00%) PreSGPRs: 61818 -> 61816 (-0.00%) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>	2020-11-23 18:34:40 +00:00
Samuel Pitoiset	eaef1f2127	aco: allow to use the range analysis UB in emit_{sop2,vop2}_instruction() It will allow to combine v_add+s_lshl or v_add+v_lshlrev to v_mad_u32_u24 on GFX6-8 if operands are known to be 16-bit or 24-bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>	2020-11-23 18:34:40 +00:00
Samuel Pitoiset	be600b009a	aco: add a new Operand flag to indicate that is 24-bit To indicate that the upper 8-bits are always 0 to optimize more MADs. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>	2020-11-23 18:34:40 +00:00
Samuel Pitoiset	05fd780012	aco/tests: extend the optimize.add_lshl tests to GFX8 Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Rhys Perry <pendingchaos02@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>	2020-11-23 18:34:40 +00:00
Samuel Pitoiset	cd59c22325	ac,radv: use better export formats for 8-bit when RB+ isn't allowed When RB+ is enabled, R8_UINT/R8_SINT/R8_UNORM should use FP16_ABGR for 2x exporting performance. Otherwise, use 32_R to remove useless instructions needed for 16-bit compressed exports. fossils-db (Vega10): Totals from 8858 (6.35% of 139517) affected shaders: SGPRs: 801248 -> 801210 (-0.00%); split: -0.01%, +0.00% VGPRs: 596224 -> 596120 (-0.02%); split: -0.02%, +0.01% CodeSize: 71462452 -> 71356684 (-0.15%); split: -0.15%, +0.00% MaxWaves: 37097 -> 37105 (+0.02%); split: +0.04%, -0.02% Instrs: 13963177 -> 13950809 (-0.09%); split: -0.09%, +0.00% Cycles: 1476539360 -> 1476489996 (-0.00%); split: -0.00%, +0.00% VMEM: 2363008 -> 2361349 (-0.07%); split: +0.04%, -0.11% SMEM: 550362 -> 549977 (-0.07%); split: +0.01%, -0.08% VClause: 245704 -> 245727 (+0.01%); split: -0.01%, +0.02% SClause: 485161 -> 485104 (-0.01%); split: -0.01%, +0.00% Copies: 1420034 -> 1422310 (+0.16%); split: -0.01%, +0.17% Branches: 518710 -> 518705 (-0.00%) PreSGPRs: 706633 -> 706584 (-0.01%) PreVGPRs: 547163 -> 547007 (-0.03%); split: -0.03%, +0.01% Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7512>	2020-11-23 17:54:16 +00:00
Samuel Pitoiset	684531fd37	radv: add new vk_format_is_*() helpers I think we should make RADV uses util_format everywhere. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7512>	2020-11-23 17:54:16 +00:00
Dylan Baker	a5227465c1	meson: use a feature option for microsoft-clc It's less code and makes the configuration easier to fine tune. Fixes: `ff05da7f8d` ("microsoft: Add CLC frontend and kernel/compute support to DXIL converter") Reviewed-by: Jesse Natalie <jenatali@microsoft.com> Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7699>	2020-11-23 17:31:55 +00:00
Dylan Baker	7ca4a478ad	meson: Don't add extra values to shader-cache We're trying to move to using a feature here, adding more values breaks that. Fixes: `5de56937a3` ("disk_cache: build option for disabled-by-default") Reviewed-by: Eric Engestrom <eric@engestrom.ch> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7699>	2020-11-23 17:31:55 +00:00
Rob Clark	a92f597b98	freedreno/ir3: Fix valgrind complaint about streamout state The warning is a bit misleading about where it shows up.. it complains about the shader key, due to shader key being calculated from (among other things) stream_output state that had some uninitialized garbage in the padding. ==84572== Uninitialised byte(s) found during client check request ==84572== at 0x60548E8: blob_write_bytes (blob.c:163) ==84572== by 0x6534EF7: compute_variant_key (ir3_disk_cache.c:111) ==84572== by 0x6535143: ir3_disk_cache_retrieve (ir3_disk_cache.c:171) ==84572== by 0x654D82F: create_variant (ir3_shader.c:251) ==84572== by 0x654DA2B: ir3_shader_get_variant (ir3_shader.c:301) ==84572== by 0x645B2CB: ir3_shader_variant (ir3_gallium.c:113) ==84572== by 0x645B7EB: ir3_shader_create (ir3_gallium.c:219) ==84572== by 0x645BAA7: ir3_shader_state_create (ir3_gallium.c:285) ==84572== by 0x6506003: fd6_shader_state_create (fd6_program.c:1136) ==84572== by 0x64676C7: assemble_tgsi (freedreno_program.c:105) ==84572== by 0x64679DF: fd_prog_init (freedreno_program.c:188) ==84572== by 0x6506157: fd6_prog_init (fd6_program.c:1172) ==84572== Address 0xeff1588 is 424 bytes inside a block of size 480 alloc'd ==84572== at 0x4866FA4: malloc (vg_replace_malloc.c:307) ==84572== by 0x605D46F: ralloc_size (ralloc.c:133) ==84572== by 0x605D52F: rzalloc_size (ralloc.c:166) ==84572== by 0x654DFF7: ir3_shader_from_nir (ir3_shader.c:473) ==84572== by 0x645B6C7: ir3_shader_create (ir3_gallium.c:182) ==84572== by 0x645BAA7: ir3_shader_state_create (ir3_gallium.c:285) ==84572== by 0x6506003: fd6_shader_state_create (fd6_program.c:1136) ==84572== by 0x64676C7: assemble_tgsi (freedreno_program.c:105) ==84572== by 0x64679DF: fd_prog_init (freedreno_program.c:188) ==84572== by 0x6506157: fd6_prog_init (fd6_program.c:1172) ==84572== by 0x64CB36F: fd6_context_create (fd6_context.c:154) ==84572== by 0x59D93BB: st_api_create_context (st_manager.c:917) Somehow this was showing up with dEQP-GLES31.info.vendor but not other things. Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>	2020-11-23 16:04:52 +00:00
Rob Clark	9de6a601ce	freedreno/drm: Quiet timedout error msg This isn't terribly interesting, but got more chatty when we converted to mesa_loge() vs debug_printf() Fixes: `156d7e45f7` ("freedreno: Convert to mesa_log*()") Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>	2020-11-23 16:04:52 +00:00
Rob Clark	98d182fd46	freedreno/a6xx: Clear control mem at context create We could be getting a recycled bo containing random garbage, which can confuse check_vsc_overflow(). Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>	2020-11-23 16:04:52 +00:00
Rob Clark	150a914a78	freedreno: Convert one last mtx_t -> simple_mtx_t Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>	2020-11-23 16:04:52 +00:00
Rob Clark	8651cfbbf0	freedreno: emit_marker() cleanup 1) Propagate the change to only emit markers in debug builds (and add the WFI that ensures they are synchronized with GPU. We could consider dropping them entirely, since the GPU devcoredump support in newer kernels is more useful. But it is still an occasionally useful fallback. 2) Use p_atomic_inc_return() to placate helgrind Signed-off-by: Rob Clark <robdclark@chromium.org> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>	2020-11-23 16:04:52 +00:00
Lionel Landwerlin	b039e03f55	mesa: add an environment variable to default enable INTEL_blackhole Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7640>	2020-11-23 11:56:48 +00:00
Lionel Landwerlin	f5610d9949	st: trigger noop if the default value is not true v2: Verify that PIPE_CAP_FRONTEND_NOOP is available before calling vfunc (Icecream95) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1) Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7640>	2020-11-23 11:56:48 +00:00
Connor Abbott	76ade57fa6	ir3/ra: Fix array reg liveness in scalar pass Assigning an array reg removes IR3_REG_ARRAY, which means that definitions and uses can't be tracked back to the array register's name and liveness for the components of the array aren't correctly calculated. To fix this we delay assigning array registers until the scalar pass. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7711>	2020-11-23 11:33:13 +00:00
Samuel Pitoiset	88b5a2b80b	nir: fix gathering cross invocation info Fixes: `5b77b14448` ("nir: Use src_is_invocation_id in get_deref_info.") Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Timur Kristóf <timur.kristof@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7730>	2020-11-23 11:00:17 +00:00
jzielins	79bd8edd87	swr: Pass draw start information to state update mechanism This fixes crash in many workloads/tests Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7728>	2020-11-23 10:15:28 +00:00
Samuel Pitoiset	c83cc49f6b	ci: fix name of the Sienna Cichlid expected failures file Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7729>	2020-11-23 10:15:05 +01:00
Alejandro Piñeiro	ce5c23eb00	v3dv/cmd_buffer: missing (uint8_t ) casting when calling memcmp Caused to return early wrongly on CmdPushConstants with some tests using several calls to that method. As we are here we are also replacing the (void ) casting at the memcpy below. Fixes: `e1c8041cde` ("v3dv: try harder to skip emission of redundant state") Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7718>	2020-11-23 09:51:24 +01:00
Samuel Pitoiset	14ec91b131	radv: dump BO ranges into bo_ranges.log instead of stderr Like other dumps during GPU hang detection. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>	2020-11-23 08:44:54 +01:00
Samuel Pitoiset	4ffa6acb0d	radv: add RADV_DEBUG=noumr to disable UMR logs during GPU hang detection Sometimes UMR logs can't be dumped and you would get permission denied, even if the UMR binary has the setuid bit enabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>	2020-11-23 08:44:52 +01:00
Samuel Pitoiset	a61a398f7e	radv: dump application info in the GPU hang report Like the name, version, as well as the engine and the API version. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>	2020-11-23 08:44:52 +01:00
Samuel Pitoiset	8d7f78ccf8	radv: append a time string to the hang report dump directory Using the PID only isn't really informative. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>	2020-11-23 08:44:52 +01:00
Samuel Pitoiset	15e1b530f6	radv: print more debug messages when generating a hang report If for some reasons the driver can't generate the hang report properly. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>	2020-11-23 08:44:52 +01:00
Marek Olšák	f7364c9fe0	radeonsi: don't allocate LDS for TCS inputs if it's not used Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	a4ba51e5be	radeonsi: don't insert barrier between VS/TCS if all TCS inputs come from VGPRs Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	61fe66a2e4	radeonsi: pass VS->TCS IO via VGPRs if VS and TCS have the same thread count It can only be done if a TCS input is accessed without indirect indexing and with gl_InvocationID as the vertex index, and the number of VS and TCS threads is the same. This eliminates LDS stores and loads for VS->TCS IO, reducing shader lifetime and LDS traffic. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	6f13034265	ac/llvm: prepare for passing VS->TCS IO via VGPRs - bump AC_MAX_ARGS - add vertex_index_is_invoc_id parameter into load_tess_varyings Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	98b2aacfbf	radeonsi: remove unnecessary NULL checking in NIR tess functions param_index is always checked for non-NULL later. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	1190808eca	radeonsi: if VS and TCS have the same number of threads, merge the conditonals Instead of: if (VS) { VS; } if (TCS) { TCS; } Do this if the number of threads is the same in VS and TCS: exec = enabled_threads; VS; TCS; Skipping declare_vb_descriptor_input_sgprs is needed to match the VS return values. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	0aba174361	radeonsi: always return void from si_build_wrapper_function It's the end of the shader, there are no return values. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	a56e92c79e	radeonsi: merge TCS and TCS epilog conditional blocks Instead of: if (TCS) { TCS; } if (TCS && epilog) { epilog; } Do: if (TCS) { TCS; if (epilog) { epilog; } Only monolithic shaders can do it. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	c605de30eb	radeonsi: don't generate a dead conditional in si_write_tess_factors on gfx9+ Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	5df5ee2722	radeonsi: limit HS LDS usage per workgroup to 16K to allow at least 2 WGs/CU This increases occupancy when the LDS size is e.g. 20K for 3 waves. If we limit the size to 16K, we can fit 2 workgroups with 2 waves each, so 4 waves in total. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	bdee9dc633	radeonsi: don't allocate LDS for TCS outputs if they are not read This reduces LDS usage by 50% in Unigine Heaven. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	10beddf659	radeonsi: don't leave more than 8 unoccupied lanes in HS Previously it was 16 and bigger patches would always trim the patch count needlessly. There are 2 variables to consider: - lane occupancy - LDS usage (limiting wave occupancy) If LDS size is 32 KB (max limit per CU) for 3 waves and we can't maximize occupancy, it's better to leave some lanes unoccupied because using 2 waves would decrease the LDS size to 21 KB, which is not enough to fit another workgroup on the CU. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:21 +00:00
Marek Olšák	9b5b5cbc53	radeonsi: adjust tess SGPRs to allow fully occupied 3 HS waves of triangles With triangles and 3 HS waves, 3 lanes were unoccupied. Adjust the SGPR encoding to allow 1 more triangle to fit there. Some of the fields are not large enough, but they weren't large enough before either. Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:20 +00:00
Marek Olšák	9659384744	ac/nir: fix a typo in ac_are_tessfactors_def_in_all_invocs I think it only made the pass return false if there was a barrier Fixes: `2832bc972b` - ac/nir_to_llvm: add ac_are_tessfactors_def_in_all_invocs() Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>	2020-11-23 02:22:20 +00:00
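
The two aco commits above (8e961b91c3 and d9e4504b0d) rest on a small arithmetic identity: adding a left-shifted value is the same as a 24-bit multiply-add by a power of two, provided the shifted operand fits in 24 bits. The following is a minimal Python sketch of that identity, not the aco code itself. The helper names `add_lshl`, `mad_u32_u24` and `can_rewrite` are hypothetical, and `mad_u32_u24` is a simplified model that assumes the instruction multiplies the low 24 bits of its first two operands and adds the third, wrapping to 32 bits.

```python
MASK32 = 0xFFFFFFFF  # results wrap to 32 bits
MASK24 = 0x00FFFFFF  # only the low 24 bits of each factor are multiplied


def add_lshl(c, a, b):
    """Model of v_add(c, v_lshlrev(a, b)): c + (b << a), wrapped to 32 bits."""
    return (c + ((b << a) & MASK32)) & MASK32


def mad_u32_u24(x, y, z):
    """Simplified model of v_mad_u32_u24 (assumed semantics):
    (x[23:0] * y[23:0]) + z, wrapped to 32 bits."""
    return ((x & MASK24) * (y & MASK24) + z) & MASK32


def can_rewrite(a, b_upper_bound):
    """Conditions quoted in the commit message: the shift amount is a
    constant no larger than 6, and the shifted value is known (here via a
    hypothetical upper bound) to fit in 24 bits."""
    return 0 <= a <= 6 and 0 <= b_upper_bound <= MASK24


# When the conditions hold, both forms agree.
for a in range(7):
    for b in (0, 1, 0xFFFF, 0xFFFFFF):
        for c in (0, 7, 0xFFFFFFFF):
            assert can_rewrite(a, b)
            assert add_lshl(c, a, b) == mad_u32_u24(b, 1 << a, c)

# When the value needs more than 24 bits, the multiply drops the high bits
# and the two forms disagree, which is why the range information matters.
assert not can_rewrite(1, 1 << 24)
assert add_lshl(0, 1, 1 << 24) != mad_u32_u24(1 << 24, 1 << 1, 0)
```

The second check shows why the series first adds the 24-bit Operand flag and the range-analysis upper bound: without a known bound on the shifted operand, the rewrite would not be safe.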

131457 Commits

All Branches