Commit Graph

131459 Commits

Author SHA1 Message Date
Dave Airlie 87c70f1984 lavapipe: enable pipeline stats queries
These pass CTS, but I think are missing some stuff CTS doesn't test.

This is one of the base Vulkan 1.0 features and I'd like to support
it for conformance.

Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
2020-11-24 06:50:41 +10:00
Dave Airlie 4263162839 lavapipe: fixup mipmap precsion bits
8 seems more correct, however it fixes a bunch of explict lod
tests but breaks some lod query tests.

Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
2020-11-24 06:50:37 +10:00
Dave Airlie 2c0a078fdb llvmpipe: fix multisample lines.
This also needs another lines fix, but at least align the code
with tri and points

Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
2020-11-24 06:50:34 +10:00
Dave Airlie d932720ff7 llvmpipe: fix multisample point rendering.
Fixes one case in
dEQP-VK.rasterization.primitives_multisample_4_bit.no_stipple.points

Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
2020-11-24 06:50:31 +10:00
Dave Airlie 2ed54033de llvmpipe/setup: move point stats collection earlier.
You have to count the stats pre-culling here.

Just like dc261cdd42 did for lines.

VK-GL-CTS dEQP-VK.query_pool.statistics_query.clipping_primitives*point_list

Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
2020-11-24 06:50:28 +10:00
Dave Airlie f246456538 lavapipe: fix wsi acquire fences
Fixes:
dEQP-VK.wsi.xcb.swapchain.acquire.too_many

Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
2020-11-24 06:50:24 +10:00
Dave Airlie 0d90c7cbc4 lavapipe: fixup device allocate + enable private data
I'd only half ported private memory support, finish the job.

Cc: "20.3" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Roland Scheidegger <sroland@vmware.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7705>
2020-11-24 06:50:21 +10:00
Erik Faye-Lund 2ac396e2e5 zink: fix layered resolves
Until recently, we ended up using u_blitter here, because
info->render_condition_enable was always true here. But when we recently
fixed that overly broad check, this broke.

So let's fix layered-resolves, by actually checking if the resource has
layers respect them in that case, similar to what we do in blit_native.

Fixes: 19906022e2 ("zink: more accurately track supported blits")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3843
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7737>
2020-11-23 19:35:40 +00:00
Dylan Baker 989877365d release-calender: Update 20.3
I've been forgetting to remove completed rc's

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7739>
2020-11-23 19:32:06 +00:00
Dylan Baker f60fabc38f docs: update calendar and link releases notes for 20.2.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7739>
2020-11-23 19:32:06 +00:00
Dylan Baker 9c2e8a8f90 docs: Add relnotes for 20.2.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7739>
2020-11-23 19:32:06 +00:00
Dylan Baker ad2b120087 docs: add release notes for 20.2.3
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7739>
2020-11-23 19:32:06 +00:00
Samuel Pitoiset 8e961b91c3 aco: optimize v_add+v_lshlrev to v_mad_u32_u24 on GFX6-8
This optimizes v_add(c, v_lshlrev(a, b)) to v_mad_u32_u24(b, 1<<a, c)
if 'a' is a constant (less than or equal to 6 to avoid creating
literals) and 'b' known to be a 16-bit or a 24-bit value.

On GFX9+, this is already optimized to v_lshl_add_u32.

No fossils-db changes.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
2020-11-23 18:34:40 +00:00
Samuel Pitoiset d9e4504b0d aco: optimize v_add+s_lshl to v_mad_u32_u24 on GFX6-8
This optimizes v_add(c, s_lshl(a, b)) to v_mad_u32_u24(a, 1<<b, c)
if 'b' is a constant (less than or equal to 6 to avoid creating
literals) and 'a' known to be a 16-bit or a 24-bit value.

On GFX9+, this is already optimized to v_lshl_add_u32.

fossils-db (Polaris10):
Totals from 1916 (1.36% of 140385) affected shaders:
SGPRs: 88322 -> 87780 (-0.61%); split: -0.66%, +0.05%
CodeSize: 7852668 -> 7851800 (-0.01%); split: -0.01%, +0.00%
Instrs: 1533965 -> 1530459 (-0.23%); split: -0.23%, +0.00%
Cycles: 57001852 -> 56983244 (-0.03%); split: -0.03%, +0.00%
VMEM: 372561 -> 371733 (-0.22%); split: +0.03%, -0.25%
SMEM: 108859 -> 103711 (-4.73%); split: +0.23%, -4.96%
VClause: 37231 -> 37204 (-0.07%)
SClause: 58116 -> 58086 (-0.05%); split: -0.06%, +0.01%
Copies: 199953 -> 199931 (-0.01%); split: -0.03%, +0.02%
Branches: 63478 -> 63477 (-0.00%)
PreSGPRs: 61818 -> 61816 (-0.00%)

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
2020-11-23 18:34:40 +00:00
Samuel Pitoiset eaef1f2127 aco: allow to use the range analysis UB in emit_{sop2,vop2}_instruction()
It will allow to combine v_add+s_lshl or v_add+v_lshlrev to
v_mad_u32_u24 on GFX6-8 if operands are known to be 16-bit or 24-bit.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
2020-11-23 18:34:40 +00:00
Samuel Pitoiset be600b009a aco: add a new Operand flag to indicate that is 24-bit
To indicate that the upper 8-bits are always 0 to optimize more MADs.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
2020-11-23 18:34:40 +00:00
Samuel Pitoiset 05fd780012 aco/tests: extend the optimize.add_lshl tests to GFX8
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7673>
2020-11-23 18:34:40 +00:00
Samuel Pitoiset cd59c22325 ac,radv: use better export formats for 8-bit when RB+ isn't allowed
When RB+ is enabled, R8_UINT/R8_SINT/R8_UNORM should use FP16_ABGR
for 2x exporting performance. Otherwise, use 32_R to remove useless
instructions needed for 16-bit compressed exports.

fossils-db (Vega10):
Totals from 8858 (6.35% of 139517) affected shaders:
SGPRs: 801248 -> 801210 (-0.00%); split: -0.01%, +0.00%
VGPRs: 596224 -> 596120 (-0.02%); split: -0.02%, +0.01%
CodeSize: 71462452 -> 71356684 (-0.15%); split: -0.15%, +0.00%
MaxWaves: 37097 -> 37105 (+0.02%); split: +0.04%, -0.02%
Instrs: 13963177 -> 13950809 (-0.09%); split: -0.09%, +0.00%
Cycles: 1476539360 -> 1476489996 (-0.00%); split: -0.00%, +0.00%
VMEM: 2363008 -> 2361349 (-0.07%); split: +0.04%, -0.11%
SMEM: 550362 -> 549977 (-0.07%); split: +0.01%, -0.08%
VClause: 245704 -> 245727 (+0.01%); split: -0.01%, +0.02%
SClause: 485161 -> 485104 (-0.01%); split: -0.01%, +0.00%
Copies: 1420034 -> 1422310 (+0.16%); split: -0.01%, +0.17%
Branches: 518710 -> 518705 (-0.00%)
PreSGPRs: 706633 -> 706584 (-0.01%)
PreVGPRs: 547163 -> 547007 (-0.03%); split: -0.03%, +0.01%

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7512>
2020-11-23 17:54:16 +00:00
Samuel Pitoiset 684531fd37 radv: add new vk_format_is_*() helpers
I think we should make RADV uses util_format everywhere.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7512>
2020-11-23 17:54:16 +00:00
Dylan Baker a5227465c1 meson: use a feature option for microsoft-clc
It's less code and makes the configuration easier to fine tune.

Fixes: ff05da7f8d
       ("microsoft: Add CLC frontend and kernel/compute support to DXIL converter")

Reviewed-by: Jesse Natalie <jenatali@microsoft.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7699>
2020-11-23 17:31:55 +00:00
Dylan Baker 7ca4a478ad meson: Don't add extra values to shader-cache
We're trying to move to using a feature here, adding more values breaks
that.

Fixes: 5de56937a3
       ("disk_cache: build option for disabled-by-default")

Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7699>
2020-11-23 17:31:55 +00:00
Rob Clark a92f597b98 freedreno/ir3: Fix valgrind complaint about streamout state
The warning is a bit misleading about where it shows up.. it complains
about the shader key, due to shader key being calculated from (among
other things) stream_output state that had some uninitialized garbage
in the padding.

==84572== Uninitialised byte(s) found during client check request
==84572==    at 0x60548E8: blob_write_bytes (blob.c:163)
==84572==    by 0x6534EF7: compute_variant_key (ir3_disk_cache.c:111)
==84572==    by 0x6535143: ir3_disk_cache_retrieve (ir3_disk_cache.c:171)
==84572==    by 0x654D82F: create_variant (ir3_shader.c:251)
==84572==    by 0x654DA2B: ir3_shader_get_variant (ir3_shader.c:301)
==84572==    by 0x645B2CB: ir3_shader_variant (ir3_gallium.c:113)
==84572==    by 0x645B7EB: ir3_shader_create (ir3_gallium.c:219)
==84572==    by 0x645BAA7: ir3_shader_state_create (ir3_gallium.c:285)
==84572==    by 0x6506003: fd6_shader_state_create (fd6_program.c:1136)
==84572==    by 0x64676C7: assemble_tgsi (freedreno_program.c:105)
==84572==    by 0x64679DF: fd_prog_init (freedreno_program.c:188)
==84572==    by 0x6506157: fd6_prog_init (fd6_program.c:1172)
==84572==  Address 0xeff1588 is 424 bytes inside a block of size 480 alloc'd
==84572==    at 0x4866FA4: malloc (vg_replace_malloc.c:307)
==84572==    by 0x605D46F: ralloc_size (ralloc.c:133)
==84572==    by 0x605D52F: rzalloc_size (ralloc.c:166)
==84572==    by 0x654DFF7: ir3_shader_from_nir (ir3_shader.c:473)
==84572==    by 0x645B6C7: ir3_shader_create (ir3_gallium.c:182)
==84572==    by 0x645BAA7: ir3_shader_state_create (ir3_gallium.c:285)
==84572==    by 0x6506003: fd6_shader_state_create (fd6_program.c:1136)
==84572==    by 0x64676C7: assemble_tgsi (freedreno_program.c:105)
==84572==    by 0x64679DF: fd_prog_init (freedreno_program.c:188)
==84572==    by 0x6506157: fd6_prog_init (fd6_program.c:1172)
==84572==    by 0x64CB36F: fd6_context_create (fd6_context.c:154)
==84572==    by 0x59D93BB: st_api_create_context (st_manager.c:917)

Somehow this was showing up with dEQP-GLES31.info.vendor but not other
things.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>
2020-11-23 16:04:52 +00:00
Rob Clark 9de6a601ce freedreno/drm: Quiet timedout error msg
This isn't terribly interesting, but got more chatty when we converted
to mesa_loge() vs debug_printf()

Fixes: 156d7e45f7 ("freedreno: Convert to mesa_log*()")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>
2020-11-23 16:04:52 +00:00
Rob Clark 98d182fd46 freedreno/a6xx: Clear control mem at context create
We could be getting a recycled bo containing random garbage, which can
confuse check_vsc_overflow().

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>
2020-11-23 16:04:52 +00:00
Rob Clark 150a914a78 freedreno: Convert one last mtx_t -> simple_mtx_t
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>
2020-11-23 16:04:52 +00:00
Rob Clark 8651cfbbf0 freedreno: emit_marker() cleanup
1) Propagate the change to only emit markers in debug builds (and add
   the WFI that ensures they are synchronized with GPU.  We could
   consider dropping them entirely, since the GPU devcoredump support
   in newer kernels is more useful.  But it is still an occasionally
   useful fallback.

2) Use p_atomic_inc_return() to placate helgrind

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7717>
2020-11-23 16:04:52 +00:00
Lionel Landwerlin b039e03f55 mesa: add an environment variable to default enable INTEL_blackhole
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7640>
2020-11-23 11:56:48 +00:00
Lionel Landwerlin f5610d9949 st: trigger noop if the default value is not true
v2: Verify that PIPE_CAP_FRONTEND_NOOP is available before calling vfunc (Icecream95)

Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7640>
2020-11-23 11:56:48 +00:00
Connor Abbott 76ade57fa6 ir3/ra: Fix array reg liveness in scalar pass
Assigning an array reg removes IR3_REG_ARRAY, which means that
definitions and uses can't be tracked back to the array register's name
and liveness for the components of the array aren't correctly
calculated. To fix this we delay assigning array registers until the
scalar pass.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7711>
2020-11-23 11:33:13 +00:00
Samuel Pitoiset 88b5a2b80b nir: fix gathering cross invocation info
Fixes: 5b77b14448 ("nir: Use src_is_invocation_id in get_deref_info.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7730>
2020-11-23 11:00:17 +00:00
jzielins 79bd8edd87 swr: Pass draw start information to state update mechanism
This fixes crash in many workloads/tests

Reviewed-by: Krzysztof Raszkowski <krzysztof.raszkowski@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7728>
2020-11-23 10:15:28 +00:00
Samuel Pitoiset c83cc49f6b ci: fix name of the Sienna Cichlid expected failures file
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7729>
2020-11-23 10:15:05 +01:00
Alejandro Piñeiro ce5c23eb00 v3dv/cmd_buffer: missing (uint8_t *) casting when calling memcmp
Caused to return early wrongly on CmdPushConstants with some tests
using several calls to that method. As we are here we are also
replacing the (void *) casting at the memcpy below.

Fixes: e1c8041cde ("v3dv: try harder to skip emission of redundant state")

Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7718>
2020-11-23 09:51:24 +01:00
Samuel Pitoiset 14ec91b131 radv: dump BO ranges into bo_ranges.log instead of stderr
Like other dumps during GPU hang detection.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>
2020-11-23 08:44:54 +01:00
Samuel Pitoiset 4ffa6acb0d radv: add RADV_DEBUG=noumr to disable UMR logs during GPU hang detection
Sometimes UMR logs can't be dumped and you would get permission
denied, even if the UMR binary has the setuid bit enabled.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>
2020-11-23 08:44:52 +01:00
Samuel Pitoiset a61a398f7e radv: dump application info in the GPU hang report
Like the name, version, as well as the engine and the API version.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>
2020-11-23 08:44:52 +01:00
Samuel Pitoiset 8d7f78ccf8 radv: append a time string to the hang report dump directory
Using the PID only isn't really informative.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>
2020-11-23 08:44:52 +01:00
Samuel Pitoiset 15e1b530f6 radv: print more debug messages when generating a hang report
If for some reasons the driver can't generate the hang report properly.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7706>
2020-11-23 08:44:52 +01:00
Marek Olšák f7364c9fe0 radeonsi: don't allocate LDS for TCS inputs if it's not used
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák a4ba51e5be radeonsi: don't insert barrier between VS/TCS if all TCS inputs come from VGPRs
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák 61fe66a2e4 radeonsi: pass VS->TCS IO via VGPRs if VS and TCS have the same thread count
It can only be done if a TCS input is accessed without indirect indexing and
with gl_InvocationID as the vertex index, and the number of VS and TCS threads
is the same.

This eliminates LDS stores and loads for VS->TCS IO, reducing shader lifetime
and LDS traffic.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák 6f13034265 ac/llvm: prepare for passing VS->TCS IO via VGPRs
- bump AC_MAX_ARGS
- add vertex_index_is_invoc_id parameter into load_tess_varyings

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák 98b2aacfbf radeonsi: remove unnecessary NULL checking in NIR tess functions
param_index is always checked for non-NULL later.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák 1190808eca radeonsi: if VS and TCS have the same number of threads, merge the conditonals
Instead of:
    if (VS) {
	VS;
    }
    if (TCS) {
	TCS;
    }

Do this if the number of threads is the same in VS and TCS:
    exec = enabled_threads;
    VS;
    TCS;

Skipping declare_vb_descriptor_input_sgprs is needed to match the VS return
values.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák 0aba174361 radeonsi: always return void from si_build_wrapper_function
It's the end of the shader, there are no return values.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák a56e92c79e radeonsi: merge TCS and TCS epilog conditional blocks
Instead of:
    if (TCS) {
       TCS;
    }
    if (TCS && epilog) {
       epilog;
    }

Do:
    if (TCS) {
	TCS;
	if (epilog) {
	    epilog;
    }

Only monolithic shaders can do it.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák c605de30eb radeonsi: don't generate a dead conditional in si_write_tess_factors on gfx9+
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák 5df5ee2722 radeonsi: limit HS LDS usage per workgroup to 16K to allow at least 2 WGs/CU
This increases occupancy when the LDS size is e.g. 20K for 3 waves.
If we limit the size to 16K, we can fit 2 workgroups with 2 waves each,
so 4 waves in total.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák bdee9dc633 radeonsi: don't allocate LDS for TCS outputs if they are not read
This reduces LDS usage by 50% in Unigine Heaven.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00
Marek Olšák 10beddf659 radeonsi: don't leave more than 8 unoccupied lanes in HS
Previously it was 16 and bigger patches would always trim the patch count
needlessly.

There are 2 variables to consider:
- lane occupancy
- LDS usage (limiting wave occupancy)

If LDS size is 32 KB (max limit per CU) for 3 waves and we can't maximize
occupancy, it's better to leave some lanes unoccupied because using 2
waves would decrease the LDS size to 21 KB, which is not enough to fit
another workgroup on the CU.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7623>
2020-11-23 02:22:21 +00:00