Commit Graph

133173 Commits

Author SHA1 Message Date
Michel Dänzer 6bde5cf276 ci: Rule out scheduled pipelines in .windows-build-rules
The lack of this broke scheduled pipelines, because they attempted
to create a meson-windows-vs2019 job, which couldn't work (because the
windows_build_vs2019 job doesn't exist in scheduled pipelines).

Fixes: 84c8a35aa2 "CI: Add Windows source dependency map"
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8360>
2021-01-08 09:57:21 +00:00
James Park 3fb4755d48 util: Disable memstream for Apple builds
Not all SDK versions support open_memstream. Maybe some other day.

Fixes: af8d488ea5 ("util,ac,aco,radv: Cross-platform memstream API")
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8269>
2021-01-08 09:37:14 +00:00
Samuel Pitoiset f40a7d3c93 radv: fix performance regression by restoring TC-compat HTILE in GENERAL
This fixes a performance regression for games (eg. Youngblood) that
declare all images as concurrent. This is likely buggy for compute
queues but this just restores the previous behaviour for now.

Fixes: f4f096805b ("radv: fix TC-compat HTILE images with DST_OPTIMAL on the compute queue")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8351>
2021-01-08 09:22:32 +00:00
Samuel Pitoiset 0ae1cf46a6 radv: fix enabling TC-compat HTILE in GENERAL for writes on GFX10+
It wasn't expected to also enable inside render loops.

Fixes: 4bb92d9145 ("radv: enable TC-compat HTILE in GENERAL on GFX10+")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8351>
2021-01-08 09:22:32 +00:00
Samuel Pitoiset 20683461e3 radv: configure the texture descriptor for TC-compat CMASK on GFX10+
This was missing, it can be enabled with RADV_PERFTEST=tccompatcmask.
Note that this feature is still experimental.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8350>
2021-01-08 08:21:17 +01:00
Vinson Lee e248119a82 r300: Fix typos.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8296>
2021-01-07 20:42:36 -08:00
Erik Faye-Lund f1b51d472a gallium/ntt: lower uniforms to ubo
NTT doesn't handle uniforms, and requires them to have been lowered to
UBOs. But for drivers that don't set
nir_shader_compiler_options::lower_uniforms_to_ubo to true, this won't
have happened yet. Neither Zink nor V3D sets this option, and in the
case of Zink this isn't trivial to change.

So let's lower uniforms to UBOs in this case in NTT instead.

Fixes: 03c60762f5 ("gallium/ntt: Fix load_ubo_vec4 buffer index setup.")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4047
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8365>
2021-01-07 23:09:49 +00:00
Nanley Chery 28a141e325 iris: Blit stencil according to aspect_mask
With this change, stencil picks up the fix for 3D texture blits
introduced with commit 382451ff9d.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8340>
2021-01-07 23:00:32 +00:00
Nanley Chery 1148da3436 iris: Use single-aspect formats more in iris_blit
In order to handle blitting the stencil aspect of a depth-stencil
resource, use aspect-specific pipe formats in the aspect_mask loop.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8340>
2021-01-07 23:00:32 +00:00
Nanley Chery db2cdc4277 iris: Blit non-stencil according to aspect_mask
When blitting just the stencil aspect, the source and destination
resources are prepared/setup twice. Move the unconditional resource
setup into the aspect_mask loop to avoid this.

In addition, use the aspect provided by the loop instead of the mask
provided by the info parameter.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8340>
2021-01-07 23:00:32 +00:00
Nanley Chery b73e903f96 iris: Loop through an aspect mask in iris_blit
Enables dropping the stencil-specific blit later on.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8340>
2021-01-07 23:00:31 +00:00
Nanley Chery 776074d66c iris: Increase use of pipe_resources in iris_blit
Allows the affected code to avoid being moved into a while loop later
on.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8340>
2021-01-07 23:00:31 +00:00
Nanley Chery 51d26e2edf iris: Use texture preparation helper in iris_blit
Use iris_resource_prepare_texture in iris_blit to avoid partial resolves
for sRGB <-> linear texture views. This affects a trace of L4D2.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8340>
2021-01-07 23:00:31 +00:00
Nanley Chery 04d73e2dc2 iris: Move depth-format assertion out of iris_blit
Instead of having a depth-specific assertion in a generic portion of
iris_blit, move it into the depth-specific cases of
iris_resource_texture_aux_usage. Since iris_blit calls that function,
the test still occurs.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8340>
2021-01-07 23:00:31 +00:00
Nanley Chery ce3a6dfa79 iris: Don't prepare depth for stencil-aspect blits
Before this change, iris_blit would prepare the depth buffer in a
depth-stencil resource even when only the stencil aspect was used for the
blit. Use the aspect mask to prepare the correct resource.

Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8340>
2021-01-07 23:00:31 +00:00
Adam Jackson a7762daa67 mesa: Don't make building tests conditional on building DRI drivers
These tests should work, and be built, even if you're only building
gallium drivers.

Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8353>
2021-01-07 17:39:26 -05:00
Adam Jackson df4a7d67aa mesa: Fix array-format-to-format table on big-endian
The table constructor and the table lookup were doing different things
for big-endian. This fixes MesaFormatsTest.FormatFromFormatAndType and
MesaFormatsTest.FormatMatchesFormatAndType failing to round-trip for
GL_RGBA / GL_SHORT, which we're not currently running in CI for s390x,
but which a subsequent commit will enable.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8353>
2021-01-07 17:39:24 -05:00
Adam Jackson ab0d17338f tests: Fix memory leaks in DispatchSanity
Needed to pass asan CI.

Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8353>
2021-01-07 17:39:14 -05:00
Samuel Pitoiset d2f4934121 radv/llvm,aco: always split typed vertex buffer loads on GFX6 and GFX10+
To avoid any alignment issues that triggers memory violations and
eventually a GPU. This can happen if the stride (static or dynamic)
is unaligned and also if the VBO offset is aligned to scalar
(eg. stride is 8 and VBO offset is 2 for R16G16B16A16_SNORM).

The AMD Windows driver also always splits typed vertex fetches.

fossils-db (Sienna Cichlid):
Totals from 56508 (40.54% of 139391) affected shaders:
SGPRs: 2643545 -> 2664516 (+0.79%); split: -0.19%, +0.98%
VGPRs: 2007472 -> 1995408 (-0.60%); split: -0.74%, +0.13%
CodeSize: 70596372 -> 73913312 (+4.70%); split: -0.00%, +4.70%
MaxWaves: 772653 -> 774916 (+0.29%); split: +0.37%, -0.08%
Instrs: 14074162 -> 14567072 (+3.50%); split: -0.00%, +3.51%
Cycles: 69281276 -> 71253252 (+2.85%); split: -0.00%, +2.85%
VMEM: 22047039 -> 25554196 (+15.91%); split: +17.20%, -1.29%
SMEM: 4120370 -> 4360820 (+5.84%); split: +7.41%, -1.58%
VClause: 416913 -> 438361 (+5.14%); split: -1.86%, +7.01%
SClause: 536739 -> 542637 (+1.10%); split: -0.33%, +1.43%
Copies: 977194 -> 970015 (-0.73%); split: -2.43%, +1.69%
Branches: 241205 -> 241193 (-0.00%); split: -0.06%, +0.06%
PreVGPRs: 1505645 -> 1505379 (-0.02%)

This fixes GPU hangs with bin/draw-vertices from Piglit on GFX10+
with Zink.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8363>
2021-01-07 17:28:00 +00:00
Samuel Pitoiset 68c2537062 aco: fix creating the dest vector when 16-bit vertex fetches are splitted
Compute the number of components of the destination vector from the
bitsize when eg. a 16-bit vec2 vertex fetches is splitted. This is
because the dst will be a v1, so the p_create_vector should be created
from two v2b fro both sizes to match.

This prevents a regression from the next change which will split
typed vertex buffer loads on GFX6 and GFX10+.

Cc: mesa-stable
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8363>
2021-01-07 17:28:00 +00:00
Dylan Baker 26ec2c1a04 docs/release-calendar.rsv: Remove spaces
The generated entries don't have spaces, and the csv parser doesn't
like that some rows do and others don't have spaces.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8356>
2021-01-07 17:24:19 +00:00
Dylan Baker e05b52daf3 docs: Add calendar entries for 21.0 release candidates.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8356>
2021-01-07 17:24:19 +00:00
Rhys Perry f5adf27fb9 nir,radv: add and use nir_vectorize_tess_levels()
fossil-db (Sienna):
Totals from 1342 (0.97% of 138791) affected shaders:
CodeSize: 3287996 -> 3269572 (-0.56%); split: -0.56%, +0.00%
Instrs: 629896 -> 628191 (-0.27%); split: -0.31%, +0.04%
Cycles: 2619244 -> 2612424 (-0.26%); split: -0.30%, +0.04%
VMEM: 388807 -> 389273 (+0.12%); split: +0.14%, -0.02%
SMEM: 90655 -> 90700 (+0.05%); split: +0.06%, -0.01%
VClause: 21831 -> 21812 (-0.09%)
PreVGPRs: 44155 -> 44058 (-0.22%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry bfc777f83e radv: vectorize shader I/O
Fixes code size regressions after enabling TCS/TES for ACO.

fossil-db (Sienna):
Totals from 2588 (1.86% of 138791) affected shaders:
SGPRs: 109950 -> 108480 (-1.34%); split: -1.43%, +0.09%
VGPRs: 107764 -> 112060 (+3.99%); split: -0.03%, +4.02%
CodeSize: 5957760 -> 5321656 (-10.68%)
MaxWaves: 31718 -> 30358 (-4.29%); split: +0.03%, -4.32%
Instrs: 1116300 -> 1029000 (-7.82%)
Cycles: 4600344 -> 4251072 (-7.59%)
VMEM: 980024 -> 812978 (-17.05%); split: +1.14%, -18.18%
SMEM: 275458 -> 258227 (-6.26%); split: +2.34%, -8.60%
VClause: 42925 -> 30533 (-28.87%); split: -31.02%, +2.15%
SClause: 31554 -> 31362 (-0.61%); split: -1.79%, +1.18%
Branches: 15689 -> 15697 (+0.05%)
PreVGPRs: 80399 -> 83953 (+4.42%); split: -0.00%, +4.42%

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry f199b7188b nir/load_store_vectorize: add data as callback args
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry 00c8bec47b nir: add nir_load_store_vectorize_options
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Rhys Perry f4eb833a12 nir/load_store_vectorize: don't ignore subgroup memory barriers
Not sure why I thought this was correct, but we should consider them for
optimization purposes.

Fixes: ce9205c03b ('nir: add a load/store vectorization pass')
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4202>
2021-01-07 16:34:53 +00:00
Juan A. Suarez Romero 4d0b8a9d32 v3d: reinterpret stencil data as uint texture in stencil blit path
There is a path to blit stencil buffers reinterpreting the stencil data
as an RGBA8888 or R8 float texture.

This works fine except for the case when the stencil buffer is
multisampled, and the blit operation needs to resolve it: an average of
the samples is done, which is incorrect, as only one sample must be
used.

This can be observed n the piglit test
`ext_framebuffer_multisample-unaligned-blit 2 stencil downsample -auto
-fbo`, specifically in the triangles border.

To avoid this averaging, let's reinterpret the stencil data as RGBA8888
or R8 uint texture.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8361>
2021-01-07 16:09:31 +00:00
Rhys Perry cacce76db9 radv: workaround games which assume full subgroups if cswave32 is enabled
This assumption becomes incorrect with RADV_PERFTEST=cswave32.

Games include Detroit: Become Human and Doom Eternal.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7918>
2021-01-07 15:01:02 +00:00
Rhys Perry c73c246e05 nir: gather whether a compute shader uses non-quad subgroup intrinsics
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7918>
2021-01-07 15:01:02 +00:00
Rhys Perry 5bb94ab050 radv: implement CREATE_REQUIRE_FULL_SUBGROUPS_BIT with cswave32
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7918>
2021-01-07 15:01:02 +00:00
Michel Dänzer e8f50bd600 wsi/x11: Treat IMMEDIATE present mode the same as MAILBOX for Xwayland
Two main reasons:

As described in the previous commit, sending buffers to the Wayland
compositor as quickly as possible effectively results in mailbox
behaviour.

Also, doing the same as for MAILBOX present mode provides the following
benefits:

* We use more images in the swapchain, which avoids stalls on the client
  side if the Wayland compositor directly uses the client buffers for
  scanout.

* We wait for fences to signal before submitting a new buffer, which
  avoids missing frames in the Wayland compositor due to fences not
  signalling in time for a flip.

Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/3673
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8197>
2021-01-07 15:00:45 +01:00
Michel Dänzer 4292fb2139 wsi/x11: Use PresentOptionAsync for MAILBOX present mode with Xwayland
This allows Xwayland to forward buffers to the Wayland compositor ASAP
for fullscreen / undecorated windows, which in turn allows true mailbox
behaviour in the Wayland compositor.

Without this, Xwayland has to emulate the mailbox behaviour itself,
which it cannot do as well as the Wayland compositor by design.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8197>
2021-01-07 15:00:07 +01:00
Michel Dänzer b5268d532a wsi/x11: Detect Xwayland
The following commits will introduce different behaviour for Xwayland.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8197>
2021-01-07 14:58:24 +01:00
Michel Dänzer 1de2fd0cf2 wsi/x11: Always link against xcb-xrandr
The next commit will make use of it even without
VK_USE_PLATFORM_XLIB_XRANDR_EXT.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8197>
2021-01-07 14:57:45 +01:00
Michel Dänzer 1cce8e1101 wsi/x11: Set recognizable name for WSI swapchain queue thread
This makes it easier to recognize the thread e.g. in a debugger.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8197>
2021-01-07 14:56:41 +01:00
Pierre-Eric Pelloux-Prayer 07c1504d1b radeonsi: implement SQTT support
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:10:17 +01:00
Pierre-Eric Pelloux-Prayer a46e830444 radeonsi: add radeon_set_uconfig_reg_seq_perfctr
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:10:17 +01:00
Pierre-Eric Pelloux-Prayer df5233b977 ac/sqtt: move radv_get_expected_buffer_size to ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:10:16 +01:00
Pierre-Eric Pelloux-Prayer ea6176e63e ac/sqtt: move ac_is_thread_trace_complete to ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:10:14 +01:00
Pierre-Eric Pelloux-Prayer ffdfe136e6 ac/sqtt: move rgp/sqtt def to ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:57 +01:00
Pierre-Eric Pelloux-Prayer 4ec5cf5318 ac/radv: move radv_rgp.c to ac
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:49 +01:00
Pierre-Eric Pelloux-Prayer bbc245ab2e ac/radv: move sqtt structs and helpers to amd/common
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:47 +01:00
Pierre-Eric Pelloux-Prayer 04f6ba113c ac/sqtt: add ac_thread_trace_data
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:45 +01:00
Pierre-Eric Pelloux-Prayer b94104c0c0 radeonsi: pass radeon_cmdbuf to si_cp_dma_wait_for_idle
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:25 +01:00
Pierre-Eric Pelloux-Prayer aa9fe1e423 radeonsi: pass radeon_cmdbuf to emit_cache_flush
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8002>
2021-01-07 10:09:25 +01:00
Alistair Popple 7f9a084e7e gv100/ir: Use system wide atomics
Increase the scope of atomic operations from GPU to system. This is
required for support of SVM to ensure atomic access is maintained for
memory buffers that are not local to the current GPU.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7955>
2021-01-07 08:59:10 +00:00
Alistair Popple b02e3053ea gv100/ir: Make emitATOM consistent with emitRED
GV100 code generation uses ATOM instructions for compare-and-swap and
RED instructions for other atomic operations. Make the scope consistent
for both types of operations.

Signed-off-by: Alistair Popple <apopple@nvidia.com>
Reviewed-by: Ben Skeggs <bskeggs@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7955>
2021-01-07 08:59:10 +00:00
Marek Olšák 62703b79a5 radeonsi: remove si_gs_prolog_bits::gfx9_prev_is_vs
It didn't do anything useful. GS doesn't use the other user SGPRs.
If we decrease the number of user SGPRs we declare for the GS prolog,
we can remove gfx9_prev_is_vs.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8344>
2021-01-06 23:28:04 -05:00
Marek Olšák b6b6d1ff3c radeonsi: fix hang caused by for loop with exec=0 in LS and ES
LLVM expects that exec != 0 when entering loops and generates this code
that becomes an infinite loop if exec == 0:

BB5_1:
    vcc_lo = (inverted terminating condition)
    s_and_b32 vcc_lo, exec_lo, vcc_lo
    s_cbranch_vccnz BB5_3    // jump if vcc != 0 (break statement)
    // ... loop body ...
    s_branch BB5_1
BB5_3:

For non-monolithic VS before TCS, VS before GS, and TES before GS,
we set exec = (thread enabledmask), which sets 0 for HS-only and GS-only
waves, causing the infinite loop condition above.

Fix it as follows:
- set exec = ~0 at the beginning
- wrap the whole shader (LS and ES) in a conditional block, so that HS-only
  and GS-only waves jump over it and never enter such a loop

The TES before GS hang can be reproduced by gfxbench:
    testfw_app --gfx egl -w 1920 -h 1080 --gl_api gles -t gl_tess

Fixes: 68d6d097f1 - radeonsi/gfx9: add GFX9 and VEGA10 enums

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8344>
2021-01-06 23:28:01 -05:00