Commit Graph

228 Commits

Author SHA1 Message Date
Marek Olšák 1f31a21664 radeonsi: remove SDMA support
There are many issues with SDMA across many generations of hardware.
A recent example is that gfx10.3 suffers from random GPU hangs if
userspace uses SDMA.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7908>
2020-12-09 00:52:26 +00:00
Bas Nieuwenhuizen d4f7962d48 radeonsi: Add displayable DCC flushing without explicit flushes.
Flushes non-explicit shared textures that need retiling on

* glFlush
* glSync
* glSignalSemaphoreEXT
* DRI fences.
* The first time we create a non-explicit handle for it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6176>
2020-11-13 03:27:28 +00:00
Pierre-Eric Pelloux-Prayer 5e4aecec93 radeonsi: introduce SI_RESOURCE_FLAG_INTERNAL / RADEON_FLAG_DRIVER_INTERNAL
Tag allocations as driver internal.
Some of these allocations will need to be doubled to handle TMZ (one secure bo,
one normal bo) but these allocations shouldn't switch the winsys in "the app
is using TMZ".

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>
2020-09-24 14:51:16 +00:00
Pierre-Eric Pelloux-Prayer 0294eaed80 radeonsi: extend workaround for KHR-GL45.texture_view.view_classes on gfx9
This is a followup of 19db1a540c.
This commit fixed KHR-GL45.texture_view.view_classes on gfx9 but the test
still failed when using AMD_DEBUG=nodma or AMD_DEBUG=nodcc,nodma.

The workaround is now used from si_resource_copy_region so it covers the
previous call site (si_texture_transfer_map) and the sctx->dma_copy
fallback code.

Fixes: 19db1a540c ("radeonsi: add a workaround to fix KHR-GL45.texture_view.view_classes on gfx9")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6115>
2020-08-05 19:45:32 +00:00
Marek Olšák d30e1e486d radeonsi: don't enable TC-compatible HTILE for stencil if stencil doesn't use it
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5095>
2020-05-23 03:45:09 -04:00
Marek Olšák b5ac9d18d8 radeonsi: use vi_dcc_enabled instead of using tex->surface.dcc_offset directly
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4935>
2020-05-15 22:12:35 +00:00
Marek Olšák 0d83e7f4b9 radeonsi: enable TC-compatible HTILE on demand for best Z/S performance
I haven't measured this, but it can only help.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4866>
2020-05-05 16:27:29 +00:00
Marek Olšák d6acdbd935 radeonsi: implement and use compute-based DCC decompression on gfx9-10
DCC_DECOMPRESS doesn't work. Instead of trying to figure out why,
use a compute blit where the load is compressed and the store is
uncompressed.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4761>
2020-04-30 22:27:31 +00:00
Pierre-Eric Pelloux-Prayer d7008fe46a radeonsi: switch to 3-spaces style
Generated automatically using clang-format and the following config:

AlignAfterOpenBracket: true
AlignConsecutiveMacros: true
AllowAllArgumentsOnNextLine: false
AllowShortCaseLabelsOnASingleLine: false
AllowShortFunctionsOnASingleLine: false
AlwaysBreakAfterReturnType: None
BasedOnStyle: LLVM
BraceWrapping:
  AfterControlStatement: false
  AfterEnum: true
  AfterFunction: true
  AfterStruct: false
  BeforeElse: false
  SplitEmptyFunction: true
BinPackArguments: true
BinPackParameters: true
BreakBeforeBraces: Custom
ColumnLimit: 100
ContinuationIndentWidth: 3
Cpp11BracedListStyle: false
Cpp11BracedListStyle: true
ForEachMacros:
  - LIST_FOR_EACH_ENTRY
  - LIST_FOR_EACH_ENTRY_SAFE
  - util_dynarray_foreach
  - nir_foreach_variable
  - nir_foreach_variable_safe
  - nir_foreach_register
  - nir_foreach_register_safe
  - nir_foreach_use
  - nir_foreach_use_safe
  - nir_foreach_if_use
  - nir_foreach_if_use_safe
  - nir_foreach_def
  - nir_foreach_def_safe
  - nir_foreach_phi_src
  - nir_foreach_phi_src_safe
  - nir_foreach_parallel_copy_entry
  - nir_foreach_instr
  - nir_foreach_instr_reverse
  - nir_foreach_instr_safe
  - nir_foreach_instr_reverse_safe
  - nir_foreach_function
  - nir_foreach_block
  - nir_foreach_block_safe
  - nir_foreach_block_reverse
  - nir_foreach_block_reverse_safe
  - nir_foreach_block_in_cf_node
IncludeBlocks: Regroup
IncludeCategories:
  - Regex:           '<[[:alnum:].]+>'
    Priority:        2
  - Regex:           '.*'
    Priority:        1
IndentWidth: 3
PenaltyBreakBeforeFirstCallParameter: 1
PenaltyExcessCharacter: 100
SpaceAfterCStyleCast: false
SpaceBeforeCpp11BracedList: false
SpaceBeforeCtorInitializerColon: false
SpacesInContainerLiterals: false

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4319>
2020-03-30 11:05:52 +00:00
Marek Olšák df34fa14bb radeonsi: don't invoke decompression inside internal launch_grid
Decompress resources properly but don't do it inside launch_grid
to prevent recursion.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Cc: 19.3 <mesa-stable@lists.freedesktop.org>
2020-01-20 15:40:08 -05:00
Pierre-Eric Pelloux-Prayer 7b0b085c94 radeonsi: drop the negation from fmask_is_not_identity
This change eases code reading ("fmask_is_identity = true" is clearer than
"fmask_is_not_identity = false").
Initialization is not changed so fmask_is_identity is false when a texture is
created.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
2020-01-15 10:10:15 +00:00
Pierre-Eric Pelloux-Prayer c2df5389bb radeonsi: make sure fmask expand is done if needed
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2248
Fixes: 095a58204d ("radeonsi: expand FMASK before MSAA image stores are used")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3174>
2020-01-15 10:10:15 +00:00
Marek Olšák 363b4027fc radeonsi: put up to 5 VBO descriptors into user SGPRs
gfx6-8: 1 VBO descriptor in user SGPRs
gfx9-10: 5 VBO descriptors in user SGPRs

We no longer pull up to 5 VBO descriptors from GTT when SDMA is disabled.

Totals from affected shaders:
SGPRS: 1110528 -> 1170528 (5.40 %)
VGPRS: 952896 -> 951936 (-0.10 %)
Spilled SGPRs: 83 -> 61 (-26.51 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 23766296 -> 22843920 (-3.88 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 179344 -> 179344 (0.00 %)
Wait states: 0 -> 0 (0.00 %)

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-13 15:57:07 -05:00
Marek Olšák 269953e779 radeonsi/gfx9: force the micro tile mode for MSAA resolve correctly on gfx9
Fixes: 69ea473 "amd/addrlib: update to the latest version"
Closes: #2325

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2020-01-09 16:28:28 -05:00
Marek Olšák 991328498b radeonsi: move SI and CIK+ SDMA code into 1 common function for cleanups
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2020-01-06 15:38:35 -05:00
Marek Olšák 503bd821fa radeonsi: rename SDMA debug flags
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-By: Timur Kristóf <timur.kristof@gmail.com>
2020-01-06 15:38:11 -05:00
Pierre-Eric Pelloux-Prayer f5c1cb2383 radeonsi: dcc dirty flag
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2019-12-10 09:25:28 +01:00
Eric Anholt 882ca6dfb0 util: Move gallium's PIPE_FORMAT utils to /util/format/
To make PIPE_FORMATs usable from non-gallium parts of Mesa, I want to
move their helpers out of gallium.  Since u_format used
util_copy_rect(), I moved that in there, too.

I've put it in a separate directory in util/ because it's a big chunk
of related code, and it's not clear to me whether we might want it as
a separate library from libmesa_util at some point.

Closes: #1905
Acked-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-11-14 10:47:20 -08:00
Marek Olšák 095a58204d radeonsi: expand FMASK before MSAA image stores are used
Image stores don't use FMASK, so we have to turn it into identity.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-10-09 17:12:36 -04:00
Marek Olšák ef919d8dcb radeonsi: remove redundant si_texture offset and size fields
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
2019-09-09 23:43:03 -04:00
Eric Engestrom abc226cf41 tree-wide: replace MAYBE_UNUSED with ASSERTED
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
2019-07-31 09:41:05 +01:00
Ilia Mirkin 0e30c6b8a7 gallium: switch boolean -> bool at the interface definitions
This is a relatively minimal change to adjust all the gallium interfaces
to use bool instead of boolean. I tried to avoid making unrelated
changes inside of drivers to flip boolean -> bool to reduce the risk of
regressions (the compiler will much more easily allow "dirty" values
inside a char-based boolean than a C99 _Bool).

This has been build-tested on amd64 with:

Gallium drivers: nouveau r300 r600 radeonsi freedreno swrast etnaviv v3d
                 vc4 i915 svga virgl swr panfrost iris lima kmsro
Gallium st:      mesa xa xvmc xvmc vdpau va

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
2019-07-22 22:13:51 -04:00
Marek Olšák 47f41af06c radeonsi: return success from vi_dcc_clear_level to simplify callers
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:54 -04:00
Marek Olšák fd92e65feb radeonsi: add si_shader_selector into si_compute
Now we can assume that shader->selector is always set.
This will simplify some code.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2019-07-19 20:16:43 -04:00
Marek Olšák 07aacdbfd5 radeonsi/gfx10: add a workaround for stencil HTILE with mipmapping
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Nicolai Hähnle 1666ee183e radeonsi/gfx10: implement hardware MSAA resolve
MSAA is only supported for 64KB_{R,Z}_X modes, so the micro tile
optimization that we use on gfx9 and earlier does not work.

Be very explicit about how the swizzle mode of the temporary surface is
selected.

Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2019-07-03 15:51:13 -04:00
Marek Olšák c53e6ea05d radeonsi: use a fragment shader blit instead of DB->CB copy for ZS CPU mappings
This mainly removes and simplifies code that is no longer needed.

There were some issues with the DB->CB stencil copy on gfx10, so let's
just use a fragment shader blit for all ZS mappings. It's more reliable.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2019-07-03 15:51:12 -04:00
Marek Olšák ccfcb9d818 ac: rename SI-CIK-VI to GFX6-GFX7-GFX8
Acked-by: Dave Airlie <airlied@redhat.com>

We already use GFX9 and I don't want us to have confusing naming
in the driver. GFXn naming is better from the driver perspective,
because it's the real version of the gfx portion of the hw. Also,
CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI.

It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have
nothing to do with GFXn and they have their own version numbers.
2019-05-15 20:54:10 -04:00
Marek Olšák 383f406591 radeonsi: remove dirty slot masks from scissor and viewport states
All registers in the array need to be updated if any of them is changed.

Only apps writing gl_ViewportIndex were affected by this bug.
2019-04-25 11:49:38 -04:00
Marek Olšák 1f21396431 radeonsi: add support for displayable DCC for multi-RB chips
A compute shader is used to reorder DCC data from aligned to unaligned.
2019-04-04 09:53:24 -04:00
Marek Olšák fe3bfd7971 radeonsi/gfx9: add support for PIPE_ALIGNED=0
Needed by displayable DCC.

We need to flush L2 after rendering if PIPE_ALIGNED=0 and DCC is enabled.
2019-04-04 09:53:24 -04:00
Marek Olšák a1378639ab radeonsi: always use compute rings for clover on CI and newer (v2)
initialize all non-compute context functions to NULL.

v2: fix SI
2019-02-26 14:58:55 -05:00
Sonny Jiang 1b25d340b7 radeonsi: use compute for resource_copy_region when possible
v2: marek: fix snorm8 blits

Signed-off-by: Sonny Jiang <sonny.jiang@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2019-01-22 12:24:35 -05:00
Marek Olšák b443465fb9 gallium/util: add util_format_snorm8_to_sint8 (from radeonsi) 2019-01-22 12:21:43 -05:00
Nicolai Hähnle 5c841a1b1e radeonsi: rename SI_RESOURCE_FLAG_FORCE_TILING to clarify its purpose
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2018-12-19 12:01:39 +01:00
Marek Olšák 203ef19f48 radeonsi: split si_copy_buffer
compute and SDMA will be added into it.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-09-10 15:19:56 -04:00
Marek Olšák 93b8b987d0 radeonsi: add a thorough clear/copy_buffer benchmark 2018-08-29 15:31:42 -04:00
Marek Olšák 0ca8294ece radeonsi: implement EXT_window_rectangles
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
2018-08-14 21:19:02 -04:00
Marek Olšák 41f80373b4 radeonsi: fix memory exhaustion issue with DCC statistics gathering with DRI2
Cc: 18.1 <mesa-stable@lists.freedesktop.org>
2018-06-28 22:27:25 -04:00
Marek Olšák d4755ef389 radeonsi: remove redundant si_texture::cmask_size
cmask_buffer and surface.cmask_size can replace its role.

Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák 2a8d1039b6 radeonsi: inline struct r600_cmask_info
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák 2d64a68c6f radeonsi: rename r600_surface -> si_surface
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
2018-06-25 18:33:58 -04:00
Marek Olšák 9c21002f6e radeonsi: handle non-clearable DCC buffers as MSAA resolve dst
This is reproducible on Stoney, but other chips may be affected too.

Cc 18.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-21 14:42:14 -04:00
Marek Olšák 1ba87f4438 radeonsi: rename r600_texture -> si_texture, rxxx -> xxx or sxxx
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
2018-06-19 13:08:50 -04:00
Marek Olšák b936f9aa32 radeonsi: disable primitive binning for all blitter ops
same as amdvlk.

Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-24 13:41:56 -04:00
Marek Olšák 835095973d radeonsi: remove r600_fmask_info
radeon_surf contains almost everything.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-05-10 18:26:33 -04:00
Marek Olšák 788d66553a radeonsi: rename r600_texture::resource to buffer
r600_resource could be renamed to si_buffer.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák de344209ad radeonsi: inline 2 trivial state structures
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák 639b673fc3 radeonsi: don't use an indirect table for state atoms
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2018-04-27 17:56:04 -04:00
Marek Olšák 8a28679987 radeonsi: don't do GFX-specific texture decompression for compute
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>
2018-04-13 12:31:04 -04:00