Commit Graph

691 Commits

Author SHA1 Message Date
Marek Olšák 9fb77745f5 radeonsi: inline si_need_gfx_cs_space
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>
2021-09-01 00:42:58 +00:00
Marek Olšák b15c413947 radeonsi: simplify memory usage checking by merging vram and gtt counters
no change in behavior

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>
2021-09-01 00:42:58 +00:00
Marek Olšák c005b2cd4b radeonsi: move as_ls/es/ngg setting out of si_shader_selector_key
Do it when we bind shaders.

The advantages are:
- no need to memset the fields when any shader variant state is changed
  (e.g. culling on/off)
- no need to recompute the fields every time that happens

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>
2021-09-01 00:42:57 +00:00
Marek Olšák 08310f85ae radeonsi: remove instancing support from the prim discard compute shader
It's not important for workstation apps on Vega.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12656>
2021-09-01 00:42:57 +00:00
Marek Olšák 10a46226b1 gallium: remove vertices_per_patch, add pipe_context::set_patch_vertices
We would like draw-only display lists to have immutable draw info and
this is the only GL non-draw state in pipe_draw_info (not counting
view_mask).

It also allows removing some code from draw_vbo for tessellation.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12351>
2021-08-21 00:08:11 +00:00
Marek Olšák 6fc38d3b07 radeonsi: allow arbitrary swizzle modes for displayable DCC
by adding retile shader variants

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12430>
2021-08-20 14:28:36 +00:00
Marek Olšák 59fe704c45 gallium: simplify VRAM uploads by adding PIPE_RESOURCE_FLAG_DONT_MAP_DIRECTLY
When this flag is set, u_threaded_context will try not to map it directly
for better buffer placement. It's set by drivers when visible VRAM is too
small.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12257>
2021-08-09 11:58:48 +00:00
Marek Olšák 6546f28cc8 radeonsi: drop smoothing quality to 4xAA for better performance
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11754>
2021-07-08 18:37:41 +00:00
Marek Olšák b141e50282 radeonsi: add optimal multi draws and draw-level splitting for prim discard CS
This is a partial rewrite of some parts of the code.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11510>
2021-06-28 13:23:14 +00:00
Marek Olšák 9fa0d2cf35 radeonsi: change how the prim discard CS is enabled and splitting limits
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11510>
2021-06-28 13:23:14 +00:00
Marek Olšák 06da711350 radeonsi: remove the GDS variants of compute-based primitive discard
The GDS ordered append variant is unstable due to kernel and firmware bugs.
The unordered GDS variant isn't faster than the memory-based variant.

Only the memory-based variant is kept.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11510>
2021-06-28 13:23:14 +00:00
Marek Olšák a448074d05 radeonsi: don't compile TES and GS draw_vbo variants for the prim discard CS
This also fixes the incorrect emit_draw_packets template argument.
The condition should be inverted.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11102>
2021-06-21 19:03:29 +00:00
Marek Olšák 72a395b6de radeonsi: remove the chip_class dimension from the draw_vbo array
We don't use/initialize draw_vbo callbacks for other generations anymore.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11384>
2021-06-16 21:29:13 +00:00
Marek Olšák 24895f020a radeonsi: move a few functions from si_state_draw.cpp into si_gfx_cs.c
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11384>
2021-06-16 21:29:13 +00:00
Pierre-Eric Pelloux-Prayer 83250036be radeonsi/nir: add si_nir_is_output_const_if_tex_is_const
Determine if a given shader write the same constant value to its output
if a specific input texture is replaced by constant load.

It's done by checking if the store_output intrinsics only depends on
constant and a texture. If it's true, the given texture is replaced by
a constant load in cloned shader and this clone is optimized.

Then the output is checked (= is it constant or not).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10979>
2021-06-15 11:18:02 +02:00
Pierre-Eric Pelloux-Prayer b2bd9c5ccd radeonsi: add si_install_draw_wrapper
This allows to implement custom draw_vbo code-path without
touching si_draw_vbo.

As an example, skipped all draw calls with an odd new_draws
could be done like this:

   void mywrapper(...) {
   	   if (new_draws % 2)
   	      return;
   	   return sctx->real_draw_vbo(...);
   }

   if (some_condition_is_met)
      si_install_draw_wrapper(sctx, mywrapper);

Instead of having to add the "if ()" condition inside si_draw_vbo.

Note that a single wrapper may be installed so care must be taken
to not override an existing wrapper.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10979>
2021-06-15 10:19:04 +02:00
Pierre-Eric Pelloux-Prayer ff8a930cf7 radeonsi: add _once suffix to depth_cleared_level_mask
And add a new variable to disambiguate between "has been cleared once" and
"is cleared".

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10979>
2021-06-15 10:19:02 +02:00
Mike Blumenkrantz 74abd5df0e aux/tc: pass rebind count and rebind bitmask with replace_buffer_storage func
tc already calculates all the rebinding that needs to be done on a given
context, so (some of) this info can be passed on to drivers to enable
optimizations

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11245>
2021-06-14 20:42:47 +00:00
Marek Olšák 7844bdadac radeonsi: remove DFSM after we discovered how bad it is
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
2021-05-25 16:15:44 +00:00
Marek Olšák 9ba17ec21a radeonsi: generate buffer_id_unique for u_threaded_context
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
2021-05-25 16:15:44 +00:00
Marek Olšák 9dc7fff448 radeonsi: allow changing the NGG subgroup size to 256 but don't change it yet
Currently, 128 seems to have the best performance.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
2021-05-25 16:15:44 +00:00
Marek Olšák 712f74f590 radeonsi: remove 8 bytes from si_resource, turn other 4 bytes into padding
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
2021-05-25 16:15:44 +00:00
Marek Olšák 5af124c92c radeonsi: change si_resource::alignment to alignment_log2 for better packing
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
2021-05-25 16:15:44 +00:00
Marek Olšák 36e07198a7 radeonsi: always use the L2 LRU cache policy for faster clears and copies
Waves and CP DMA can finish sooner if L2 doesn't do any evictions, which
is hard to predict.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
2021-05-25 16:15:44 +00:00
Marek Olšák c7e731c737 radeonsi: remove unused SI_IMAGE_ACCESS_AS_BUFFER
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
2021-05-25 16:15:44 +00:00
Marek Olšák 94a1f45e15 ac/llvm: set target features per function instead of per target machine
This is a cleanup that allows the removal of the wave32 target machine and
the wave32 pass manager.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10813>
2021-05-25 16:15:44 +00:00
Marek Olšák b04044b350 radeonsi: stop using u_resource_vtbl::resource_destroy
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10659>
2021-05-21 17:38:04 +00:00
Marek Olšák ec77a2d43a gallium/u_threaded: add callbacks and documentation for resource busy checking
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10662>
2021-05-17 10:37:24 +00:00
Marek Olšák 967757a208 gallium+(u_threaded,r300,r600,radeonsi): move transfer offset into pipe_transfer
Let's use the 4 bytes of unused padding usefully in pipe_transfer.

Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10527>
2021-05-01 17:38:42 +00:00
Mike Blumenkrantz dae3113c3d gallium: split drawid out of pipe_draw_info and as a separate draw_vbo param
the only case in which this is nonzero is if a multidraw gets split by the frontend,
i.e., mesa core, and in all other cases it can be ignored. the value can also be ignored
for all indirect draws, though it seems many (most?) gallium drivers are not aware of this

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10166>
2021-04-30 03:59:19 +00:00
Mike Blumenkrantz 4566383ae4 gallium: move pipe_draw_info::index_bias to pipe_draw_start_count_bias
this moves index_bias into the multidraw struct, enabling draws where the value
changes to be merged; the draw_info struct member is renamed and moved to the end
of the struct for tc use

u_vbuf still has some checks to split draws if index_bias changes, maybe
this can be removed at some point?

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10166>
2021-04-30 03:59:19 +00:00
Mike Blumenkrantz 4fe6c85526 gallium: rename pipe_draw_start_count -> pipe_draw_start_count_bias
and add an index_bias member

no functional changes yet, just the rename and unused struct member

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10166>
2021-04-30 03:59:19 +00:00
Marek Olšák 804e292440 radeonsi: remove the separate DCC optimization for Stoney
This removes some complexity from the driver.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10343>
2021-04-26 22:53:30 +00:00
Marek Olšák 1f8fa96412 radeonsi: make the gfx9 DCC MSAA clear shader depend on the number of samples
because different DCC equations are used.

Fixes: 3120113ee7 - radeonsi: implement DCC MSAA 4x/8x fast clear using DCC equations on gfx9

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10343>
2021-04-26 22:53:30 +00:00
Simon Ser 4a6b87ceab radeonsi: implement pipe_context.create_video_buffer_with_modifiers
Just pass down the modifier list to vl_video_buffer_create_as_resource,
filtering out DCC modifiers because we don't support these for now.

Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10237>
2021-04-22 15:57:29 +00:00
Marek Olšák a1653854f5 radeonsi: fix automatic DCC retiling after compute image stores
Only internal compute shaders use DCC stores, so the TODOs are not
critical yet.

Fixes: 1d64a1045e - radeonsi: enable dcc image stores on gfx10+

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10261>
2021-04-17 02:37:49 +00:00
Marek Olšák f9b527a9a5 radeonsi: unify internal compute with SSBOs in si_launch_grid_internal_ssbos
just deduplicate the code

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák ec60526035 radeonsi: move binding the internal compute shader into si_launch_grid_internal
instead of doing it in each function

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák 3120113ee7 radeonsi: implement DCC MSAA 4x/8x fast clear using DCC equations on gfx9
MSAA 4x and 8x should only clear the first 2 samples because other samples
are uncompressed. The compute shader only clears that subset of DCC.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák 8b95f51ef1 radeonsi: fix and enable full DCC with MSAA 2x on gfx9
This enables fast clear with any clear color (not just 0/1) for bpp >= 32.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák 7e68fae25f ac,radeonsi: rewrite DCC retiling without the DCC retile map
The retile map is removed and replaced by direct DCC address computations
in the retile shader using the new function ac_nir_dcc_addr_from_coord.

The RADV code is disabled.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák 06b6af596c radeonsi: do Z-only or S-only HTILE clear using a compute shader doing RMW
This adds a clear_buffer compute shader that does read-modify-write to
update a subset of bits in HTILE.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák 4dd8d58ad5 radeonsi: clean up some mess around htile_stencil_disabled
Set the final value in si_texture_create_object, so that other places
don't have to derive it redundantly.

The only thing to remember is that HTILE stencil can be enabled when
stencil is not present, and it can be disabled when stencil is present
due to various workarounds.

Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák bcd1a69f79 radeonsi: parallelize Z/S conversion into TC-compatible with fast color clears
It's not really a fast clear, but it's the next logical step towards doing
HTILE clears here.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák fb72d41b18 radeonsi: implement Z/S fast clear for non-zero mipmap levels
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10003>
2021-04-13 03:17:42 +00:00
Marek Olšák faf10bd49d ac/surface: use named "color and "zs" structures in unions
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Marek Olšák 468836317b ac/surface: unify htile_* and dcc_* fields as meta_* fields
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10083>
2021-04-12 20:53:45 +00:00
Pierre-Eric Pelloux-Prayer 8c6a64c9b0 radeonsi/rgp: export compute shader programs
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10105>
2021-04-12 14:27:29 +02:00
Pierre-Eric Pelloux-Prayer aa077ba3a2 radeonsi/rgp: export barriers
Wrap the si_cp_wait_mem call to emit RGP_SQTT_MARKER_IDENTIFIER_BARRIER_START and
RGP_SQTT_MARKER_IDENTIFIER_BARRIER_END events.

Only for gfx9+ for now.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10105>
2021-04-12 14:27:26 +02:00
Marek Olšák 0580d4c1a2 radeonsi: enable HTILE with mipmapping on gfx9+
Everything seems to be there except fast clears.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9795>
2021-04-02 12:05:00 +00:00