Commit Graph

228 Commits

Author SHA1 Message Date
Marek Olšák f564b61d33 radeonsi: rework clear_buffer flags
Changes:
- don't flush DB for fast color clears
- don't flush any caches for initial clears
- remove the flag from si_copy_buffer, always assume shader coherency

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-28 20:16:56 +02:00
Nicolai Hähnle 7a215a3e27 radeonsi: expclear must be disabled on first Z/S clear
The documentation and the HW team say so.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle 01a3bb5d8b radeonsi: move blend choice out of loop in si_blit_decompress_color
It does not depend on the level or layer.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle 450ff0f0d5 radeonsi: use level mask for early out in si_blit_decompress_color
Mostly for consistency with the other decompress functions, but note that
in the non-DCC decompress case, the function can now early-out in slightly
more (albeit probably rare) cases.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle 0ff05b55c6 radeonsi: si_blit_decompress_depth is only used for staging
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:40 -05:00
Nicolai Hähnle 0b70fc2db4 radeonsi: only decompress the required ZS planes from si_blit
This happens to "fix" a rendering bug in KotOR2, because it avoids a still
not quite understood bug with MSAA fast stencil clear decompress. For the
stencil clear bug, I have sent a piglit test (arb_texture_multisample-stencil-clear).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93767
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle def53a0b3d radeonsi: decompress Z & S planes in one pass
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle dc6fc2f390 radeonsi: early out of si_blit_decompress_depth_in_place based on dirty mask
Avoid dirtying the db_render_state atom when possible.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle d14d6c3f58 radeonsi: use MIN2 instead of expanded ?: operator
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Nicolai Hähnle 159f182a57 radeonsi: fix brace style
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-27 11:16:39 -05:00
Marek Olšák 3acaefb1bb radeonsi: shorten slot masks to 32 bits
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-22 01:14:14 +02:00
Bas Nieuwenhuizen 061ce9399a radeonsi: split texture decompression for compute shaders
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-19 18:10:31 +02:00
Marek Olšák 2ca5566ed7 radeonsi: move scissor and viewport states into gallium/radeon
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Grigori Goronzy <greg@chown.ath.cx>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 17:13:24 +02:00
Marek Olšák f3eebb84eb radeonsi: implement and rely on set_active_query_state
Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-04-12 14:29:46 +02:00
Nicolai Hähnle 9d2693f58a radeonsi: expand the compressed color and depth texture masks to 64 bits
This is in preparation of raising the number of exposed sampler views to 32
bits, which will raise the total number of sampler views to 33 for the
polygon stipple texture. That texture should never be compressed (and it's
certainly not a depth texture), but this approach seems cleaner to me than
special-casing the last slot in all affected code paths.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-04-07 13:15:06 -05:00
Nicolai Hähnle 515fb2c09c radeonsi: decompress shader images
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-21 15:34:23 -05:00
Nicolai Hähnle 59c5508b9a radeonsi: update compressed_colortex_masks when a cmask is created or disabled
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-10 18:22:52 -05:00
Nicolai Hähnle da68a9b215 radeonsi: move si_decompress_textures to si_blit.c
Since it is all about calling into blitter functions, it makes more
sense here. This change also reduces the size of the interfaces between
.c files.

Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2016-03-10 18:22:49 -05:00
Marek Olšák f18fc70d6f radeonsi: disable DCC on handle export if expecting write access
This should be okay except that sampler views and images are not re-set.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Bas Nieuwenhuizen 1e48ec7571 radeonsi: add DCC decompression (v2)
This is currently not needed but will be necessary when we have
features that do not work with DCC enabled, such as image stores
and sharing non-scanout surfaces.

v2: Marek: rebase, remove decompression from si_flush_resource (not needed)

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák b744ac9f44 radeonsi: allocate DCC in the same backing buffer as the texture
To allow sharing textures with DCC enabled.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák 970b979da1 gallium/radeon: eliminate fast color clear before sharing
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-03-09 15:02:27 +01:00
Marek Olšák 7aedbbacae radeonsi: put image, fmask, and sampler descriptors into one array
The texture slot is expanded to 16 dwords containing 2 descriptors.
Those can be:
- Image and fmask, or
- Image and sampler state

By carefully choosing the locations, we can put all three into one slot,
with the fmask and sampler state being mutually exclusive.

This improves shaders in 2 ways:
- 2 user SGPRs are unused, shaders can use them as temporary registers now
- each pair of descriptors is always on the same cache line

v2: cosmetic changes: add back v8i32, don't load a sampler state & fmask
    at the same time

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-02-10 19:41:49 +01:00
Marek Olšák f6360de8c0 radeonsi: use all SPI color formats
because not using SPI_SHADER_32_ABGR doubles fill rate.

We should also get optimal performance if alpha isn't needed or blending
isn't enabled.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2016-01-22 15:02:40 +01:00
Marek Olšák 1a24f443b4 radeonsi: implement fast stencil clear
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-11 15:25:12 +01:00
Marek Olšák 6eff5415e4 gallium/radeon: simplify disabling render condition for u_blitter
just disable it by not setting the predication bit

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-11-13 19:54:41 +01:00
Marek Olšák f7757100f2 radeonsi: add glClearBufferSubData acceleration
8-bit and 16-bit clears which are not aligned to dwords are done in software.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-11-13 19:54:41 +01:00
Marek Olšák 19773f9805 radeonsi: add SI_SAVE_FRAGMENT_STATE blitter flag
Buffer clears via transform feedback won't set this.

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-11-13 19:54:41 +01:00
Marek Olšák e82c527f1f radeonsi: allow copying between compatible compressed and uncompressed formats
which is where a block in src maps to a pixel in dst and vice versa.
e.g. DXT1 <-> R32G32_UINT
     DXT5 <-> R32G32B32A32_UINT

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-28 11:52:17 +01:00
Marek Olšák 235d38584c radeonsi: properly check if DCC is enabled and allocated
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2015-10-27 10:49:24 +01:00
Bas Nieuwenhuizen 6529daca39 radeonsi: Implement DCC fast clear.
Uses the DCC buffer instead of the CMASK buffer. The ELIMINATE_FAST_CLEAR
still works. Furthermore, with DCC compression we can directly clear
to a limited set of colors such that we do not need a postprocessing step.

v2 Marek: check dcc_buffer && dirty_level_mask in set_sampler_view

Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-10-24 17:46:08 +02:00
Bas Nieuwenhuizen bb77467df9 radeonsi: Disable operations that do not work with DCC.
Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-10-24 00:42:24 +02:00
Marek Olšák edf6a4537c radeonsi: only apply the SNORM blit workaround to *8_SNORM
Like the comment says. This fixes DCC, which doesn't like blitting RG16
as RGBA8.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-24 00:01:20 +02:00
Marek Olšák 9b54ce3362 radeonsi: support thread-safe shaders shared by multiple contexts
The "current" shader pointer is moved from the CSO to the context, so that
the CSO is mostly immutable.

The only drawback is that the "current" pointer isn't saved when unbinding
a shader and it must be looked up when the shader is bound again.

This is also a prerequisite for multithreaded shader compilation.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-20 12:51:51 +02:00
Marek Olšák c23c92c965 radeonsi: only do depth-only or stencil-only in-place decompression
instead of always doing both.
Usually, only depth is needed, so stencil decompression is useless.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-10-03 22:06:08 +02:00
Marek Olšák e21418f221 radeonsi: convert stencil ref state into an atom
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:14 +02:00
Marek Olšák 74aa64876b radeonsi: convert sample mask state into an atom
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:14 +02:00
Marek Olšák 0c2eed0ede radeonsi: avoid redundant CB and DB register updates
The main idea is to avoid setting CB_COLORi_INFO = 0 for i>0 repeatedly
when those colorbuffers aren't used. This is mainly for glamor.

Same for DB. Z_INFO and STENCIL_INFO need to be cleared only once.

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:14 +02:00
Marek Olšák 8a97528b3a radeonsi: optimize viewport states
same as scissors

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:13 +02:00
Marek Olšák f6a10f60b7 radeonsi: optimize scissor states
- convert 16 states to 1 atom
- only emit 1 scissor if VIEWPORT_INDEX isn't written
- use only one packet when emitting consecutive scissors

Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
2015-09-01 21:51:13 +02:00
Marek Olšák ab0643225e util/u_blitter: implement alpha blending for pipe->blit 2015-08-21 22:21:45 +02:00
Grazvydas Ignotas 3206d4ed44 gallium/radeon: use helper functions to mark atoms dirty
This is analogous to r300_mark_atom_dirty() used by r300, and will
be used by later patches. For common radeon code, appropriate helper
is called through a function pointer.

No functional changes.

Signed-off-by: Marek Olšák <marek.olsak@amd.com>
2015-08-11 14:46:53 +02:00
Marek Olšák 3050978864 radeonsi: copy *8_SNORM bits exactly in resource_copy_region
Disabling the FP16 mode didn't help.

If needed, we can use this trick for blits too, but not for scaled blits.

+ 4 piglits

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-31 16:49:17 +02:00
Marek Olšák f9c4953f99 radeonsi: early exit in si_clear if there's nothing to do
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-31 16:49:17 +02:00
Marek Olšák d1f43a7e5b radeonsi: add code for creating, binding and destroying tessellation shaders
This doesn't do anything yet.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:31 +02:00
Marek Olšák 46b2b3bda8 radeonsi: don't change pipe_resource in resource_copy_region
Copied from r600g. pipe_resource can be shared by multiple threads, so we
shouldn't change it.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-07-23 00:59:24 +02:00
Dave Airlie 7e5064360c radeonsi: add support for viewport array (v3)
This isn't pretty and I'd suggest it the pm4 interface builder
could be tweaked to do this more efficently, but I'd need
guidance on how that would look.

This seems to pass the few piglit tests I threw at it.

v2: handle passing layer/viewport index to fragment shader.
fix crash in blit changes,
add support to io_get_unique_index for layer/viewport index
update docs.
v3: avoid looking up viewport index and layer in es (Marek).

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2015-06-27 00:24:07 +01:00
Marek Olšák edf18da85d radeonsi: only flush the right set of caches for CP DMA operations
That's either framebuffer caches or caches for shader resources.
The motivation is that framebuffer caches need to be flushed very rarely
here.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2015-01-07 12:06:43 +01:00
Marek Olšák 884f1654e2 radeonsi: move DB registers from draw_vbo into new db_render_state
It's called db_misc_state in r600g.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-24 14:48:02 +02:00
Marek Olšák d13d2fd161 r600g,radeonsi: add debug option which forces DMA for copy_region and blit 2014-09-12 22:51:28 +02:00
Marek Olšák a10c8db715 radeonsi: implement EXPCLEAR optimization for depth
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:52 +02:00
Marek Olšák 573313c94e radeonsi: implement fast depth clear
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:51 +02:00
Marek Olšák 63cb4077e6 radeonsi: move DB_RENDER_CONTROL into draw_vbo
So that I can add fast depth clear.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-09-01 21:18:51 +02:00
Marek Olšák db51ab6d6a radeonsi: use r600_draw_rectangle from r600g
Rectangles are easier than triangles for the rasterizer.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-08-19 12:20:18 +02:00
Marek Olšák 7792f9858b radeonsi: save scissor state and sample mask for u_blitter
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-08-19 12:20:18 +02:00
Marek Olšák a9528cef6b r600g,radeonsi: switch all occurences of array_size to util_max_layer
This fixes 3D texture support in all these cases, because array_size is 1
with 3D textures and depth0 actually contains the "array size".
util_max_layer is universal and returns the last layer index for any texture
target.

A lot of the cases below can't actually be hit with 3D textures, but let's
be consistent.

This fixes a failure in:
    piglit layered-rendering/clear-color-all-types 3d single_level
for r600g and radeonsi, which was caused by an incorrect CMASK size
calculation.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-28 23:57:08 +02:00
Marek Olšák 1635ded828 radeonsi: add support for fine-grained sampler view updates
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák dd46841bc9 radeonsi: move sampler descriptors from IB to memory
Sampler descriptors are now represented by si_descriptors.
This also adds support for fine-grained sampler state updates and
the border color update is now isolated in a separate function.

Border colors have been broken if texturing from multiple shader stages is
used. This patch doesn't change that.

BTW, blitting already makes use of fine-grained state updates.
u_blitter uses 2 textures at most, so we only have to save 2.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-07-18 01:58:59 +02:00
Marek Olšák ee2a818d33 radeonsi: rename definitions of shader limits
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-07-11 19:36:29 +02:00
Michel Dänzer 370184e813 Revert "radeonsi: Use dma_copy when possible for si_blit."
This reverts commit 5d5c20920e.

Caused visual corruption, see e.g.
https://bugs.freedesktop.org/show_bug.cgi?id=80827#c1
2014-07-03 11:17:38 +09:00
Axel Davy 5d5c20920e radeonsi: Use dma_copy when possible for si_blit.
This improves GLX DRI3 GPU offloading significantly on CPU
bound benchmarks particularly.
No performance impact for DRI2 GPU offloading.

v2: Add missing tests

Signed-off-by: Axel Davy <axel.davy@ens.fr>
Reviewed-by: Marek Olšák<marek.olsak@amd.com>
Signed-off-by: Dave Airlie <airlied@redhat.com>
2014-07-01 13:10:01 +10:00
Grigori Goronzy cf05f9bf01 radeonsi: add sampling of 4:2:2 subsampled textures
This makes 4:2:2 video surfaces work in VDPAU.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-06-18 13:58:37 +02:00
Marek Olšák d226191820 r600g,radeonsi: don't use hardware MSAA resolve if dst is fast-cleared
It doesn't work and our docs say so too.

Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-06-03 13:33:14 +02:00
Marek Olšák 0423513c61 radeonsi: BlitFramebuffer should follow render condition
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-06-03 13:33:14 +02:00
Marek Olšák a38e1fd78b radeonsi: implement fast color clear
This works for both multi-sample and single-sample color buffers.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-03-11 18:51:20 +01:00
Marek Olšák 6a5499b9d9 radeonsi: move framebuffer-related state to a new struct si_framebuffer
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-03-11 18:51:20 +01:00
Marek Olšák 9855477e90 r600g,radeonsi: consolidate create_surface and surface_destroy
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-02-25 16:08:26 +01:00
Marek Olšák b9aa8ed009 radeonsi: inline util_blitter_copy_texture
This will be used for changing texture properties without modifying
pipe_resource like r600g, but not in this series. For now, this change
allows consolidation of pipe_surface functions.

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-02-25 16:08:22 +01:00
Marek Olšák f7176d700f radeonsi: remove useless psbox variable from resource_copy_region
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
2014-02-25 16:08:20 +01:00
Michel Dänzer 404b29d765 radeonsi: Initial geometry shader support
Partly based on the corresponding r600g work by Vadim Girlin and Dave
Airlie.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-01-29 11:06:28 +09:00
Marek Olšák 7209703432 radeonsi: cleanup includes, add missing license
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-01-28 01:40:13 +01:00
Marek Olšák 62d55c0a2d radeonsi: use queries from r600g
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
2014-01-28 01:39:10 +01:00
Andreas Hartmetz 8662e66bf2 radeonsi: Rename the commonly occurring rctx/r600 variables.
The "r" stands for R600.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-01-14 00:07:14 +01:00
Andreas Hartmetz 7b7eb4dd1f radeonsi: Rename r600->si remaining identifiers in si_blit.c.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-01-14 00:07:13 +01:00
Andreas Hartmetz 45578def71 radeonsi: Rename r600->si for functions in si_pipe.h.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-01-14 00:07:13 +01:00
Andreas Hartmetz 280c360c02 radeonsi: Rename r600->si for functions in si.h.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-01-14 00:07:13 +01:00
Andreas Hartmetz 238aeabce0 radeonsi: Rename r600->si for structs in si_pipe.h.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-01-14 00:07:13 +01:00
Andreas Hartmetz 786af2f963 radeonsi: Apply si_* file naming scheme.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2014-01-14 00:07:13 +01:00
Renamed from src/gallium/drivers/radeonsi/r600_blit.c (Browse further)