DRI3:
- Only slows clears can enable it for the first frame.
- A good PS/draw ratio can enable it for other frames.
DRI2:
- Only slows clears can enable it for a frame.
- Page-flipped color buffers are unref'd at the end of each frame,
so it can't be enabled in any other way.
- Relying on slow clears is sufficient for our synthetic benchmarks.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
DCC for displayable surfaces is allocated in a separate buffer and is
enabled or disabled based on PS invocations from 2 frames ago (to let
queries go idle) and the number of slow clears from the current frame.
At least an equivalent of 5 fullscreen draws or slow clears must be done
to enable DCC. (PS invocations / (width * height) + num_slow_clears >= 5)
Pipeline statistic queries are always active if a color buffer that can
have separate DCC is bound, even if separate DCC is disabled. That means
the window color buffer is always monitored and DCC is enabled only when
the situation is right.
The tracking of per-texture queries in r600_common_context is quite ugly,
but I don't see a better way.
The first fast clear always enables DCC. DCC decompression can disable it.
A later fast clear can enable it again. Enable/disable typically happens
only once per frame.
The impact is expected to be negligible because games usually don't have
a high level of overdraw. DCC usually activates when too much blending
is happening (smoke rendering) or when testing glClear performance and
CMASK isn't supported (Stoney).
v2: rename stuff, add assertions
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
We could also do MSAA resolve in a compute shader like Vulkan and remove
these workarounds.
v2: comment the magic numbers
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Allocating it has no effect, but it adds overhead (useless DCC clear).
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Also add dcc_fast_clear_size for clearing only the necessary subset
of DCC. For no AA, it's equal to the size of the whole DCC level.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We want to keep DCC enabled to save bandwidth. It was a bad idea to disable
it here.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
This will get more complicated with mipmapped DCC or when DCC is enabled
after allocation.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
R9G9B9E5 is the only uncompressed one hopefully.
This fixes incorrect rendering not discovered (due to a lack of tests)
until DCC mipmapping was enabled.
Cc: 11.1 11.2 12.0 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
We don't import textures with DCC now, but soon we will.
v2: if we can't disable DCC for image writes, at least decompress DCC
at bind time
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This improves throughput by keeping TTM overhead down.
Some piglit tests such as texelFetch and streaming-texture-leak will
use less memory now.
v2: use gart_size / 4 as the threshold
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Next commits will add other things around this.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
to allow reallocating the texture storage with different parameters
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Just for consistency. This doesn't fix anything, because DCC is not
supported with non-mipmapped textures.
v1.1: fix the comment about DCC
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
this is more robust and probably fixes some bugs already
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
a staging cube texture with array_size % 6 != 0 doesn't work very well
just use 2D_ARRAY or 2D for all staging textures
Cc: 11.1 11.2 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
this is a leftover from the days when depth-stencil buffers were
allocated by the DDX
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
This hasn't been needed, but I think we should set it.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
Changes:
- don't flush DB for fast color clears
- don't flush any caches for initial clears
- remove the flag from si_copy_buffer, always assume shader coherency
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
For some formats we need to take "do_endian_swap" into account when
configuring swapping for color buffers.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Because r600 GPUs can't do swap in their DB unit, we need to disable
endianess swapping for textures that are handled by DB.
There are four format translation functions in r600g driver:
- r600_translate_texformat
- r600_colorformat_endian_swap
- r600_translate_colorformat
- r600_translate_colorswap
This patch adds a new parameters to those functions, called
"do_endian_swap". When running in a big-endian machine, the calling
functions will check whether the texture/color is handled by DB -
"rtex->is_depth && !rtex->is_flushing_texture" - and if so, they will
send FALSE through this parameter. Otherwise, they will send TRUE.
The translation functions, in specific cases, will look at this parameter
and configure the swapping accordingly.
v4:
evergreen_init_color_surface_rat() is only used by compute and don't
handle DB surfaces, so just sent hard-coded FALSE to translation
functions when called by it.
Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
Cc: "11.1 11.2" <mesa-stable@lists.freedesktop.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Use PIPE_SWIZZLE_* everywhere.
Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE.
The new enum is called pipe_swizzle.
Acked-by: Jose Fonseca <jfonseca@vmware.com>
Just for consistency. This is actually not a problem, because both addrlib
and radeon check and fix this.
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>