KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Samuel Pitoiset	48fe8a6210	radeonsi: only decompress resident textures/images when used When the current bound shaders don't use any bindless textures or images, it's useless to decompress the resident resources. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-14 10:04:36 +02:00
Samuel Pitoiset	e1813a8635	radeonsi: decompress resident textures/images before graphics/compute Similar to the existing decompression code path except that it loops over the list of resident textures/images. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-14 10:04:36 +02:00
Samuel Pitoiset	d7e1a66bb5	radeonsi: decompress DCC for resident textures/images Analogous to bound textures/images. We should also update the resident descriptors and disable COMPRESSION_EN for avoiding useless DCC fetches, but I postpone this optimization for a separate series. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-14 10:04:36 +02:00
Marek Olšák	c503381864	radeonsi: get rid of more compressed_colortex_mask names Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-12 18:24:37 +02:00
Marek Olšák	6940361796	gallium/radeon: don't allocate HTILE in a separate buffer Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-08 23:29:07 +02:00
Marek Olšák	c6451b1209	radeonsi: rename depth decompress functions Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-08 23:29:07 +02:00
Marek Olšák	d8a577d96e	radeonsi: rename shader resource decompress masks to their true meaning Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-08 23:29:07 +02:00
Samuel Pitoiset	878bd981bf	radeonsi: isolate real framebuffer changes from the decompression passes (v3) When a stencil buffer is part of the framebuffer state, it is decompressed but because it's bindless, all draw calls set stencil_dirty_level_mask to 1. v2: Marek - set the flags outside the loop - also clear and set framebuffer.do_update_surf_dirtiness there - do it in the DB->CB copy path too v3: Marek - save and restore the do_update_surf_dirtiness flag Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-07 20:17:14 +02:00
Marek Olšák	7d67cbefe0	radeonsi: clean up decompress blend state names Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-07 19:38:45 +02:00
Marek Olšák	d2ee423b69	radeonsi: enable TC-compatible stencil compression on VI Most things are in place. Ideally we won't see decompress blits for stencil anymore. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-07 19:38:39 +02:00
Marek Olšák	a893c91697	gallium/u_blitter: use 2D_ARRAY for cubemap blits if possible so that we can use TXF. The cubemap blit pixel shader code size: 148 -> 92 bytes Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-07 18:10:50 +02:00
Samuel Pitoiset	def02007cd	radeonsi: add new si_check_render_feedback_texture() helper For bindless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:05:41 +02:00
Samuel Pitoiset	fbcc8664fd	radeonsi: add new si_decompress_color_texture() helper For bindless. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:05:38 +02:00
Samuel Pitoiset	9cc91ba6d5	radeonsi: add a 'break' in si_check_render_feedback_*() No need to check all color buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-04-10 23:05:29 +02:00
Marek Olšák	3b1934d9b6	gallium/radeon: s/dcc_disable/disable_dcc/ Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-30 16:09:39 +02:00
Marek Olšák	45a71d5de5	radeonsi: handle incompatible DCC formats in resource_copy_region Required because of later commits. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>	2017-03-30 16:09:39 +02:00
Marek Olšák	b05b8587ae	radeonsi: remove a workaround for inexact *8_SNORM blits All tests pass on Fiji now. This prevents DCC disablement due to incompatible DCC formats due to the fallback. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com>	2017-03-30 16:09:39 +02:00
Marek Olšák	a955ee788f	gallium/radeon: add and use a new helper vi_dcc_enabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-03-30 16:09:37 +02:00
Marek Olšák	405bacd820	radeonsi/gfx9: fix MIP0_WIDTH & MIP0_HEIGHT for compressed texture blits Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	272b50a6f4	radeonsi/gfx9: do DCC clears on non-mipmapped textures only Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-03-30 14:44:33 +02:00
Marek Olšák	861d7af1cb	radeonsi: use a bitmask-based loop in si_decompress_textures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	a131dacb14	radeonsi: add CP DMA flags for greater control over synchronization for L2 prefetch Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Grazvydas Ignotas	c81a89f662	radeonsi: fix release build unused variable warnings Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-12-10 21:19:59 +01:00
Marek Olšák	f2b0c66c3c	radeonsi: properly declare context sampler states Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-12-07 18:46:54 +01:00
Marek Olšák	72d1669ed2	radeonsi: check for !is_linear in do_hardware_msaa_resolve We don't want opt4Space here. Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	00baaa4752	radeonsi: fix an assertion failure in si_decompress_sampler_color_textures This fixes a crash in Deus Ex: Mankind Divided. Release builds were unaffected, so it's not too serious. Cc: 11.2 12.0 13.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-04 11:30:47 +01:00
Marek Olšák	7786f8c635	gallium/radeon: add enum radeon_micro_mode Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	bf4d102ea3	gallium/radeon: add radeon_surf::is_linear Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	c66a550385	gallium/radeon: don't call u_format helpers if we have that info already Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	692f2640ab	gallium/radeon: replace radeon_surf_info::dcc_enabled with num_dcc_levels Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-01 22:33:13 +01:00
Marek Olšák	d4d9ec55c5	radeonsi: implement TC-compatible HTILE so that decompress blits aren't needed and depth texturing needs less memory bandwidth. Z16 and Z24 are promoted to Z32_FLOAT by the driver, because TC-compatible HTILE only supports Z32_FLOAT. This doubles memory footprint for Z16. The format promotion is not visible to state trackers. This is part of TC-compatible renderbuffer compression, which has 3 parts: DCC, HTILE, FMASK. Only TC-compatible FMASK compression is missing now. I don't see a measurable increase in performance though. (I tested Talos Principle and DiRT: Showdown, the latter is improved by 0.5%, which is almost noise, and it originally used layered Z16, so at least we know that Z16 promoted to Z32F isn't slower now) Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-13 19:00:51 +02:00
Marek Olšák	21de3be8e6	radeonsi: fix texture format reinterpretation with DCC DCC is limited in how texture formats can be reinterpreted using texture views. If we get a view format that is incompatible with the initial texture format with respect to DCC, disable DCC. There is a new piglit which tests all format combinations. What works and what doesn't was deduced by looking at the piglit failures. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Marek Olšák	5ee3cac138	radeonsi: increase performance for DRI PRIME offloading if 2nd GPU is CIK or VI SDMA is much faster for tiled->linear blits from VRAM to GTT. I have Bonaire in my second PCIe slot. $ glxinfo | grep OpenGL.renderer OpenGL renderer string: Gallium 0.4 on AMD TONGA ... $ DRI_PRIME=1 glxinfo | grep OpenGL.renderer OpenGL renderer string: Gallium 0.4 on AMD BONAIRE ... Without SDMA: $ DRI_PRIME=1 glxgears 8796 frames in 5.0 seconds = 1759.074 FPS 8899 frames in 5.0 seconds = 1779.672 FPS With SDMA: $ DRI_PRIME=1 glxgears 12765 frames in 5.0 seconds = 2552.788 FPS 12888 frames in 5.0 seconds = 2577.495 FPS The 1st GPU is irrelevant. The improvement should be much lower at 60 fps, but definitely measurable. SI will get this once we add SDMA blit support for it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-08-26 15:50:10 +02:00
Marek Olšák	a6b5845a0d	radeonsi: use current context for DCC feedback-loop decompress, fixes Elemental This is just a workaround. The problem is described in the code. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96541 v2: say that it's only between the current context and aux_context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-08-17 12:24:35 +02:00
Marek Olšák	7df15389af	gallium/radeon: handle render_condition_enable for clear_rt/ds Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:10:21 +02:00
Marek Olšák	a909210131	gallium: add render_condition_enable param to clear_render_target/depth_stencil Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:10:21 +02:00
Nicolai Hähnle	65d48fcf8c	radeonsi: silence Coverity warning Coverity's analysis is too weak to understand that r600_init_flushed_depth(_, _, NULL) only returns true when flushed_depth_texture was assigned a non-NULL value. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-13 09:52:39 +02:00
Nicolai Hähnle	1a0a8efcce	radeonsi: decompress to flushed depth texture when required v2: s/dirty_level_mask/stencil_dirty_level_mask/ in stencil case Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:51 +02:00
Nicolai Hähnle	4b7961da77	radeonsi: extract DB->CB copy logic into its own function Also clean up some of the looping. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:51 +02:00
Nicolai Hähnle	f2eb34f82f	gallium/radeon: replace is_flushing_texture with db_compatible This is a left-over of when I considered generalizing the separate stencil support. I do prefer the new name since it emphasizes what flushing vs. non-flushing means from a functional point-of-view, namely special handling of the texture format. v2: adjust r600_init_color_surface as well Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:43:48 +02:00
Nicolai Hähnle	065eeb79f7	radeonsi: correctly mark levels of 3D textures as fully decompressed Account for the fact that max_layer is minified for higher levels. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-06 10:42:49 +02:00
Marek Olšák	49e3c74cdd	gallium/radeon: add a heuristic enabling DCC for scanout surfaces (v2) DCC for displayable surfaces is allocated in a separate buffer and is enabled or disabled based on PS invocations from 2 frames ago (to let queries go idle) and the number of slow clears from the current frame. At least an equivalent of 5 fullscreen draws or slow clears must be done to enable DCC. (PS invocations / (width * height) + num_slow_clears >= 5) Pipeline statistic queries are always active if a color buffer that can have separate DCC is bound, even if separate DCC is disabled. That means the window color buffer is always monitored and DCC is enabled only when the situation is right. The tracking of per-texture queries in r600_common_context is quite ugly, but I don't see a better way. The first fast clear always enables DCC. DCC decompression can disable it. A later fast clear can enable it again. Enable/disable typically happens only once per frame. The impact is expected to be negligible because games usually don't have a high level of overdraw. DCC usually activates when too much blending is happening (smoke rendering) or when testing glClear performance and CMASK isn't supported (Stoney). v2: rename stuff, add assertions Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 20:12:00 +02:00
Marek Olšák	3eacbc52d5	radeonsi: boolean -> bool, TRUE -> true, FALSE -> false Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Marek Olšák	70a25478fe	radeonsi: use u_blitter for mipmap generation This reduces time spent in glGenerateMipmap by a half. v2: don't decompress the levels to be overwritten Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-21 13:52:05 +02:00
Marek Olšák	4eea710b0d	radeonsi: try to hit direct hw MSAA resolve by changing micro mode in clear We could also do MSAA resolve in a compute shader like Vulkan and remove these workarounds. v2: comment the magic numbers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Marek Olšák	373060652c	radeonsi: clarify the MSAA resolve limitation with scanout this is the correct hw requirement Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Marek Olšák	95288277d5	Revert "radeonsi: allow direct hw MSAA resolve for scanout surfaces" This reverts commit `ffd54d1936`. No, it doesn't work. The test case is "glxgears -samples 2".	2016-06-08 19:21:55 +02:00
Marek Olšák	7c6e88b643	radeonsi: allow MSAA resolving into a texture that has DCC enabled Since DCC is enabled almost everywhere now, it's important not to disable this fast path. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	ffd54d1936	radeonsi: allow direct hw MSAA resolve for scanout surfaces No idea why this was disabled, but it works fine. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
Marek Olšák	4be46c7d9d	radeonsi: don't allocate DCC for the temporary MSAA resolve surface Allocating it has no effect, but it adds overhead (useless DCC clear). Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-06-08 00:22:45 +02:00
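
The scanout DCC heuristic described in commit 49e3c74cdd (PS invocations / (width * height) + num_slow_clears >= 5) can be illustrated with a small sketch. This is not the driver's code: the function name, parameter types, and the use of floating-point division are assumptions made for illustration, and the commit message does not say how the real implementation rounds or stores these values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not taken from Mesa): evaluates the threshold quoted
 * in the commit message,
 *   PS invocations / (width * height) + num_slow_clears >= 5
 * i.e. enable DCC once the equivalent of five fullscreen draws or slow
 * clears has been observed. Floating-point division is an assumption. */
bool should_enable_scanout_dcc(uint64_t ps_invocations,
                               uint32_t width, uint32_t height,
                               unsigned num_slow_clears)
{
    /* A zero-sized surface would make the ratio meaningless. */
    assert(width > 0 && height > 0);

    /* Number of "fullscreen draws" implied by the pixel shader count. */
    double fullscreen_draws =
        (double)ps_invocations / ((double)width * (double)height);

    return fullscreen_draws + (double)num_slow_clears >= 5.0;
}
```

Under these assumptions, a 1920x1080 surface with four fullscreen draws' worth of PS invocations plus one slow clear meets the threshold, while four fullscreen draws alone do not.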

134 Commits