KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	0252fb92b8	radeonsi: add primitive culling stats to the HUD Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:36 -04:00
Marek Olšák	c9b7a37b8f	radeonsi: cull primitives with async compute for large draw calls Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:13:34 -04:00
Marek Olšák	07c83d25fd	radeonsi: add a cs parameter into si_cp_copy_data Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:57 -04:00
Marek Olšák	ce264d19a0	radeonsi: add a cs parameter into si_cp_release_mem Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:56 -04:00
Marek Olšák	9624855f13	radeonsi: add threadgroups_per_cu param into si_get_compute_resource_limits Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:54 -04:00
Marek Olšák	49a016ec5d	radeonsi: make si_initialize_compute reusable Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:51 -04:00
Marek Olšák	c44c6951d4	radeonsi: extract COMPUTE_RESOURCE_LIMITS code into a helper Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-05-16 13:06:49 -04:00
Marek Olšák	ccfcb9d818	ac: rename SI-CIK-VI to GFX6-GFX7-GFX8 Acked-by: Dave Airlie <airlied@redhat.com> We already use GFX9 and I don't want us to have confusing naming in the driver. GFXn naming is better from the driver perspective, because it's the real version of the gfx portion of the hw. Also, CIK means Bonaire-Kaveri-Kabini, it doesn't mean CI. It shouldn't confuse our SDMA, UVD, VCE etc. code much. Those have nothing to do with GFXn and they have their own version numbers.	2019-05-15 20:54:10 -04:00
Nicolai Hähnle	d814c21b1b	radeonsi: overhaul the vertex fetch fixup mechanism The overall goal is to support unaligned loads from vertex buffers natively on SI. In the unaligned case, we fall back to the general case implementation in ac_build_opencoded_load_format. Since this function is fully general, we will also use it going forward for cases requiring fully manual format conversions of dwords anyway. This requires a different encoding of the fix_fetch array, which will now contain the entire format information if a fixup is required. Having to check the alignment of vertex buffers is awkward. To keep the impact on the fast path minimal, the si_context will keep track of which vertex buffers are (not) at least dword-aligned, while the si_vertex_elements will note which vertex buffers have some (at most dword) alignment requirement. Vertex buffers should be dword-aligned most of the time, which allows a fast early-out in almost all cases. Add the radeonsi_vs_fetch_always_opencode configuration variable for testing purposes. Note that it can only be used reliably on LLVM >= 9, because support for byte and short load is required. v2: - add a missing check to si_bind_vertex_elements Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-05-13 17:07:23 +02:00
Marek Olšák	383f406591	radeonsi: remove dirty slot masks from scissor and viewport states All registers in the array need to be updated if any of them is changed. Only apps writing gl_ViewportIndex were affected by this bug.	2019-04-25 11:49:38 -04:00
Marek Olšák	440135e5a0	radeonsi/gfx9: rework the gfx9 scissor bug workaround (v2) Needed to track context rolls caused by streamout and ACQUIRE_MEM. ACQUIRE_MEM can occur outside of draw calls. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110355 v2: squashed patches and done more rework Cc: 19.0 <mesa-stable@lists.freedesktop.org>	2019-04-25 11:49:38 -04:00
Nicolai Hähnle	8bef4df196	radeonsi: add si_debug_options for convenient adding/removing of options Move the definition of radeonsi_clear_db_cache_before_clear there, as well as radeonsi_enable_nir. This removes the AMD_DEBUG=nir option. We currently still have two places for options: the driconf machinery and AMD_DEBUG/R600_DEBUG. If we are to have a single place for options, then the driconf machinery should be preferred since it's more flexible. The only downside of the driconf machinery was that adding new options was quite inconvenient. With this change, a simple boolean option can be added with a single line of code, same as for AMD_DEBUG. One technical limitation of this particular implementation is that while almost all driconf features are available, the translation machinery doesn't pick up the description strings for options added in si_debug_options. In practice, translations haven't been provided anyway, and this is intended for developer options, so I'm not too worried. It could always be added later if anybody really cares. v2: - use bool instead of uint8_t for options - si_debug_options.inc -> si_debug_options.h Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2019-04-25 12:31:02 +02:00
Marek Olšák	951d60f8cd	radeonsi: delay adding BOs at the beginning of IBs until the first draw so that bound compute shader resources won't be added when they are not needed and same for graphics. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:36:36 -04:00
Marek Olšák	09bb8c8557	radeonsi: add helper si_get_minimum_num_gfx_cs_dwords Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:36:34 -04:00
Marek Olšák	c59d238bb0	radeonsi: add si_cp_copy_data Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-04-23 11:36:33 -04:00
Marek Olšák	b58e5fb6f3	radeonsi: use CP DMA for the null const buffer clear on CIK This is a workaround for a thread deadlock that I have no idea why it occurs. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108879 Fixes: `9b331e462e` Acked-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2019-04-22 16:05:52 -04:00
Marek Olšák	1f21396431	radeonsi: add support for displayable DCC for multi-RB chips A compute shader is used to reorder DCC data from aligned to unaligned.	2019-04-04 09:53:24 -04:00
Marek Olšák	029bfa3d25	radeonsi: add ability to bind images as image buffers so that we can bind DCC (texture) as an image buffer.	2019-04-04 09:53:24 -04:00
Marek Olšák	fe3bfd7971	radeonsi/gfx9: add support for PIPE_ALIGNED=0 Needed by displayable DCC. We need to flush L2 after rendering if PIPE_ALIGNED=0 and DCC is enabled.	2019-04-04 09:53:24 -04:00
Marek Olšák	b9e02fe138	gallium: add pipe_grid_info::last_block The OpenMAX state tracker will use this. RadeonSI is adapted to use pipe_grid_info::last_block instead of its internal state. Acked-by: Leo Liu <leo.liu@amd.com>	2019-03-15 11:53:08 -04:00
Marek Olšák	a1378639ab	radeonsi: always use compute rings for clover on CI and newer (v2) initialize all non-compute context functions to NULL. v2: fix SI	2019-02-26 14:58:55 -05:00
Marek Olšák	edbd2c1ff5	radeonsi: use SDMA for uploading data through const_uploader v2: use tc.stream_uploader in si buffer_transfer_map if not called from the driver thread Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1) Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-02-20 21:04:29 -05:00
Marek Olšák	5068dec5de	radeonsi: clear allocator_zeroed_memory with SDMA so that it can be used in parallel IBs. This also removes the SO_FILLED_SIZE hack. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Marek Olšák	7d4c935654	radeonsi: initialize textures using DCC to black when possible Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2019-02-06 11:17:21 -05:00
Marek Olšák	a03ecbaeec	radeonsi: handle render_condition_enable in si_compute_clear_render_target	2019-02-04 18:46:25 -05:00
Sonny Jiang	984fd73515	radeonsi: use compute for clear_render_target when possible Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-02-04 18:46:25 -05:00
Marek Olšák	260ff57647	radeonsi: rename rbo, rbuffer to buf or buffer Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 13:34:01 -05:00
Marek Olšák	501ff90a95	radeonsi: rename r600_resource -> si_resource Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 13:32:18 -05:00
Marek Olšák	1cfbed7587	radeonsi: remove r600 from comments Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 12:26:45 -05:00
Sonny Jiang	1b25d340b7	radeonsi: use compute for resource_copy_region when possible v2: marek: fix snorm8 blits Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2019-01-22 12:24:35 -05:00
Jiang, Sonny	8daf5bb209	radeonsi: add compute_last_block to configure the partial block fields	2019-01-22 12:22:46 -05:00
Marek Olšák	4d5f8f39f3	radeonsi: move PKT3_WRITE_DATA generation into a helper function Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2019-01-22 12:14:26 -05:00
Marek Olšák	54bc87469a	radeonsi: make si_cp_wait_mem more configurable Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-01-02 15:01:54 -05:00
Marek Olšák	d28e208213	radeonsi: don't emit redundant PKT3_NUM_INSTANCES packets Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2019-01-02 15:01:50 -05:00
Nicolai Hähnle	e2b9329f17	radeonsi: move remaining perfcounter code into si_perfcounter.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:57 +01:00
Nicolai Hähnle	5c841a1b1e	radeonsi: rename SI_RESOURCE_FLAG_FORCE_TILING to clarify its purpose Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-12-19 12:01:39 +01:00
Marek Olšák	075fd5d8f2	radeonsi: add memory management stress tests for GDS Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-28 20:20:27 -05:00
Marek Olšák	d7a4fa91f0	radeonsi: allow si_cp_dma_clear_buffer to clear GDS from any IB Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-11-28 20:20:27 -05:00
Marek Olšák	9dc776f3f2	radeonsi: don't set the CB clear color registers for 0/1 clear colors on Raven2 and add has_dcc_constant_encode.	2018-11-09 14:55:04 -05:00
Marek Olšák	99835fff08	radeonsi/gfx9: set optimal OVERWRITE_COMBINER_WATERMARK	2018-10-30 16:03:02 -04:00
Marek Olšák	77bcbe712e	radeonsi: clamp point size to the limit This fixes dEQP-GLES2.functional.rasterization.limits.points. Broken by: `ea039f789d` Tested-by: Jakob Bornecrantz <jakob@collabora.com>	2018-10-18 16:08:56 -04:00
Marek Olšák	fcc70e4855	radeonsi: track context rolls better for the Vega scissor bug workaround We should get fewer context rolls with the SET_CONTEXT_REG optimization, but it would have been for nothing if the scissor state rolled the context anyway. Don't emit the scissor state if there is no context roll.	2018-10-16 17:23:25 -04:00
Marek Olšák	9b331e462e	radeonsi: use compute shaders for clear_buffer & copy_buffer Fast color clears should be much faster. Also, fast color clears on evicted buffers should be 200x faster on GFX8 and older.	2018-10-16 17:23:25 -04:00
Marek Olšák	ea039f789d	radeonsi: use higher subpixel precision (QUANT_MODE) for smaller viewports	2018-10-16 15:28:22 -04:00
Marek Olšák	41a6c3de1f	radeonsi: don't re-upload the sample position constant buffer repeatedly	2018-10-16 15:28:22 -04:00
Marek Olšák	fedc1fda30	radeonsi: save raster config in screen, add se_tile_repeat	2018-10-16 15:28:22 -04:00
Marek Olšák	67f02cf810	radeonsi: add GDS support to CP DMA	2018-10-16 15:28:22 -04:00
Marek Olšák	0d05581578	radeonsi: rename si_gfx_* functions to si_cp_* and write_event_eop -> release_mem	2018-10-16 15:28:22 -04:00
Marek Olšák	6e1cf6532d	radeonsi: make si_gfx_write_event_eop more configurable	2018-10-16 15:28:22 -04:00
Marek Olšák	203ef19f48	radeonsi: split si_copy_buffer compute and SDMA will be added into it. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	1119fe5c25	radeonsi: merge SI and CI dma_clear_buffer and remove the callback also use assertions for the requirements that offset and size are a multiple of 4. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-09-10 15:19:56 -04:00
Marek Olšák	93b8b987d0	radeonsi: add a thorough clear/copy_buffer benchmark	2018-08-29 15:31:42 -04:00
Marek Olšák	5914f5bd4a	radeonsi: let internal compute dispatches tune WAVES_PER_SH	2018-08-29 15:31:42 -04:00
Marek Olšák	c5442c1165	radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSI	2018-08-29 15:31:42 -04:00
Marek Olšák	c359880d8b	radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA for measuring SDMA performance	2018-08-29 15:31:42 -04:00
Marek Olšák	0c5429cc73	radeonsi: add flag L2_STREAM for minimal cache usage	2018-08-29 15:31:41 -04:00
Marek Olšák	df50099834	radeonsi: use radeon_info::name Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-08-14 21:20:31 -04:00
Marek Olšák	de8d5edbc4	radeonsi: split si_clear_buffer to remove enum si_method Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:21:12 -04:00
Marek Olšák	4de92f2abb	radeonsi: replace CP_DMA_USE_L2 with enum si_cache_policy Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:21:10 -04:00
Marek Olšák	ac72a6bd0b	radeonsi: move internal TGSI shaders into si_shaderlib_tgsi.c Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:31 -04:00
Marek Olšák	0ca8294ece	radeonsi: implement EXT_window_rectangles Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:19:02 -04:00
Marek Olšák	4bad50ded9	radeonsi: cosmetic changes	2018-08-04 03:10:30 -04:00
Darren Powell	726a48c94f	radeonsi: add new R600_DEBUG test "testclearbufperf" Signed-off-by: Darren Powell <darren.powell@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-08-02 16:09:22 -04:00
Marek Olšák	20dd75a926	radeonsi: use storage_samples instead of color_samples in most places and use pipe_resource::nr_storage_samples instead of r600_texture::num_color_samples. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-07-31 18:28:41 -04:00
Tom Stellard	0866edede0	radeonsi: Add debug option to enable LLVM GlobalISel (v2) R600_DEBUG=gisel will tell LLVM to use GlobalISel rather than SelectionDAG for instruction selection. v2: mareko: move the helper to src/amd/common Signed-off-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Tom Stellard <tstellar@redhat.com>	2018-07-23 20:23:48 -04:00
Dave Airlie	0eb65b4944	radeonsi: rename si_compiler -> ac_llvm_compiler As precursor to moving init to common code, just rename the struct and move it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:31:32 +10:00
Marek Olšák	bd963f8430	radeonsi: rename r600_transfer -> si_transfer Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	d4755ef389	radeonsi: remove redundant si_texture::cmask_size cmask_buffer and surface.cmask_size can replace its role. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	2a8d1039b6	radeonsi: inline struct r600_cmask_info Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	166250f4e5	radeonsi: move CMASK size computation into ac_surface Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	2d64a68c6f	radeonsi: rename r600_surface -> si_surface Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	218e133695	radeonsi: rename r600_memory_object -> si_memory_object Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	e5df04f13d	radeonsi: remove unused r600_memory_object::offset The real offset is passed through resource_from_memobj. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	7bd40dc2f2	radeonsi: clean up some #includes Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Grazvydas Ignotas	f966929805	radeonsi: add a debug flag to zero vram allocations This allows to avoid having to see garbage in Dying Light loading screen at least, which probably expects Windows/NV behavior of all allocations being zeroed by default. Analogous to radv flag with the same name. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-21 12:18:50 +03:00
Marek Olšák	1ba87f4438	radeonsi: rename r600_texture -> si_texture, rxxx -> xxx or sxxx Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-19 13:08:50 -04:00
Marek Olšák	6703fec58c	amd,radeonsi: rename radeon_winsys_cs -> radeon_cmdbuf Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-19 13:08:50 -04:00
Marek Olšák	dfeb61c5cf	radeonsi: ignore PIPE_RESOURCE_FLAG_MAP_COHERENT We treat coherent and non-coherent buffers the same. And move external_usage for better packing. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-19 12:52:28 -04:00
Marek Olšák	f3b3ee6974	radeonsi: micro-optimize prim checking and fix guardband with lines+adjacency Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:34 -04:00
Marek Olšák	73b0d10152	radeonsi: don't set VGT_LS_HS_CONFIG if it doesn't change Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:25 -04:00
Marek Olšák	28ee825e19	radeonsi: move VGT_GS_OUT_PRIM_TYPE into si_shader_gs same as amdvlk. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:23 -04:00
Sonny Jiang	43b0269ce3	radeonsi: emit_db_render_state packets optimization Remembering latest states of registers to eliminate redundant SET_CONTEXT_REG packets Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-06-07 23:26:25 -04:00
Timothy Arceri	03c370d2f1	radeonsi: fix possible truncation on renderer string Fixes truncation warning in gcc 8.1 Fixes: `8539c9bf31` ("gallium/radeon: add the kernel version into the renderer string") Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2018-06-08 10:07:55 +10:00
Marek Olšák	b936f9aa32	radeonsi: disable primitive binning for all blitter ops same as amdvlk. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:56 -04:00
Marek Olšák	a969f184cf	radeonsi: add an environment variable that forces EQAA for MSAA allocations This is for testing and experiments. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:37 -04:00
Marek Olšák	7ac4ef097d	radeonsi: add EQAA SC,DB,CB register programming Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:34 -04:00
Marek Olšák	9d00580e75	radeonsi: support creating EQAA color textures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:34:32 -04:00
Marek Olšák	835095973d	radeonsi: remove r600_fmask_info radeon_surf contains almost everything. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-10 18:26:33 -04:00
Marek Olšák	8b7358fe43	radeonsi: increase the number of compiler threads depending on the CPU The compiler queue was limited to 3 threads, so shader-db running on a 16-thread CPU would have a bottleneck on the 3-thread queue. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	797d673c9a	radeonsi: move passmgr into si_compiler Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	87eb597758	radeonsi: add struct si_compiler containing LLVMTargetMachineRef It will contain more variables. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	788d66553a	radeonsi: rename r600_texture::resource to buffer r600_resource could be renamed to si_buffer. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	6fadfc01c6	radeonsi: use r600_resource() typecast helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	de344209ad	radeonsi: inline 2 trivial state structures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	ccebcba893	radeonsi: remove si_atom::id Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	639b673fc3	radeonsi: don't use an indirect table for state atoms Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	9054799b39	radeonsi: rename r600_atom -> si_atom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	a8abbbb172	radeonsi: remove r600_pipe_common.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	c732d069b3	radeonsi: implement DCC fast clear swizzle constraints more accurately Reduce swizzle constraints to the ALPHA_IS_ON_MSB constraint and the clear value of 1. This significantly changes the DCC fast clear code, and fixes fast clear for RGB formats without alpha. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	1cc2e0cc6b	radeonsi: fully enable 2x DCC MSAA for array and non-array textures The clear code is exactly the same as for 1 sample buffers - just clear the whole thing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	60299e9abe	radeonsi: don't emit partial flushes for internal CS flushes only Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Marek Olšák	1b3199d14d	radeonsi: implement mechanism for IBs without partial flushes at the end (v6) (This patch doesn't enable the behavior. It will be enabled in a later commit.) Draw calls from multiple IBs can be executed in parallel. v2: do emit partial flushes on SI v3: invalidate all shader caches at the beginning of IBs v4: don't call si_emit_cache_flush in si_flush_gfx_cs if not needed, only do this for flushes invoked internally v5: empty IBs should wait for idle if the flush requires it v6: split the commit If we artificially limit the number of draw calls per IB to 5, we'll get a lot more IBs, leading to a lot more partial flushes. Let's see how the removal of partial flushes changes GPU utilization in that scenario: With partial flushes (time busy): CP: 99% SPI: 86% CB: 73% Without partial flushes (time busy): CP: 99% SPI: 93% CB: 81% Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Marek Olšák	918b798668	radeonsi: make sure CP DMA is idle at the end of IBs	2018-04-13 14:07:20 -04:00
Marek Olšák	9a1363427e	radeonsi: always prefetch later shaders after the draw packet so that the draw is started as soon as possible. v2: only prefetch the API VS and VBO descriptors Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-13 12:31:04 -04:00
Bas Vermeulen	be628e4749	radeonsi: correct si_vgt_param_key on big endian machines Using mesa OpenCL failed on a big endian PowerPC machine because si_vgt_param_key is using bitfields and a 32 bit int for an index into an array. Fix si_vgt_param_key to work correctly on both little endian and big endian machines. Signed-off-by: Bas Vermeulen <bas@daedalean.ai> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-04-09 13:42:30 -04:00
Marek Olšák	c7dd59b06d	radeonsi: fix a crash if ps_shader.cso is NULL in si_get_total_colormask	2018-04-05 15:53:52 -04:00
Marek Olšák	6a93441295	radeonsi: remove r600_common_context Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	321bd6c280	radeonsi: move r600_buffer_common.c and r600_texture.c into radeonsi Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	d58080b318	radeonsi: move r600_gpu_load.c to si_gpu_load.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	f7f4ba5306	radeonsi: move r600_query.c/h files to si_query.c/h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5777488406	radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	72e9e98076	radeonsi: move and rename R600_ERR out of r600_pipe_common.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	076afb4f0e	radeonsi: rename a few R600/r600_ -> SI_/si_ Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5f1cddde78	radeonsi: move definitions out of r600_pipe_common.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	a67ee02388	radeonsi: move functions out of and remove r600_pipe_common.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	90d12f1d77	radeonsi: rename r600 -> si in some places Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	50c7aa6756	radeonsi: use si_context instead of pipe_context in parameters pt3 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	e332ba61f4	radeonsi: use si_context instead of pipe_context in parameters pt2 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	c424f86180	radeonsi: use si_context instead of pipe_context in parameters pt1 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	4c5efc40f4	radeonsi: update copyrights Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	3069cb8b78	radeonsi: use r600_common_context less pt2 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	71d9028b7a	radeonsi: use r600_common_context less pt1 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	c0987d8adf	radeonsi: move saved_cs functions from r600_pipe_common.c to si_debug.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	37ef4765ff	radeonsi: move DMA CS functions from r600_pipe_common.c to si_dma_cs.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	19f550f1d2	radeonsi: move EOP event code from r600_pipe_common.c to si_fence.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	fc6a44e169	radeonsi: rename si_hw_context.c -> si_gfx_cs.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	42500d1dab	radeonsi: move si_destroy_saved_cs to si_debug.c Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	02a61e71a2	radeonsi: rename si_begin_new_cs -> si_begin_new_gfx_cs Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	fa09388704	radeonsi: rename si_need_cs_space -> si_need_gfx_cs_space Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	85e75b2da5	radeonsi: remove r600_pipe_common::blit_decompress_depth Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	e04389cc2a	radeonsi: remove r600_pipe_common::decompress_dcc Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	17e8f1608e	radeonsi: call CS flush functions directly whenever possible Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	0669dca9c0	radeonsi: skip DCC render feedback checking if color writes are disabled	2018-04-05 15:34:58 -04:00
Marek Olšák	2be6143032	radeonsi: implement GL_KHR_blend_equation_advanced MSAA is supported using sample shading. Layered rendering and all texture targets are also supported. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:25 -04:00
Marek Olšák	eb77961292	radeonsi: add R600_DEBUG=nofmask to disable MSAA compression For testing. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:20 -04:00
Marek Olšák	8d6e6b1d7c	radeonsi: don't use struct si_descriptors for vertex buffer descriptors VBO descriptor code will change a lot one day. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:00 +01:00
Marek Olšák	fca7dee9c6	radeonsi: put both tessellation rings into 1 buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	d2963d8b5f	radeonsi: move tessellation ring info into si_screen Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	3e1287caef	radeonsi/gfx9: make shader binaries use read-only memory Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-12-06 15:19:02 +01:00
Marek Olšák	950221f923	radeonsi: remove r600_common_screen Most files in gallium/radeon now include si_pipe.h. chip_class and family are now here: sscreen->info.family sscreen->info.chip_class Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	2208b760f3	radeonsi: move shader debug helpers out of r600_pipe_common.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	03e2adc990	radeonsi: move all get functions to si_get.c; disk_cache_create to si_pipe.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	1823bbbb1a	radeonsi: remove R600_CONTEXT_* flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	c63e225bff	radeonsi: remove some definitions and helpers from r600_pipe_common.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	b191e2d79d	radeonsi: move r600_test_dma.c into si_test_dma.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	7aa2366b70	radeonsi: move all clear() code into si_clear.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Nicolai Hähnle	1a6d9e087a	radeonsi: record and dump time of flush Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:01:04 +01:00
Nicolai Hähnle	609a230375	gallium/u_threaded: implement asynchronous flushes This requires out-of-band creation of fences, and will be signaled to the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag. v2: - remove an incorrect assertion - handle fence_server_sync for unsubmitted fences by relying on the improved cs_add_fence_dependency - only implement asynchronous flushes on amdgpu Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:42 +01:00
Nicolai Hähnle	78a4750d91	radeonsi: move fence functions to si_fence.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:42 +01:00
Nicolai Hähnle	dd7c273e87	radeonsi: move pipe debug callback to si_context Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:53:19 +01:00
Marek Olšák	33000e7c43	radeonsi: add si_screen::has_ls_vgpr_init_bug Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:40 +01:00
Marek Olšák	529cdce799	radeonsi: remove 'Authors:' comments It's inaccurate. Instead, see the copyright and use "git log" and "git blame" to know the authorship. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-02 18:19:03 +01:00
Samuel Pitoiset	dd79aa4ad3	radeonsi: update hack for HTILE corruption in ARK: Survival Evolved It appears that flushing the DB metadata is actually not sufficient since the driver uses the new VS blit shaders. This looks quite strange though, but it seems like we need to flush DB for fixing the corruption. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102955 Fixes: `69ccb9dae7` (radeonsi: use new VS blit shaders (VS inputs in SGPRs)) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-27 10:47:30 +02:00
Marek Olšák	0ecf9b90ef	radeonsi: import cayman_msaa.c from drivers/radeon Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-09 16:27:04 +02:00
Marek Olšák	65f2e33500	radeonsi: import r600_streamout from drivers/radeon Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-09 16:26:55 +02:00
Marek Olšák	13b6c1c031	radeonsi: minor cleanup of si_update_vs_writes_viewport_index Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-07 18:26:35 +02:00
Marek Olšák	69ccb9dae7	radeonsi: use new VS blit shaders (VS inputs in SGPRs) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-07 18:26:35 +02:00
Marek Olšák	6a8401a94e	radeonsi: add VS blit shader creation no users yet Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-07 18:26:35 +02:00
Marek Olšák	de810f8b84	radeonsi: remove wrappers si_decompress_xx_textures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-07 18:26:35 +02:00
Marek Olšák	c4d1a199f8	radeonsi: add a drirc workaround for HTILE corruption in ARK: Survival Evolved v2: use DB_META | PS_PARTIAL_FLUSH Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102955 Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2017-10-06 02:56:11 +02:00
Marek Olšák	15d918e46f	radeonsi: inline struct si_sampler_views Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-06 02:56:11 +02:00
Marek Olšák	23cdde5138	radeonsi: rename si_textures_info -> si_samplers, si_images_info -> si_images Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-06 02:56:11 +02:00
Nicolai Hähnle	63680471f9	radeonsi: remove si_context::{scissor_enabled,clip_halfz} They are just copies of the rasterizer state. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-02 15:07:45 +02:00
Nicolai Hähnle	12f3155e28	radeonsi: simplify the signature of si_update_vs_writes_viewport_index Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-02 15:07:45 +02:00
Nicolai Hähnle	7bbcb6ac6c	radeonsi: move current_rast_prim into si_context v2: rebase fixes Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-02 15:07:45 +02:00
Nicolai Hähnle	6b416ec3d6	radeonsi: move and rename scissor and viewport state and functions v2: change GET_MAX_SCISSOR to SI_MAX_SCISSOR Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-02 15:07:45 +02:00
Nicolai Hähnle	f86a112b07	radeonsi: move current_rast_prim to r600_common_context We'll use it in the scissors / clip / guardband state. v2: avoid a performance regression on r600 when applied to (pre-fork) stable branches Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-02 15:07:43 +02:00
Nicolai Hähnle	797dd12c7b	radeonsi: fix border color translation for integer textures This fixes the extremely unlikely case that an application uses 0x80000000 or 0x3f800000 as border color for an integer texture and helps in the also, but perhaps slightly less, unlikely case that 1 is used as a border color. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:45:08 +02:00
Nicolai Hähnle	4c56e07029	radeonsi: clamp depth comparison value only for fixed point formats The hardware usually does this automatically. However, we upgrade depth to Z32_FLOAT to enable TC-compatible HTILE, which means the hardware no longer clamps the comparison value for us. The only way to tell in the shader whether a clamp is required seems to be to communicate an additional bit in the descriptor table. While VI has some unused bits in the resource descriptor, those bits have unfortunately all been used in gfx9. So we use an unused bit in the sampler state instead. Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f and many other tests in dEQP-GLES3.functional.texture.shadow.* Fixes: `d4d9ec55c5` ("radeonsi: implement TC-compatible HTILE") Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:44:50 +02:00
Nicolai Hähnle	7a62f8621a	radeonsi: allow out-of-order rasterization in commutative blending cases We do not enable this by default for additive blending, since it slightly breaks OpenGL invariance guarantees due to non-determinism. Still, there may be some applications that can benefit from white-listing via the radeonsi_commutative_blend_add drirc setting without any real visible artifacts. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-18 11:25:20 +02:00
Nicolai Hähnle	8c56c45cd4	radeonsi: add drirc option "radeonsi_assume_no_z_fights" This option enables a performance optimization where typical non-blending draws with depth buffer may be rasterized out-of-order (on VI+, multi-SE chips). This optimization can lead to incorrect results when an application renders multiple objects with the same Z value at the same pixel, so we will never enable it by default. But there may be applications that could benefit from white-listing. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-18 11:25:19 +02:00
Nicolai Hähnle	aab134cfa5	radeonsi: enable out-of-order rasterization when possible on VI and GFX9 dGPUs This does not take commutative blending into account yet. R600_DEBUG=nooutoforder disables it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-18 11:25:19 +02:00
Nicolai Hähnle	6772452e4c	amd/common: remove has_ds_bpermute argument from ac_build_ddxy Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-18 11:25:18 +02:00
Nicolai Hähnle	45c5c44451	radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug When the HS wave is empty, the hardware writes the LS VGPRs starting at v0 instead of v2. Workaround by shifting them back into place when necessary. For simplicity, this is always done in the LS prolog. According to the hardware team, this will be fixed in future chips, so take that into account already. Note that this is not a bug fix, as the bug was already worked around by commit `166823bfd2` ("radeonsi/gfx9: add a temporary workaround for a tessellation driver bug"). This change merely replaces the workaround by one that should be better. v2: add workaround code to shader only when necessary v3: clarify the prefer_mono comment Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-06 10:02:49 +02:00
Nicolai Hähnle	34124e412f	radeonsi/gfx9: always flush DB metadata on framebuffer changes This fixes GL45-CTS.shader_image_load_store.basic-glsl-earlyFragTests. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-06 09:57:08 +02:00
Marek Olšák	c3ebac6890	radeonsi/gfx9: implement primitive binning This increases performance, but it was tuned for Raven, not Vega. We don't know yet how Vega will perform, hopefully not worse. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-05 12:09:02 +02:00
Marek Olšák	0797eea758	radeonsi/gfx9: don't use BREAK_BATCH and FLUSH_DFSM if DFSM is disabled Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-05 12:09:02 +02:00
Marek Olšák	7dec48b81e	radeonsi/gfx9: don't flush L2 metadata for DB if not needed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	aa64e24cb1	radeonsi/gfx9: don't flush L2 metadata for CB if not needed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	5b62eb237c	radeonsi/gfx9: don't flush TC L2 between rendering and texturing if not needed Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	113278ee79	radeonsi: remove Constant Engine support We have come to the conclusion that it doesn't improve performance. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:29:47 +02:00
Samuel Pitoiset	39a35eb0c1	radeonsi: try to re-use previously deleted bindless descriptor slots Currently, when the array is full it is resized but it can grow over and over because we don't try to re-use descriptor slots. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 11:34:37 +02:00
Samuel Pitoiset	c2dfa9b111	radeonsi: use slot indexes for bindless handles Using VRAM address as bindless handles is not a good idea because we have to use LLVMIntToPTr and the LLVM CSE pass can't optimize because it has no information about the pointer. Instead, use slot indexes like the existing descriptors. Note that we use fixed 16-dword slots for both samplers and images. This doesn't really matter because no real apps use image handles. This improves performance with DOW3 by +7%. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 11:34:29 +02:00
Nicolai Hähnle	420c438589	radeonsi: log draw and compute state into log context Also add missing trace emits and CS logging for compute launches. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 09:53:34 +02:00
Nicolai Hähnle	4c3f36ec6b	radeonsi: print saved CS to the log context Use the auto logger facility, so that CS chunks will be interleaved with other log info. v2: - fix some crashes when not using CE - fix skipping "previous" chunks of current (unflushed) IB - fix error handling in si_begin_cs_debug Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 09:53:14 +02:00
Marek Olšák	13aa8d3da9	radeonsi: don't use CLEAR_STATE on SI This fixes random hangs with Unigine Valley. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102201 Fixes: `064550238e` ("radeonsi: use CLEAR_STATE to initialize some registers") Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-18 15:59:22 +02:00
Marek Olšák	c093821cee	radeonsi: rename shader_userdata -> shader_pointers where appropriate Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-07 21:12:24 +02:00
Marek Olšák	e887c68bd2	radeonsi: add a separate dirty mask for prefetches so that we don't rely on si_pm4_state_enabled_and_changed, allowing us to move prefetches after draw calls. v2: clear the dirty mask after unbinding shaders Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2017-08-07 21:12:24 +02:00
Marek Olšák	58d062b87d	radeonsi: de-atomize L2 prefetch I'd like to be able to move the prefetch call site around. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-07 21:12:24 +02:00
Marek Olšák	1aeafb59e6	radeonsi: print CE IBs into ddebug reports Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-01 17:06:38 +02:00
Marek Olšák	f4d095cc65	radeonsi: update dirty_level_mask only when flushing or unbinding framebuffer This fixes corruption with bindless textures in Dawn Of War 3. The do_update_surf_dirtiness mechanism was complicated and dirty_level_mask was only updated after the first draw call. The problem is bindless textures are checked for decompression every draw call and we would only decompress after the first draw call. The solution is to set dirtiness after the last draw call to the framebuffer, so the (unconditional) decompression of bindless textures happens at the right time. Cc: 17.2 <mesa-stable@lists.freedesktop.org> Tested-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-07-28 16:34:24 +02:00
Marek Olšák	28c7fbbe0f	radeonsi: rely on CLEAR_STATE for clearing UCP and blend color registers Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-28 08:03:24 +02:00
Marek Olšák	ed2b3f5c81	radeonsi: decrease the number of compiler threads Cc: 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-26 19:53:26 +02:00
Marek Olšák	facfab28fe	radeonsi/gfx9: add workarounds to avoid VGPR indexing completely For inputs and outputs, indirect indexing is lowered by the GLSL compiler. For temporaries, use alloca and disable the "promote-alloca" pass. In the future, we could switch all codepaths to alloca permanently and just rely on the "promote-alloca" pass. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-17 10:50:39 -04:00
Marek Olšák	79bd1d4f8b	radeonsi/gfx9: keep reusing the same buffer/address for the gfx9 flush fence instead of using a monotonic suballocator v2: initialize the memory at context creation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-22 01:51:02 +02:00
Marek Olšák	2263610827	radeonsi: flush DB caches only when transitioning from DB to texturing Use the mechanism of si_decompress_textures, but instead of doing the actual decompression, just flag the DB cache flush there. This removes a lot of unnecessary DB cache flushes. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-06-22 01:51:02 +02:00
Samuel Pitoiset	f00e80e3f7	radeonsi: keep track of the sampler state for texture handles Needed for updating all resident texture descriptors when dirty_tex_counter changes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-20 10:14:52 +02:00
Samuel Pitoiset	6ff6863c32	radeonsi: reduce overhead for resident textures which need color decompression This is done by introducing a separate list. si_decompress_textures() is now 5x faster. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-18 14:10:38 +02:00
Samuel Pitoiset	06ed251c32	radeonsi: reduce overhead for resident textures which need depth decompression This is done by introducing a separate list. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-18 14:10:36 +02:00
Samuel Pitoiset	811756dfd0	radeonsi: upload new descriptors when resident buffers are invalidated When texture buffers are invalidated the addr in the resident descriptor has to be updated but we can't create a new descriptor because the resident handle has to be the same. Instead, use the WRITE_DATA packet which allows to update memory directly but graphics/compute have to be idle in case the GPU is reading the descriptor. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-06-14 10:04:36 +02:00

601 Commits