KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	1a24f443b4	radeonsi: implement fast stencil clear Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-12-11 15:25:12 +01:00
Nicolai Hähnle	ad22006892	radeonsi: implement AMD_performance_monitor for CIK+ Expose most of the performance counter groups that are exposed by Catalyst. Ideally, the driver will work with GPUPerfStudio at some point, but we are not quite there yet. In any case, this is the reason for grouping multiple instances of hardware blocks in the way it is implemented. The counters can also be shown using the Gallium HUD. If one is interested to see how work is distributed across multiple shader engines, one can set the environment variable RADEON_PC_SEPARATE_SE=1 to obtain finer-grained performance counter groups. Part of the implementation is in radeon because an implementation for older hardware would largely follow along the same lines, but exposing a different set of blocks which are programmed slightly differently. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2015-11-25 15:52:09 +01:00
Marek Olšák	b1c5f3faa9	radeonsi: calculate optimal GS ring sizes to fix GS hangs on Tonga I discovered that increasing the ESGS ring size fixes GS hangs on Tonga, so let's do it properly. There is now a separate init_config_gs_rings state that is not immutable, because GS rings are resized when needed. This also saves some memory. Most apps won't need more than 1MB per ring per shader engine. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:42 +01:00
Marek Olšák	3d963abc81	radeonsi: prevent recursion in si_context_gfx_flush The recursion can only occur if you modify need_cs_space to always flush. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Marek Olšák	c6012a6650	radeonsi: rename cache flushing flags once more KCACHE, TC L1 and TC L2 are renamed to: - SMEM L1 - VMEM L1 - GLOBAL L2 You can easily tell what they are used for now. Shaders must deal with coherency issues between both L1s manually, e.g. by setting GLC=1 or by using s_dcache_*. BOTH_ICACHE_KCACHE was an unused definition. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2015-11-13 19:54:41 +01:00
Bas Nieuwenhuizen	48b5f104ac	radeonsi: Enable DCC. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-10-24 00:42:30 +02:00
Marek Olšák	06083046a4	radeonsi: add another requirement for PARTIAL_ES_WAVE Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	07b3cc6ecf	radeonsi: allow unbinding pixel shaders and remove the dummy shader Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-24 00:01:20 +02:00
Marek Olšák	9b54ce3362	radeonsi: support thread-safe shaders shared by multiple contexts The "current" shader pointer is moved from the CSO to the context, so that the CSO is mostly immutable. The only drawback is that the "current" pointer isn't saved when unbinding a shader and it must be looked up when the shader is bound again. This is also a prerequisite for multithreaded shader compilation. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-20 12:51:51 +02:00
Marek Olšák	13e69805ea	radeonsi: fix a GS hang on VI Broken by one of the cleanups: `0d46c3bc9d` Not applicable to stable. Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-10-07 19:18:50 +02:00
Marek Olšák	9652bfcf2d	radeonsi: implement the simple case of force_persample_interp Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	214de2d815	radeonsi: move SPI_PS_INPUT_ENA/ADDR registers to a separate state This will be a derived state used for changing center->sample and centroid->sample at runtime. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:09 +02:00
Marek Olšák	c23c92c965	radeonsi: only do depth-only or stencil-only in-place decompression instead of always doing both. Usually, only depth is needed, so stencil decompression is useless. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	cc92b90375	radeonsi: dump buffer lists while debugging Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:08 +02:00
Marek Olšák	9bd7928a35	radeonsi: add an option for debugging VM faults Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:07 +02:00
Marek Olšák	a9971e85d9	radeonsi: rework uploading border colors The border colors are uploaded only once when the state is created. This brings truly immutable sampler descriptors, because they don't have to be updated every time a sampler state is re-bound. It also moves the TA_BC_BASE_ADDR registers to init_config, removing one more state. The catch is there is now a limit: only 4096 border colors can be used by one context. I don't think that will be a problem. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	228e80123a	radeonsi: reorder si_context variables Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	28b34b474e	radeonsi: don't send IB dword usage to si_need_cs_space Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	ec9d5e181e	radeonsi: don't count IB space for states, just use an upper bound Since we don't put any resource descriptors in IBs, the space used by draw calls is quite small. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	fc95058add	radeonsi: convert SPI state to an atom Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:15 +02:00
Marek Olšák	45e549fcbc	radeonsi: convert CB_TARGET_MASK setup to an atom Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	e21418f221	radeonsi: convert stencil ref state into an atom Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	c44de30979	radeonsi: convert blend color state into an atom Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	74aa64876b	radeonsi: convert sample mask state into an atom Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	12b205341a	radeonsi: convert clip state into an atom Reducing calloc overhead. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	0c2eed0ede	radeonsi: avoid redundant CB and DB register updates The main idea is to avoid setting CB_COLORi_INFO = 0 for i>0 repeatedly when those colorbuffers aren't used. This is mainly for glamor. Same for DB. Z_INFO and STENCIL_INFO need to be cleared only once. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	c2a42d1f9f	radeonsi: don't rebind GSVS ring buffers every draw call using GS Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	a2c6ae07b4	radeonsi: remove the tf_ring state, add the registers to init_config One less state to worry about. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	0d46c3bc9d	radeonsi: remove the gs_rings state, add the registers to init_config Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	87c1e9e19c	radeonsi: use a bitmask for tracking dirty atoms This mainly removes the cache misses when checking the dirty flags. Not much else though. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:14 +02:00
Marek Olšák	ba7a6cf626	radeonsi: define the state atom array separately Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	8a97528b3a	radeonsi: optimize viewport states same as scissors Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	f6a10f60b7	radeonsi: optimize scissor states - convert 16 states to 1 atom - only emit 1 scissor if VIEWPORT_INDEX isn't written - use only one packet when emitting consecutive scissors Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Acked-by: Christian König <christian.koenig@amd.com>	2015-09-01 21:51:13 +02:00
Marek Olšák	2c14a6d3b1	radeonsi: add IB tracing support for debug contexts This adds trace points to all IBs and the parser prints them and also prints which trace points were reached (executed) by the CP. This can help pinpoint a problematic packet, draw call, etc. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	189953ee13	radeonsi: remove old CS tracing code Some of it is left there and it will be re-used in the next commit. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	be6dc87776	radeonsi: save the contents of indirect buffers for debug contexts This will be used by the IB parser. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:19 +02:00
Marek Olšák	110873ed11	radeonsi: add an initial dump_debug_state implementation dumping shaders This is usually called after a draw call. Acked-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2015-08-26 19:25:18 +02:00
Grazvydas Ignotas	3206d4ed44	gallium/radeon: use helper functions to mark atoms dirty This is analogous to r300_mark_atom_dirty() used by r300, and will be used by later patches. For common radeon code, appropriate helper is called through a function pointer. No functional changes. Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2015-08-11 14:46:53 +02:00
Marek Olšák	2d3ae154ba	radeonsi: move CP DMA functions to their own file Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-07-31 16:49:17 +02:00
Marek Olšák	b0528118df	radeonsi: completely rework updating descriptors without CP DMA The patch has a better explanation. Just a summary here: - The CPU always uploads a whole descriptor array to previously-unused memory. - CP DMA isn't used. - No caches need to be flushed. - All descriptors are always up-to-date in memory even after a hang, because CP DMA doesn't serve as a middle man to update them. This should bring: - better hang recovery (descriptors are always up-to-date) - better GPU performance (no KCACHE and TC flushes) - worse CPU performance for partial updates (only whole arrays are uploaded) - less used IB space (no CP_DMA and WRITE_DATA packets) - simpler code - hopefully, some of the corruption issues with SI cards will go away. If not, we'll know the issue is not here. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-07-31 16:49:16 +02:00
Marek Olšák	3344699243	radeonsi: set VGT_LS_HS_CONFIG for tessellation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-07-23 00:59:33 +02:00
Marek Olšák	74c1001d13	radeonsi: add derived tessellation state Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-07-23 00:59:33 +02:00
Marek Olšák	db267a04ce	radeonsi: implement a fixed-function tessellation control shader and its state Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-07-23 00:59:32 +02:00
Marek Olšák	b6f4fdf6a9	radeonsi: set up a ring buffer for tessellation factors Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-07-23 00:59:32 +02:00
Marek Olšák	59b3556f4c	radeonsi: program VGT_SHADER_STAGES_EN for tessellation Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-07-23 00:59:32 +02:00
Marek Olšák	d1f43a7e5b	radeonsi: add code for creating, binding and destroying tessellation shaders This doesn't do anything yet. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-07-23 00:59:31 +02:00
Marek Olšák	3ce91c727f	radeonsi: rework how shader pointers to descriptors are set This is mainly needed for tessellation where a VS can be bound as VS, ES, or LS, and TES (tess. evaluation shader) can be bound as VS or ES or neither. Therefore we need the ability to move pointers to descriptors between shaders arbitrarily. The idea is that the context has a mapping from PIPE_SHADER_x to SPI_SHADER_USER_DATA_x. After a shader is enabled or disabled, si_shader_change_notify should be called to update this mapping accordingly. There is a dirty flag for each shader pointer, but only one emit function for all pointers in the whole context, whose code and logic is separated from descriptors. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-07-23 00:59:31 +02:00
Ilia Mirkin	a2a1a5805f	gallium: replace INLINE with inline Generated by running: git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g' git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g' git checkout src/gallium/state_trackers/clover/Doxyfile and manual edits to src/gallium/include/pipe/p_compiler.h src/gallium/README.portability to remove mentions of the inline define. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Acked-by: Marek Olšák <marek.olsak@amd.com>	2015-07-21 17:52:16 -04:00
Marek Olšák	f1be3d8cdd	radeonsi: don't flush an empty IB if the only thing we need is a fence Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-07-05 15:08:59 +02:00
Michel Dänzer	56e38edc96	radeonsi: Add CIK SDMA support Based on the corresponding SI support. Same as that, this is currently only enabled for one-dimensional buffer copies due to issues with multi-dimensional SDMA copies. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>	2015-06-08 18:13:22 +09:00
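
Commit 87c1e9e19c above switches dirty-atom tracking to a bitmask. The following is a minimal sketch of that general technique, not the driver's actual code: the atom names, `struct context`, `mark_atom_dirty` and `emit_dirty_atoms` are hypothetical, and a counter stands in for emitting register packets. The point is that all dirty flags live in one word, so checking them touches a single cache line instead of one flag per state object.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical atom IDs; the real driver's atom list is not shown in this log. */
enum atom_id { ATOM_SCISSORS, ATOM_VIEWPORTS, ATOM_BLEND_COLOR, ATOM_COUNT };

struct context {
    uint32_t dirty_atoms;            /* bit i set => atom i must be re-emitted */
    unsigned emit_count[ATOM_COUNT]; /* stand-in for writing register packets */
};

/* Mark one atom dirty; marking it twice is harmless. */
static void mark_atom_dirty(struct context *ctx, enum atom_id id)
{
    assert(id < ATOM_COUNT);
    ctx->dirty_atoms |= UINT32_C(1) << id;
}

/* Visit only the atoms whose bit is set, then clear the whole mask. */
static void emit_dirty_atoms(struct context *ctx)
{
    uint32_t mask = ctx->dirty_atoms;

    while (mask) {
        unsigned i = 0;
        while (!(mask & (UINT32_C(1) << i)))
            i++;                     /* index of the lowest set bit */
        ctx->emit_count[i]++;        /* "emit" atom i */
        mask &= mask - 1;            /* clear the lowest set bit */
    }
    ctx->dirty_atoms = 0;
}
```

A state setter would call `mark_atom_dirty` when a value changes, and the draw path would call `emit_dirty_atoms` once per draw; an atom marked several times between draws is still emitted only once.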


124 Commits