KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Nicolai Hähnle	83a01cb498	winsys/amdgpu: start with smaller IBs, growing as necessary This avoids allocating giant IBs from the outset, especially for CE and DMA. Since we now limit max_dw only by the size that the buffer happens to be (which, due to the buffer cache, can be even larger than the rounded-up size we request), the new function amdgpu_ib_max_submit_dwords controls when we submit an IB. With this change, we effectively never flush prematurely due to the CE IB, after an initial warm-up phase. v2: - clean up buffer_size calculation Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	f80c6abb9e	winsys/amdgpu: add amdgpu_ib and amdgpu_cs_from_ib helper functions The latter function allows getting the containing amdgpu_cs from any IB (including non-main ones). Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:19 +02:00
Nicolai Hähnle	92d5d97b10	winsys/amdgpu: simplify interface of amdgpu_get_new_ib We'll want to have an amdgpu_cs pointer for future changes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-01 22:52:18 +02:00
Marek Olšák	53f33619a4	winsys/amdgpu: add back multithreaded command submission Ported from the initial amdgpu winsys from the private AMD branch. The thread creates the buffer list, submits IBs, and cleans up the submission context, which can also destroy buffers. 3-5% reduction in CPU overhead is expected for apps submitting a lot of IBs per frame. This is most visible with DMA IBs. v2: - use a semaphore instead of a busy loop in amdgpu_ws_queue_cs - add another amdgpu_cs_sync_flush call into amdgpu_bo_map Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-26 16:43:45 +02:00
Marek Olšák	7997b5f005	winsys/amdgpu: Add support for const IB. v2: Use the correct IB to update request (Bas Nieuwenhuizen) v3: Add preamble IB. (Bas Nieuwenhuizen) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-04-19 18:10:30 +02:00
Marek Olšák	e78170f388	winsys/amdgpu: split IB data into a new structure in preparation for CE Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-19 18:10:30 +02:00
Marek Olšák	f4b77c764a	gallium/radeon: move ring_type into winsyses Not used by drivers. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2016-04-19 18:10:30 +02:00
Bas Nieuwenhuizen	6373845d98	winsys/amdgpu: enlarge buffer_indices_hashlist Enlarge the buffer hashlist to prevent large numbers of misses due to adding more buffers than can be cached in the hashlist. The game I tested had CS's with up to 1500 buffers and the overhead of amdgpu_lookup_buffer for various sizes was: 4096 1.97% (new value) 2048 4.37% 1024 6.92% 512 9.47% (old value) (percentage of CPU usage in render thread as determined by perf) The time spent in amdgpu_add_buffer self is ~4.2% in all cases and for 4096 the time needed to clear the hashlist is still < 0.10%, so I am not expecting significant regressions. Signed-off-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2016-03-09 00:52:07 +01:00
Marek Olšák	1e05812fcd	winsys/amdgpu: don't use the "rws" abbreviation for amdgpu_winsys Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	6f48e2bee1	winsys/amdgpu: add winsys function cs_get_buffer_list For debugging. Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:07 +02:00
Marek Olšák	93641f4341	gallium/radeon: stop using "reloc" in a few places Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-10-03 22:06:07 +02:00
Marek Olšák	5fb0180592	winsys/amdgpu: fix the type of memory usage counters If the 32-bit types overflowed, the driver could submit an IB that uses much more memory than is available. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-08-19 12:03:01 +02:00
Marek Olšák	2eb067db0f	winsys/amdgpu: add a new winsys for the new kernel driver v2: - lots of changes according to Emil Velikov's comments - implemented radeon_winsys::read_registers v3: - a lot of new work, many of them adapt to libdrm interface changes Squashed patches: winsys/amdgpu: implement radeon_winsys context support winsys/amdgpu: add reference counting for contexts winsys/amdgpu: add userptr support winsys/amdgpu: allocate IBs like normal buffers winsys/amdgpu: add IBs to the buffer list, adapt to interface changes winsys/amdgpu: don't use KMS handles as reloc hash keys winsys/amdgpu: sync buffer accesses to different rings winsys/amdgpu: use dependencies instead of waiting for last fence v2 gallium/radeon: unify buffer_wait and buffer_is_busy in the winsys interface (amdgpu part) winsys/amdgpu: track fences per ring and be thread-safe winsys/amdgpu: simplify waiting on a variable in amdgpu_fence_wait gallium/radeon: allow the winsys to choose the IB size (amdgpu part) winsys/amdgpu: switch to new amdgpu_cs_query_fence_status interface winsys/amdgpu: handle fence and dependencies merge winsys/amdgpu follow libdrm change to move user fence into UMD winsys/amdgpu: use amdgpu_bo_va_op for va map/unmap v2 winsys/amdgpu: use the new tiling flags winsys/amdgpu: switch to new GTT_USWC definition winsys/amdgpu: expose amdgpu_cs_query_reset_state to drivers winsys/amdgpu: fix valgrind warnings winsys/amdgpu: don't use VRAM with APUs that don't have much of it winsys/amdgpu: require LLVM 3.6.1 for VI because of bug fixes there winsys/amdgpu: remove amdgpu_winsys::num_cpus winsys/amdgpu: align BO size to page size winsys/amdgpu: reduce BO cache timeout winsys/amdgpu: remove useless flushing and waiting in amdgpu_bo_set_tiling winsys/amdgpu: use amdgpu_device_handle as a unique device ID instead of fd winsys/amdgpu: use safer access to amdgpu_fence_wait::signalled winsys/amdgpu: allow maximum IB size of 4 MB winsys/amdgpu: add ip_instance into amdgpu_fence 
gallium/radeon: add RING_COMPUTE instead of RADEON_FLUSH_COMPUTE winsys/amdgpu: set the ring type at CS initilization winsys/amdgpu: query the GART page size from the kernel winsys/amdgpu: correctly wait for shared buffers to become idle winsys/amdgpu: set the amdgpu_cs_fence structure only once at fence creation winsys/amdgpu: add a specific error message for cs_submit -> -ENOMEM winsys/amdgpu: check num_active_ioctls before calling amdgpu_bo_wait_for_idle winsys/amdgpu: clear user fence BO after allocating it winsys/amdgpu: fix user fences winsys/amdgpu: make amdgpu_winsys_create public winsys/amdgpu: remove thread offloading winsys/amdgpu: flatten the amdgpu_cs_context structure and simplify more v4: require libdrm 2.4.63	2015-08-14 15:02:28 +02:00

63 Commits