KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	965c6445ad	winsys/amdgpu,radeonsi: add HUD counters for how much memory is wasted by slabs Slabs always allocate the next power of two size from their pools. This wastes memory if the size is not a power of two. bo->base.size is overwritten because the default is the allocated power of two size, but we need the real size to compute the wasted size in amdgpu_bo_slab_destroy. entry_size is added to the hole in pb_slab_entry to hold the real entry size. Like other memory stats, no atomics are used. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8683>	2021-02-03 21:53:33 +00:00
Marek Olšák	2c61411f25	winsys/amdgpu: don't use debug_get_option_noop in a hot path Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7721>	2020-12-01 15:33:03 -05:00
Pierre-Eric Pelloux-Prayer	111a1b2e1c	winsys/amdgpu: make RADEON_ALL_BOS a debug only feature Improves performance in SPECviewperf13 snx. e.g.: test10 fps evolution: 235 -> 270. Extract from "pahole radeonsi_dri.so -C amdgpu_winsys_bo", before: struct amdgpu_winsys_bo { struct pb_buffer base; /* 0 32 */ union { struct { struct pb_cache_entry cache_entry; /* 32 56 */ /* XXX last struct has 4 bytes of padding */ /* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */ amdgpu_va_handle va_handle; /* 88 8 */ int map_count; /* 96 4 */ _Bool use_reusable_pool; /* 100 1 */ /* XXX 3 bytes hole, try to pack */ struct list_head global_list_item; /* 104 16 */ uint32_t kms_handle; /* 120 4 */ } real; [...] } u; /* 32 96 */ [...] /* size: 200, cachelines: 4, members: 15 */ }; After: struct amdgpu_winsys_bo { struct pb_buffer base; /* 0 32 */ union { struct { struct pb_cache_entry cache_entry; /* 32 56 */ /* XXX last struct has 4 bytes of padding */ /* --- cacheline 1 boundary (64 bytes) was 24 bytes ago --- */ amdgpu_va_handle va_handle; /* 88 8 */ int map_count; /* 96 4 */ _Bool use_reusable_pool; /* 100 1 */ /* XXX 3 bytes hole, try to pack */ uint32_t kms_handle; /* 104 4 */ } real; /* 32 80 */ } u; /* 32 80 */ /* --- cacheline 1 boundary (64 bytes) was 48 bytes ago --- */ [...] /* size: 184, cachelines: 3, members: 15 */ }; Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7532>	2020-11-19 12:44:40 +00:00
Pierre-Eric Pelloux-Prayer	90b98c0649	amd/tmz: move uses_secure_bos to radeon_winsys This allows to inline radeon_uses_secure_bos calls and reduce CPU overhead. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>	2020-09-24 14:51:16 +00:00
Pierre-Eric Pelloux-Prayer	1b0d660cbc	radeonsi/tmz: allow secure job if the app made a tmz allocation This commit makes TMZ always allowed instead of being either off or forced-on with AMD_DEBUG=tmz. With this change: - secure job can be used as soon as the application made a tmz allocation. Driver internal allocations are not enough to enable secure jobs (if tmz is supported and enabled by the kernel) - AMD_DEBUG=tmz forces all scanout/depth/stencil buffers to be allocated as TMZ. This is useful to test apps that don't explicitly support protected content. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6049>	2020-09-24 14:51:16 +00:00
Marek Olšák	4cf674c8f7	ac/surface: add a wrapper structure to hold ADDR_HANDLE and more things in the future. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5398>	2020-06-10 15:35:46 +00:00
Pierre-Eric Pelloux-Prayer	fe2a3b804b	amdgpu: add encrypted slabs support Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>	2020-05-11 10:25:53 +02:00
Pierre-Eric Pelloux-Prayer	977e19d5cf	amdgpu/radeon: add secure api Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4401>	2020-05-11 10:25:53 +02:00
Marek Olšák	502840855a	gallium/hash_table: turn it into a wrapper around util/hash_table Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3722>	2020-02-26 20:35:50 +00:00
Michel Dänzer	9f2bed49d4	winsys/amdgpu: Re-use amdgpu_screen_winsys when possible Namely, if os_same_file_description determined that the DRM file descriptor references the same file description. v2: * Adapt to amdgpu_winsys::sws_list_lock. v3: * Fix comparison of amdgpu_screen_winsys file descriptions, see https://gitlab.freedesktop.org/mesa/mesa/issues/2413 . * Lock amdgpu_winsys::sws_list_lock for traversing the sws_list in amdgpu_winsys_create. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3582>	2020-01-29 15:51:01 +00:00
Marek Olšák	0c154d9e2d	Revert "winsys/amdgpu: Re-use amdgpu_screen_winsys when possible" This reverts commit `b60f5cbc15`. This fixes dmesg errors and X freezes: [ 29.543096] amdgpu 0000:0c:00.0: No GEM object associated to handle 0x00000009, can't create framebuffer [ 29.543103] amdgpu 0000:0c:00.0: No GEM object associated to handle 0x00000009, can't create framebuffer	2020-01-27 17:48:42 -05:00
Michel Dänzer	b60f5cbc15	winsys/amdgpu: Re-use amdgpu_screen_winsys when possible Namely, if os_same_file_description determined that the DRM file descriptor references the same file description. v2: * Adapt to amdgpu_winsys::sws_list_lock. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>	2020-01-23 17:39:34 +01:00
Michel Dänzer	c6468f66c7	winsys/amdgpu: Only re-export KMS handles for different DRM FDs When the amdgpu_screen_winsys uses the same FD as the amdgpu_winsys (which is always the case for the first amdgpu_screen_winsys), we can just use bo->u.real.kms_handle. v2: * Also only create the kms_handles hash table if the amdgpu_screen_winsys fd is different from the amdgpu_winsys one. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>	2020-01-23 17:39:34 +01:00
Michel Dänzer	24075ac60f	winsys/amdgpu: Keep track of retrieved KMS handles using hash tables The assumption being that KMS handles are only retrieved for relatively few BOs, so hash tables should be efficient both in terms of performance and memory consumption. We use the address of struct amdgpu_winsys_bo as the key and its kms_handle field (the KMS handle valid for the DRM file descriptor passed to amdgpu_device_initialize) as the hash value. v2: * Add comment above amdgpu_screen_winsys::kms_handles (Pierre-Eric Pelloux-Prayer) v3: * Protect kms_handles hash table with amdgpu_winsys::sws_list_lock mutex. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>	2020-01-23 17:24:00 +01:00
Michel Dänzer	f4010a6da9	winsys/amdgpu: Keep a list of amdgpu_screen_winsyses in amdgpu_winsys v2: * Add dedicated mutex for the list. Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3202>	2020-01-23 17:23:32 +01:00
Michel Dänzer	cb446dc0fa	winsys/amdgpu: Add amdgpu_screen_winsys It extends pipe_screen / radeon_winsys and references amdgpu_winsys. Multiple amdgpu_screen_winsys instances may reference the same amdgpu_winsys instance, which corresponds to an amdgpu_device_handle. The purpose of amdgpu_screen_winsys is to keep a duplicate of the DRM file descriptor passed to amdgpu_winsys_create, which will be needed in the next change. v2: * Add comment in amdgpu_winsys_unref explaining why it always returns true (Marek Olšák) Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>	2019-07-03 09:19:07 +00:00
Nicolai Hähnle	776b911365	amd/addrlib: update Mesa's copy of addrlib Update to the internal master as of 2018-11-15. This has a lot of gratuitous whitespace change, but on the plus side it's built using the same tooling that's used for AMDVLK, which should help going forward.	2018-11-29 13:18:24 +01:00
Marek Olšák	5f9ccf827e	winsys/amdgpu: optimize slab allocation for 2 MB amdgpu page tables - the slab buffer size increased from 128 KB to 2 MB (PTE fragment size) - the max suballocated buffer size increased from 64 KB to 256 KB, this increases memory usage because it wastes memory - the number of suballocators increased from 1 to 3 and they are layered on top of each other to minimize unused space in slabs The final increase in memory usage is: DeusEx:MD: 1.8% DOTA 2: 1.75% DiRT Rally: 0.2% The kernel driver will also receive fewer buffers.	2018-11-28 20:20:27 -05:00
Marek Olšák	cf6835485c	radeonsi: generalize the slab allocator code to allow layered slab allocators There is no change in behavior. It just makes it easier to change the number of slab allocators.	2018-11-28 20:20:27 -05:00
Marek Olšák	51d6b163da	winsys/amdgpu: fix VDPAU interop by having one amdgpu_winsys_bo per BO (v2) Dependencies between rings are inserted correctly if a buffer is represented by only one unique amdgpu_winsys_bo instance. Use a hash table keyed by amdgpu_bo_handle to have exactly one amdgpu_winsys_bo per amdgpu_bo_handle. v2: return offset and stride properly Tested-by: Leo Liu <leo.liu@amd.com> Acked-by: Leo Liu <leo.liu@amd.com>	2018-07-18 11:56:28 -04:00
Grazvydas Ignotas	f966929805	radeonsi: add a debug flag to zero vram allocations This allows to avoid having to see garbage in Dying Light loading screen at least, which probably expects Windows/NV behavior of all allocations being zeroed by default. Analogous to radv flag with the same name. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-06-21 12:18:50 +03:00
Timothy Arceri	87f02ddfd1	amdgpu: use simple mtx Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-09 12:07:48 +11:00
Andrey Grodzovsky	19fc3cdcfb	winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx. Fixes reverted patch `f03b7c9` by doing VMID reservation per process and not per context. Also updates required amdgpu libdrm version since the change involved interface updates in amdgpu libdrm. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-11-03 18:06:17 +01:00
Marek Olšák	529cdce799	radeonsi: remove 'Authors:' comments It's inaccurate. Instead, see the copyright and use "git log" and "git blame" to know the authorship. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-02 18:19:03 +01:00
Marek Olšák	1f2640bfa9	Revert "winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx." This reverts commit `f03b7c9ad9`. The libdrm interface is wrong.	2017-11-01 21:42:31 +01:00
Andrey Grodzovsky	f03b7c9ad9	winsys/amdgpu: Add R600_DEBUG flag to reserve VMID per ctx. Signed-off-by: Andrey Grodzovsky <andrey.grodzovsky@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2017-10-31 16:55:24 +01:00
Marek Olšák	0aafedbbb2	radeonsi: add GFX-IB-size query to the HUD It shows the sum of all IBs per frame. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Samuel Pitoiset	0d9117b7bd	winsys/amdgpu: add BO to the global list only when RADEON_ALL_BOS is set Only useful when that debug option is enabled. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-30 09:33:59 +02:00
Marek Olšák	4a758a17da	winsys/amdgpu: enable computation of tile swizzle Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-04 02:10:04 +02:00
Marek Olšák	0591df025b	winsys/amdgpu: use 128KB BOs for suballocations of up to 64KB BOs This decreases the number of BOs, but might also increase memory usage. It's better for small textures. The gameplay is on the far right: https://people.freedesktop.org/~mareko/suballoc.svg Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-04 15:40:37 +02:00
Marek Olšák	5b373629fc	radeonsi: add a HUD query for getting an average GFX BO list size Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-07-04 15:40:37 +02:00
Nicolai Hähnle	f187a49322	ac/radeonsi: move amdgpu_addr_create to ac_surface v2: - update Android.common.mk (Emil) - rebase on top of Raven support Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)	2017-05-18 11:48:51 +02:00
Timothy Arceri	2efddc63ee	gallium/util: replace pipe_mutex with mtx_t pipe_mutex was made unnecessary with `fd33a6bcd7`. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-03-07 08:48:11 +11:00
Samuel Pitoiset	cff199ceb7	gallium/radeon: add a new HUD query for the number of mapped buffers Useful when debugging applications which map a ton of buffers and also because we used to run into Linux's limit on the number of simultaneous mmap() calls. v2: - update the commit message Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-25 15:19:21 +01:00
Marek Olšák	1840800860	winsys/amdgpu: report a rejected IB as a lost context Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-23 23:43:38 +01:00
Marek Olšák	2b621c47aa	gallium/radeon: add new HUD query num-SDMA-IBs Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	6b8a371e00	gallium/radeon: rename the num-ctx-flushes query to num-GFX-IBs Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Nicolai Hähnle	ffa1c669dd	winsys/amdgpu: enable buffer allocation from slabs Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-09-27 16:45:23 +02:00
Marek Olšák	1e04483c22	winsys/amdgpu: track the amount of mapped memory Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-10 01:11:10 +02:00
Nicolai Hähnle	49c0b4a0db	winsys/amdgpu: add guard pages when R600_DEBUG=check_vm is enabled This should help flush out GPU VM faults. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Marek Olšák	562cb03d76	gallium/util: import the multithreaded job queue from amdgpu winsys (v2) v2: rename the event to util_queue_fence Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-15 21:07:34 +02:00
Marek Olšák	53f33619a4	winsys/amdgpu: add back multithreaded command submission Ported from the initial amdgpu winsys from the private AMD branch. The thread creates the buffer list, submits IBs, and cleans up the submission context, which can also destroy buffers. 3-5% reduction in CPU overhead is expected for apps submitting a lot of IBs per frame. This is most visible with DMA IBs. v2: use a semaphore instead of a busy loop in amdgpu_ws_queue_cs add another amdgpu_cs_sync_flush call into amdgpu_bo_map Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-26 16:43:45 +02:00
Marek Olšák	9d8c283f28	winsys/amdgpu: move gart_page_size to struct radeon_winsys Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-05-10 17:20:09 +02:00
Marek Olšák	e707b9d8ba	winsys/amdgpu: optionally use buffer lists with all allocated buffers Set RADEON_ALL_BOS=1 to use it. Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-01-23 17:01:54 +01:00
Marek Olšák	6f4e74d165	winsys/amdgpu: use pb_cache instead of pb_cache_manager This is a prerequisite for the removal of radeon_winsys_cs_handle. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <eocallaghan@alterapraxis.com> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>	2015-12-11 15:25:12 +01:00
Marek Olšák	e7fc664b91	winsys/amdgpu: add addrlib - texture addressing and alignment calculator This is an internal project that Catalyst uses and now open source will do too. v2: squashed these commits in: - winsys/amdgpu: fix warnings in addrlib - winsys/amdgpu: set PIPE_CONFIG and NUM_BANKS in tiling_flags	2015-08-14 15:02:28 +02:00
Marek Olšák	2eb067db0f	winsys/amdgpu: add a new winsys for the new kernel driver v2: - lots of changes according to Emil Velikov's comments - implemented radeon_winsys::read_registers v3: - a lot of new work, many of them adapt to libdrm interface changes Squashed patches: winsys/amdgpu: implement radeon_winsys context support winsys/amdgpu: add reference counting for contexts winsys/amdgpu: add userptr support winsys/amdgpu: allocate IBs like normal buffers winsys/amdgpu: add IBs to the buffer list, adapt to interface changes winsys/amdgpu: don't use KMS handles as reloc hash keys winsys/amdgpu: sync buffer accesses to different rings winsys/amdgpu: use dependencies instead of waiting for last fence v2 gallium/radeon: unify buffer_wait and buffer_is_busy in the winsys interface (amdgpu part) winsys/amdgpu: track fences per ring and be thread-safe winsys/amdgpu: simplify waiting on a variable in amdgpu_fence_wait gallium/radeon: allow the winsys to choose the IB size (amdgpu part) winsys/amdgpu: switch to new amdgpu_cs_query_fence_status interface winsys/amdgpu: handle fence and dependencies merge winsys/amdgpu follow libdrm change to move user fence into UMD winsys/amdgpu: use amdgpu_bo_va_op for va map/unmap v2 winsys/amdgpu: use the new tiling flags winsys/amdgpu: switch to new GTT_USWC definition winsys/amdgpu: expose amdgpu_cs_query_reset_state to drivers winsys/amdgpu: fix valgrind warnings winsys/amdgpu: don't use VRAM with APUs that don't have much of it winsys/amdgpu: require LLVM 3.6.1 for VI because of bug fixes there winsys/amdgpu: remove amdgpu_winsys::num_cpus winsys/amdgpu: align BO size to page size winsys/amdgpu: reduce BO cache timeout winsys/amdgpu: remove useless flushing and waiting in amdgpu_bo_set_tiling winsys/amdgpu: use amdgpu_device_handle as a unique device ID instead of fd winsys/amdgpu: use safer access to amdgpu_fence_wait::signalled winsys/amdgpu: allow maximum IB size of 4 MB winsys/amdgpu: add ip_instance into amdgpu_fence gallium/radeon: add RING_COMPUTE instead of RADEON_FLUSH_COMPUTE winsys/amdgpu: set the ring type at CS initialization winsys/amdgpu: query the GART page size from the kernel winsys/amdgpu: correctly wait for shared buffers to become idle winsys/amdgpu: set the amdgpu_cs_fence structure only once at fence creation winsys/amdgpu: add a specific error message for cs_submit -> -ENOMEM winsys/amdgpu: check num_active_ioctls before calling amdgpu_bo_wait_for_idle winsys/amdgpu: clear user fence BO after allocating it winsys/amdgpu: fix user fences winsys/amdgpu: make amdgpu_winsys_create public winsys/amdgpu: remove thread offloading winsys/amdgpu: flatten the amdgpu_cs_context structure and simplify more v4: require libdrm 2.4.63	2015-08-14 15:02:28 +02:00

47 Commits
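
Note on commit 965c6445ad: the message says slabs always hand out the next power-of-two size, so any request that is not itself a power of two wastes the difference. The sketch below only illustrates that arithmetic. It is not the Mesa implementation; the helper names next_pow2_u64 and slab_wasted_bytes are hypothetical and do not appear in the winsys code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a Mesa function): round size up to the
 * next power of two. Assumes 0 < size <= 2^63 so the shift cannot
 * overflow. */
static uint64_t next_pow2_u64(uint64_t size)
{
    assert(size != 0 && size <= ((uint64_t)1 << 63));

    uint64_t p = 1;
    while (p < size)
        p <<= 1;
    return p;
}

/* Hypothetical helper: bytes lost when a request of real_size bytes
 * is served from a power-of-two slab entry, as the commit message
 * describes. */
static uint64_t slab_wasted_bytes(uint64_t real_size)
{
    return next_pow2_u64(real_size) - real_size;
}
```

For example, a 3000-byte request would occupy a 4096-byte entry and waste 1096 bytes, while a 4096-byte request wastes nothing. This is presumably why the commit keeps the real size around: the waste can only be subtracted again at destroy time if the unrounded size is still known.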