KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Samuel Pitoiset	5d6a560a29	radv: do not use the availability bit for timestamp queries It's unnecessary because we can just check if the timestamp is to different to the default value when a pool is created or resetted. Instead of waiting for the availability bit to be 1, we have to emit a not equal WAIT_REG_MEM for checking if the timestamp is ready. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-09-28 09:08:03 +02:00
Bas Nieuwenhuizen	fbcd167314	radv: Add on-demand compilation of built-in shaders. In environments where we cannot cache, e.g. Android (no homedir), ChromeOS (readonly rootfs) or sandboxes (cannot open cache), the startup cost of creating a device in radv is rather high, due to compiling all possible built-in pipelines up front. This meant depending on the CPU a 1-4 sec cost of creating a Device. For CTS this cost is unacceptable, and likely for starting random apps too. So if there is no cache, with this patch radv will compile shaders on demand. Once there is a cache from the first run, even if incomplete, the driver knows that it can likely write the cache and precompiles everything. Note that I did not switch the buffer and itob/btoi compute pipelines to on-demand, since you cannot really do anything in Vulkan without them and there are only a few. This reduces the CTS runtime for the no caches scenario on my threadripper from 32 minutes to 8 minutes. Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-08-14 10:26:24 +02:00
Karol Herbst	cb65246ed2	nir: cleanup oversized arrays in nir_swizzle calls There are no fixed sized array arguments in C, those are simply pointers to unsized arrays and as the size is passed in anyway, just rely on that. where possible calls are replaced by nir_channel and nir_channels. Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Karol Herbst <kherbst@redhat.com>	2018-07-13 15:46:57 +02:00
Samuel Pitoiset	6248fbe5e4	radv: get rid of buffer object priorities We mostly use the same priority for all buffer objects, so I don't think that matter much. This should reduce CPU overhead a little bit. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 11:08:40 +02:00
Samuel Pitoiset	1f616a840e	radv: emit a dummy ZPASS_DONE to prevent GPU hangs on GFX9 A ZPASS_DONE or PIXEL_STAT_DUMP_EVENT (of the DB occlusion counters) must immediately precede every timestamp event to prevent a GPU hang on GFX9. Cc: 18.1 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-07-12 10:22:36 +02:00
Samuel Pitoiset	9c09e7d66e	radv: remove unused 'predicated' parameter from some functions It's always false. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2018-06-27 09:48:15 +02:00
Samuel Pitoiset	fa42fa1a60	radv: emit PIPELINESTAT_{START,STOP} events for pipeline stats queries Ported from RadeonSI. This appears to fix some random fails with: dEQP-VK.query_pool.statistics_query.* Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-26 18:23:16 +02:00
Samuel Pitoiset	41f6096c26	radv: use EOP_DATA_SEL_* instead of magic numbers Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-21 10:31:02 +02:00
Marek Olšák	6703fec58c	amd,radeonsi: rename radeon_winsys_cs -> radeon_cmdbuf Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-19 13:08:50 -04:00
Bas Nieuwenhuizen	38933c1151	radv: Add option to print errors even in optimized builds. Errors are not that common of a case so we can eat a slight perf hit in having to call a function and do a runtime check. In turn this makes debugging random errors happening for end users easier, because they don't have to have a debug build on hand. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-31 11:51:23 +02:00
Bas Nieuwenhuizen	62f50df7b7	radv: Fix multiview queries. This moves the extra queries to after the main query ended, instead of doing it after the begin and hence doing nesting. We also emit only (view count - 1) extra queries, as the main query is already there for the first view. This fixes the CTS occasionally getting stuck in dEQP-VK.multiview.queries* waiting on results. Fixes: `32b4f3c38d` "radv/query: handle multiview queries properly. (v3)" CC: 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-05-14 16:49:06 +02:00
Samuel Pitoiset	1d766b0196	radv: only disable out-of-order rast for perfect occlusion queries Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-05-02 10:33:22 +02:00
Samuel Pitoiset	7fe586f6fb	radv: only enable PERFECT_ZPASS_COUNTS for precision occlusion queries This unnecessary when the precision bit flag is not set, and this might hurt performance. The Vulkan explains that not setting VK_QUERY_CONTROL_PRECISE_BIT might be more efficient on some implementations. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-04-06 09:07:34 +02:00
Dave Airlie	032014ac01	radv/query: handle multiview timestamp queries. For each view bit we need to emit a timestamp query. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:14 +00:00
Dave Airlie	32b4f3c38d	radv/query: handle multiview queries properly. (v3) For multiview we need to emit a number of sequential queries depending on the view mask. This avoids dEQP-VK.multiview.queries.15 waiting forever on the CPU for query results that are never coming. We only really want to emit one query, and the rest should be blank (amdvlk does the same), so we emit begin/end pairs for all the others except the first query. v2: fix tests v3: split out patch. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:09 +00:00
Dave Airlie	4034dc5c72	radv/query: split out begin/end query emission This just splits out the begin/end query hw emissions, it makes it easier to add multiview support for queries. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-03-19 19:29:05 +00:00
Samuel Pitoiset	c27f5419f6	radv: only emit cache flushes when the pool size is large enough This is an optimization which reduces the number of flushes for small pool buffers. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:40 +01:00
Samuel Pitoiset	2fe07933bd	radv: keep track of the query pool size Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:53:39 +01:00
Samuel Pitoiset	c956d0f406	radv: make sure to emit cache flushes before starting a query If the query pool has been previously resetted using the compute shader path. Fixes: `a41e2e9cf5` ("radv: allow to use a compute shader for resetting the query pool") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105292 Cc: "18.0" <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-01 09:14:49 +01:00
Dave Airlie	ec1edd0fd2	radv: fix pipeline statistics end query on compute queue It's legal to a pipeline stat query on a compute queue, but we'd emit the wrong packet here. This should fix it to emit the correct packet. Noticed while inspecting the mpv hang. Fixes: `ad61eac250` (radv: factor out eop event writing code. (v2)) Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-12-28 19:31:01 +10:00
Samuel Pitoiset	bc92ed04ac	radv: do not add the query pool BO to the list in vkCmdEndQuery() As per the spec, the query identified by queryPool and query must currently be active. Applications have to call vkCmdBeginQuery() before, and thus the query pool BO will already be in the list. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-11-20 11:18:20 +01:00
Samuel Pitoiset	cd64a4f705	radv: use vk_error() everywhere an error is returned For consistency and it might help for debugging purposes. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-11-13 11:05:26 +01:00
Dave Airlie	25660499b6	radv: wrap cs_add_buffer in an inline. (v2) The next patch will try and avoid calling the indirect function. v2: add a missing conversion. Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-06 21:45:59 +00:00
Samuel Pitoiset	a41e2e9cf5	radv: allow to use a compute shader for resetting the query pool Serious Sam Fusion 2017 uses a huge number of occlusion queries, and the allocated query pool buffer is greater than 4096 bytes. This slightly improves performance (tested in Ultra) from 117.2 FPS to 119.7 FPS (~+2%) on my RX480. This also improves Talos, from 69 FPS to 72/73 FPS (~+5%). Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-27 13:47:06 +02:00
Dave Airlie	a639d40f13	radv: add support for local bos. (v3) This uses the new kernel interfaces for reduced cs overhead, We only set the local flag for memory allocations that don't have a dedicated allocation and ones that aren't imports. v2: add to all the internal buffer creation paths. v3: missed some command submission paths, handle 0/empty bo lists. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 23:59:28 +01:00
Marek Olšák	7b697c8b78	amd: move r600d_common.h into r600g Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-09 16:27:06 +02:00
Samuel Pitoiset	c8ea55ddda	radv: convert all COMPUTE operations to the RADV_META_SAVE_XXX flags Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-06 09:49:06 +02:00
Samuel Pitoiset	457306fa4c	radv: do not need to double zero-init the meta state structures Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-02 11:56:20 +02:00
Samuel Pitoiset	8860b39d94	radv: store the amount of saved constants in the compute state It's safer and more elegant. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-09-27 09:26:44 +02:00
Samuel Pitoiset	4b407a62c7	radv: fix saved compute state when doing statistics/occlusion queries We are pushing 16-bytes of constants, so we have to save/restore the same amount of data to avoid data corruption. Cc: 17.2 <mesa-stable@lists.freedesktop.org> Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-09-26 23:14:48 +02:00
Bas Nieuwenhuizen	d235ff6e8f	radv: Don't use a virtual function for getting the buffer virtual address. We are really not going to use a winsys which does not need to store the va, so might as well store it in a standard field. Not sure this helps perf much though, as most of the cost is in the cache miss accessing the bo anyway, which we stil need to do. Reviewed-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2017-09-20 22:04:25 +02:00
Dave Airlie	a6c2001ace	radv: add support for cmd predication. This doesn't get used yet, it just adds support to various PKT3 emissions to enable it later. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-07-06 02:06:49 +01:00
Bas Nieuwenhuizen	59c2e2a061	radv: Remove SI num RB override for occlusion queries. radeonsi doesn't have it anymore either. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Fixes: `f4e499ec79` "radv: add initial non-conformant radv vulkan driver" Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-06-06 23:23:43 +02:00
Nicolai Hähnle	388d36dfd1	radv: fewer than 8 RBs are possible This fixes the subsequent assertion on Bonaire. Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-06-05 10:43:47 +10:00
Dave Airlie	ad61eac250	radv: factor out eop event writing code. (v2) In prep for GFX9 refactor some of the eop event writing code out. This changes behaviour, but aligns with what radeonsi does, it does double emits on CIK/VI, whereas previously it only did this on CIK. v2: bump the size checks. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-02 12:48:56 +10:00
Dave Airlie	7205431e73	radv: factor out si_emit_wait_fence code. This code was in a few places, consolidate into one. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-06-02 12:48:20 +10:00
Grazvydas Ignotas	45ccb661d8	radv: always free nir shaders from modules on stack valgrind reports them as leaked, and I could not find anything making a copy of the nir pointer. Also, radv_device_init_meta_blit_color() is already freeing them unconditionally like this. Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-05-10 01:13:44 +03:00
Jason Ekstrand	b86dba8a0e	nir: Embed the shader_info in the nir_shader again Commit `e1af20f18a` changed the shader_info from being embedded into being just a pointer. The idea was that sharing the shader_info between NIR and GLSL would be easier if it were a pointer pointing to the same shader_info struct. This, however, has caused a few problems: 1) There are many things which generate NIR without GLSL. This means we have to support both NIR shaders which come from GLSL and ones that don't and need to have an info elsewhere. 2) The solution to (1) raises all sorts of ownership issues which have to be resolved with ralloc_parent checks. 3) Ever since `00620782c9`, we've been using nir_gather_info to fill out the final shader_info. Thanks to cloning and the above ownership issues, the nir_shader::info may not point back to the gl_shader anymore and so we have to do a copy of the shader_info from NIR back to GLSL anyway. All of these issues go away if we just embed the shader_info in the nir_shader. There's a little downside of having to copy it back after calling nir_gather_info but, as explained above, we have to do that anyway. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2017-05-09 15:07:47 -07:00
Dave Airlie	eb2a833679	radv: set base/ranges for push constant loads. This isn't necessary yet but I'd like to use the range in some future patches. [airlied: add new resolve pass] Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-05-08 08:56:36 +10:00
Bas Nieuwenhuizen	6681ab1f97	radv: Use correct stage for ready bit. Set the bit in the same stage as the timestamp, instead always at top of pipe. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Tested-by: Grazvydas Ignotas <notasas@gmail.com>	2017-05-02 00:54:44 +02:00
Bas Nieuwenhuizen	568aec29d9	radv: Add top of pipe timestamp queries. Does not fix brokenness with the ready bit. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-05-02 00:54:18 +02:00
Fredrik Höglund	5ab5d1bee4	radv: use push descriptors in meta Use push descriptors instead of temp descriptor sets. Signed-off-by: Fredrik Höglund <fredrik@kde.org> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-04-14 23:21:24 +02:00
Bas Nieuwenhuizen	b35b5951fc	radv: Stop shadowing the result in radv_GetQueryPoolResults. The outer result was referred to, which meant bugs. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-12 07:38:58 +02:00
Bas Nieuwenhuizen	0763453291	radv: Return VK_NOT_READY if the query results are not available. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Fixes: `8475a14302` ("radv: Implement pipeline statistics queries.") Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2017-04-12 07:38:58 +02:00
Bas Nieuwenhuizen	2dacb727c2	radv: Set query availability bit even if we don't wait. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Fixes: `8475a14302` ("radv: Implement pipeline statistics queries.") Reviewed-by: Fredrik Höglund <fredrik@kde.org>	2017-04-12 07:38:58 +02:00
Bas Nieuwenhuizen	8475a14302	radv: Implement pipeline statistics queries. The devil is in the shader again, otherwise this is fairly straightforward. The CTS contains no pipeline statistics copy to buffer testcases, so I did a basic smoketest. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	d2906bc72d	radv: Let count be dynamic in radv_break_on_count. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	8473193760	radv: Rename query pipeline/set layout. For using them with both occlusion and pipeline statistics queries. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	95743d5b88	radv: Use VK_WHOLE_SIZE for the query buffer bindings. The buffer sizes are specified just a few lines earlier, so don't repeat ourselves. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00
Bas Nieuwenhuizen	8911dd6d12	radv: Use a shader for occlusion CmdCopyQueryPoolResults. Use the new occlusion query copy shader. We don't use the shader for the waiting as a polling loop ineracts badly with having caching enabled. I noticed on my GPU (Tonga) that the values are written out in order, so I just use a WAIT_REG_MEM on the last value. If it turns out other chips don't do that we may need to look a bit more into this. Having 8 WAIT_REG_MEM packets per query doesn't sound ideal. This also restricts the availability word in the pool to timestamp queries only, as occlusion queries don't use it, and pipeline statistic queries likely won't either. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-04-11 09:33:17 +02:00

1 2

60 Commits