KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Nicolai Hähnle	125a915052	radeonsi: implement PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE v2: use uncached system memory for the fence, and use the CPU to clear it so we never read garbage when checking the fence Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:55 +01:00
Nicolai Hähnle	e4627ac8fb	radeonsi: document some subtle details of fence_finish & fence_server_sync v2: remove the change to si_fence_server_sync, we'll handle that more robustly Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:50 +01:00
Nicolai Hähnle	609a230375	gallium/u_threaded: implement asynchronous flushes This requires out-of-band creation of fences, and will be signaled to the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag. v2: - remove an incorrect assertion - handle fence_server_sync for unsubmitted fences by relying on the improved cs_add_fence_dependency - only implement asynchronous flushes on amdgpu Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:42 +01:00
Nicolai Hähnle	78a4750d91	radeonsi: move fence functions to si_fence.c Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 14:00:42 +01:00
Nicolai Hähnle	f76a6cb337	radeonsi: always use async compiles when creating shader/compute states With Gallium threaded contexts, creating shader/compute states is effectively a screen operation, so we should not use context state. In particular, this allows us to avoid using the context's LLVM TargetMachine. This isn't an issue yet because u_threaded_context filters out non-async debug callbacks, and we disable threaded contexts for debug contexts. However, we may want to change that in the future. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:53:20 +01:00
Nicolai Hähnle	b650fc09c3	radeonsi: fix potential use-after-free of debug callbacks Found by inspection. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:53:20 +01:00
Nicolai Hähnle	dd7c273e87	radeonsi: move pipe debug callback to si_context Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:53:19 +01:00
Nicolai Hähnle	0f54ee6072	radeonsi: reduce the scope of sel->mutex in si_shader_select_with_key We only need the lock to guard changes in the variant linked list. The actual compilation can happen outside the lock, since we use the ready fence as a guard. v2: fix double-unlock Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:51 +01:00
Nicolai Hähnle	4f493c79ee	radeonsi: use ready fences on all shaders, not just optimized ones There's a race condition between si_shader_select_with_key and si_bind_XX_shader: Thread 1: si_shader_select_with_key (begin compiling the first variant, guarded by sel->mutex). Thread 2: si_bind_XX_shader (select first_variant by default as state->current), then si_shader_select_with_key (match state->current and early-out). Since thread 2 never takes sel->mutex, it may go on rendering without a PM4 for that shader, for example. The solution taken by this patch is to broaden the scope of shader->optimized_ready to a fence shader->ready that applies to all shaders. This does not hurt the fast path (if anything it makes it faster, because we don't explicitly check is_optimized). It will also allow reducing the scope of sel->mutex locks, but this is deferred to a later commit for better bisectability. Fixes dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:51 +01:00
Marek Olšák	33000e7c43	radeonsi: add si_screen::has_ls_vgpr_init_bug Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:40 +01:00
Marek Olšák	cde664ab81	radeonsi: use ac_create_target_machine Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:38 +01:00
Marek Olšák	81f81fdb54	radeonsi: use ac_get_llvm_processor_name Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:36 +01:00
Marek Olšák	c29f5fe41c	radeonsi/gfx9: don't set gs_table_depth Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:58:33 +01:00
Marek Olšák	e616743dab	radeonsi/gfx9: limit the scissor bug workaround to Vega10 and Raven only Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-07 17:26:36 +01:00
Marek Olšák	3f58988b81	radeonsi: enable signed vertex buffer offsets	2017-11-06 19:09:12 +01:00
Marek Olšák	24d6318d24	gallium: add PIPE_CAP_SIGNED_VERTEX_BUFFER_OFFSET	2017-11-06 19:09:12 +01:00
Timothy Arceri	439a2febc4	ac/radeonsi: add support for tex instr without a dereference These are produced by nir_lower_bitmap(). Adding the missing dereference would cause other issues that need to be hacked around such as skipping sampler lowering and uniform location assignment, so this change seems the correct way to go. Fixes 194 piglit crashes on radeonsi using NIR. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-03 14:19:51 +11:00
Marek Olšák	529cdce799	radeonsi: remove 'Authors:' comments It's inaccurate. Instead, see the copyright and use "git log" and "git blame" to know the authorship. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-02 18:19:03 +01:00
Dave Airlie	d3fdd66401	gallium: add cap for driver specified max combined shader resources. Some hw (evergreen) has a limit on how many combined (images/buffers/mrts) a fragment shader can access. Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-11-01 10:07:03 +10:00
Timothy Arceri	e80bbd6f52	radeonsi: fix culldist_writemask in nir path The shared si_create_shader_selector() code already offsets the mask. Fixes the following piglit tests: arb_cull_distance/clip-cull-3.shader_test arb_cull_distance/clip-cull-4.shader_test Fixes: `29d7bdd179` (radeonsi: scan NIR shaders to obtain required info) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-01 09:41:11 +11:00
Samuel Pitoiset	dd79aa4ad3	radeonsi: update hack for HTILE corruption in ARK: Survival Evolved It appears that flushing the DB metadata is actually not sufficient since the driver uses the new VS blit shaders. This looks quite strange though, but it seems like we need to flush DB for fixing the corruption. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102955 Fixes: `69ccb9dae7` (radeonsi: use new VS blit shaders (VS inputs in SGPRs) Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-27 10:47:30 +02:00
Marek Olšák	3f8e3c2bd8	radeonsi: add a workaround for weird s_buffer_load_dword behavior on SI See my LLVM patch which fixes the root cause. Users have to apply this patch and then they have 2 choices: - Downgrade to LLVM 5.0 - Update to LLVM git after my LLVM patch is pushed. It won't be possible to use current and earlier development version of LLVM 6.0. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Cc: 17.3 <mesa-stable@lists.freedesktop.org>	2017-10-26 16:44:01 +02:00
Dave Airlie	82d47b9d38	ac/llvm: consolidate find lsb function. This was the same between si and ac. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:31 +10:00
Dave Airlie	f925f5b074	ac/nir: move lds declaration/load/store into shared code. This was duplicated between both drivers, share here. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-26 15:59:11 +10:00
Marek Olšák	2a414c3961	radeonsi: postponed KILL isn't postponed anymore, but maintains WQM This restores performance for the drirc workaround, i.e. KILL_IF does: visible = src0 >= 0; kill_flag &= visible; // accumulate kills amdgcn_kill(wqm_vote(visible)); // kill fully dead quads only And all helper pixels are killed at the end of the shader: amdgcn_kill(kill_flag); Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Marek Olšák	da0083f123	radeonsi: use postponed KILL only when derivatives are used Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Marek Olšák	1ff9e27cbd	ac: replace ac_build_kill with ac_build_kill_if_false This will be a new LLVM intrinsic and will also work nicely with llvm.amdgcn.wqm.vote. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Bas Nieuwenhuizen	a548b727a1	ac/nir: Only clamp shadow reference on radeonsi. Vulkan CTS does not expect the value to be clamped (at least for D32), and it makes a difference even though depth is in [0,1], due to strict inequalities. I couldn't find anything in the Vulkan spec about this, but the test seemed to be copied from GL tests and the GL spec only specifies clamping for fixed point formats. Hence I expect radeonsi to run into this at some point as well, but given that they still have a use case with the Z16->Z32 promotion, I'll leave that for someone else to clean up. This at least fixes radv dEQP-VK.texture.shadow.* on VI. Fixes: `0f9e32519b` 'ac/nir: clamp shadow texture comparison value on VI' Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-10-23 09:13:38 +02:00
Andres Rodriguez	557de3b9ae	radeonsi: hardcode shader WAVE_LIMIT to the maximum value This is part of a cooperative scheduling approach used by radv. All drivers in the stack must opt-in to resource arbitration, otherwise GL based apps will be able to ignore system priorities. We always hardcode the field to its maximum value, instead of attempting to calculate an approximate usage. In testing, there were no benefits to using anything other than the maximum. Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-10-21 01:01:44 +02:00
Jason Ekstrand	59fb59ad54	nir: Get rid of nir_shader::stage It's redundant with nir_shader::info::stage. Acked-by: Timothy Arceri <tarceri@itsqueeze.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2017-10-20 12:49:17 -07:00
Marek Olšák	2f4705afde	radeonsi: if there's just const buffer 0, set it in place of CONST/SSBO pointer SI_SGPR_CONST_AND_SHADER_BUFFERS now contains the pointer to const buffer 0 if there is no other buffer there. Benefits: - there is no constbuf descriptor upload and shader load It's assumed that all constant addresses are within bounds. Non-constant addresses are clamped against the last declared CONST variable. This only works if the state tracker ensures the bound constant buffer matches what the shader needs. Once we get 32-bit pointers, we can only do this for user constant buffers where the driver is in charge of the upload so that it can guarantee a 32-bit address. The real performance benefit might not be measurable. These apps get 100% theoretical benefit in all shaders (except where noted): - antichamber - barman arkham origins - borderlands 2 - borderlands pre-sequel - brutal legend - civilization BE - CS:GO - deadcore - dota 2 -- most shaders - europa universalis - grid autosport -- most shaders - left 4 dead 2 - legend of grimrock - life is strange - payday 2 - portal - rocket league - serious sam 3 bfe - talos principle - team fortress 2 - thea - unigine heaven - unigine valley -- also sanctuary and tropics - wasteland 2 - xcom: enemy unknown & enemy within - tesseract - unity (engine) Changed stats only: SGPRS: 2059998 -> 2086238 (1.27 %) VGPRS: 1626888 -> 1626904 (0.00 %) Spilled SGPRs: 7902 -> 7865 (-0.47 %) Code Size: 60924520 -> 60982660 (0.10 %) bytes Max Waves: 374539 -> 374526 (-0.00 %) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	854593b8eb	ac: clean up ac_build_indexed_load function interfaces Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	cdb21dfffa	radeonsi: handle 64-bit loads earlier in fetch_constant Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	ee0e1a47ce	radeonsi: add si_descriptors::gpu_address and remove buffer_offset This allows us to change the pointer arbitrarily. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	6d2664880c	radeonsi: unify code for extracting a buffer address from a descriptor Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	8d2685d129	radeonsi: remove atom parameter from si_upload_descriptors Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	4ddce1b1a4	radeonsi: pack si_descriptors better again Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	859eeffb3d	radeonsi: emit dirty consecutive pointers in one SET_SH_REG packet IB size: -1.6% Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	36626ffe46	radeonsi: split si_emit_shader_pointer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	69325fa88d	radeonsi: generalize the SI_VS_SHADER_POINTER_MASK macro Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	79c2e7388c	radeonsi/gfx9: use SPI_SHADER_USER_DATA_COMMON IB size: -0.4% Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Marek Olšák	b762a08896	radeonsi/gfx9: move RW_BUFFERS from s[0:1] to s[8:9] for HS and GS Let's use the same user data SGPRs in all stages. (for SPI_SHADER_USER_DATA_COMMON_0) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-17 22:03:03 +02:00
Dylan Baker	b154b44ae3	meson: build radeonsi gallium driver This hooks up the bits necessary to build gallium dri drivers, with radeonSI as the first example driver. This isn't tested yet. v4: - drop radeonsi generated header from sources. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric at anholt.net>	2017-10-16 16:32:43 -07:00
Dylan Baker	66f97f6640	meson: build radeonsi This builds the radeonsi (and radeon) window system bits and gallium driver bits. Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com> Reviewed-by: Eric Anholt <eric at anholt.net>	2017-10-16 16:32:43 -07:00
Marek Olšák	f536f45250	radeonsi: implement sync_file import/export Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-12 21:07:48 +02:00
Nicolai Hähnle	bc2d874101	radeonsi: add support for PIPE_FORMAT_{X1,A1}R5G5B5_UNORM Fixes dEQP-EGL.functional.image.modify.tex_rgb5_a1_tex_subimage_rgba8 Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-12 08:42:55 +02:00
Dave Airlie	80bbdb1483	radeonsi: lower ffma in nir to mad. This lowers ffma to a * b + c. This seems like it should keep Marek happiest, so we'd never get to the fma instruction emission code. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com>	2017-10-11 07:33:32 +10:00
Eric Anholt	ac0051a507	gallium: Create a new PIPE_CAP_TILE_RASTER_ORDER for vc4. Because vc4 can control the order that tiles are rasterized in, we can use it to implement overlapping blits using normal drawing and GL_ARB_texture_barrier, as long as we can tell the kernel what order to render the tiles in. This commit introduces the core gallium support, vc4 changes will follow. v2: Fix on the simulator. v3: Add the cap (disabled) to other drivers, add rst docs for the cap. v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS v5: Drop vc4 changes from this commit, for clarity. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v3)	2017-10-10 10:45:22 -07:00
Marek Olšák	76997e9133	radeonsi: shrink r600d_common.h and stop using it Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-09 16:27:05 +02:00
Marek Olšák	0ecf9b90ef	radeonsi: import cayman_msaa.c from drivers/radeon Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-09 16:27:04 +02:00

2960 Commits