KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Marek Olšák	5f99c49008	radeonsi: precompute IA_MULTI_VGT_PARAM values into a table The perf difference is very small: 0.99% -> 0.40% for the time spent in si_get_ia_multi_vgt_param when si_draw_vbo is 20%. Pretty much nothing. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	c78177fc64	radeonsi: move VGT_VERTEX_REUSE_BLOCK_CNTL into shader states for Polaris Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	ccecf79c2b	radeonsi: state atom IDs don't have to be off by one Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	ac059f1c23	radeonsi: use a bitmask for looping over dirty PM4 states also move it to draw_vbo, because it should be 0 in most cases Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	802fcdc0d2	radeonsi: atomize L2 prefetches to move the big conditional statement out of draw_vbo Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	879c73fac8	radeonsi: update dirty_level_mask only after the first draw after FB change Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-30 13:27:14 +01:00
Marek Olšák	cf248929bf	radeonsi: use a global dirty mask for shader pointers Only vertex buffers use a separate bool flag. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	861d7af1cb	radeonsi: use a bitmask-based loop in si_decompress_textures Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	d93b0eacb7	radeonsi: si_cp_dma_prepare is a no-op for L2 prefetches Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Marek Olšák	395c49849d	radeonsi: add SI_CPDMA_SKIP_BO_LIST_UPDATE the next commit will use it in a clever way, because the CP DMA prefetch doesn't need this. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-18 19:51:31 +01:00
Bas Nieuwenhuizen	0ef1b4d5b1	ac/debug: Move IB decode to common code. Signed-off-by: Bas Nieuwenhuizen <basni@google.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Dave Airlie <airlied@redhat.com>	2017-01-09 21:43:59 +01:00
Marek Olšák	ece6e1f658	radeonsi: add TC L2 prefetch for shaders and VBO descriptors Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Marek Olšák	a131dacb14	radeonsi: add CP DMA flags for greater control over synchronization for L2 prefetch Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-01-06 21:05:48 +01:00
Nicolai Hähnle	2f2e941e2d	radeonsi: use a single descriptor for the GSVS ring We can hardcode all of the fields for swizzling in the geometry shader. The advantage is that we use fewer descriptor slots and we no longer have to update any of the (ring) descriptors when the geometry shader changes. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-12-12 09:05:05 +01:00
Marek Olšák	6caa558ca6	radeonsi: check for sampler state CSO corruption It really happens. v2: declare "magic" in debug builds only Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2016-12-07 19:40:03 +01:00
Marek Olšák	bf75ef3f92	radeonsi: remove all varyings for depth-only rendering or rasterization off Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	6d5c2a8b5c	radeonsi: split the shader key into 3 logical parts key->part.: prolog and epilog flags only key->as_{ls,es}: special flags key->mono.: flags for monolithic compilation only Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Marek Olšák	fa476e0566	radeonsi: fast exit si_emit_derived_tess_state early Tested-by: Edmondo Tommasina <edmondo.tommasina@gmail.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-11-21 21:44:35 +01:00
Nicolai Hähnle	908f92ad1f	radeonsi: generate GS prolog to (partially) fix triangle strip adjacency rotation Fixes GL45-CTS.geometry_shader.adjacency.adjacency_indiced_triangle_strip and others. This leaves the case of triangle strips with adjacency and primitive restarts open. It seems that the only thing that cares about that is a piglit test. Fixing this efficiently would be really involved, and I don't want to use the hammer of degrading to software handling of indices because there may well be software that uses this draw mode (without caring about the precise rotation of triangles). v2: - skip the GS prolog entirely if workaround is not needed - only check for TES (TES is always non-null when tessellation is used) Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:11:24 +01:00
Nicolai Hähnle	3b2516721b	radeonsi: make the GS copy shader owned by the GS selector The copy shader only depends on the selector. This change avoids creating separate code paths for monolithic vs. non-monolithic geometry shaders. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-11-03 10:07:50 +01:00
Marek Olšák	ad45dce4a2	radeonsi: remove si_resource_create_custom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	29144d0f34	gallium/radeon: stop using PIPE_BIND_CUSTOM it has no effect whatsoever Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-26 13:02:58 +02:00
Marek Olšák	a2ea653a49	radeonsi: remove cb0_is_integer handling st/mesa does this for us. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-19 19:26:30 +02:00
Marek Olšák	8cdce30cc2	radeonsi: implement TC L2 write-back (flush) without cache invalidation Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-10-12 18:29:40 +02:00
Marek Olšák	71a5cf6f3b	radeonsi: don't declare LDS in PS when ds_bpermute is used I guess this is not needed because dead code elimination removes the declaration. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:16 +02:00
Marek Olšák	3388f27d84	radeonsi: clean up lucky #include dependencies Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:06 +02:00
Marek Olšák	7ce19d9014	radeonsi: don't set sampler buffer offsets in create_sampler_view do it at bind time, so that pipe_sampler_view is immutable with regard to buffer reallocations and we don't have to remember all existing buffer views. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:12:01 +02:00
Marek Olšák	3ee9be42ac	radeonsi: move VGT_LS_HS_CONFIG to derived tess_state Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net>	2016-10-04 16:11:53 +02:00
Marek Olšák	a67d81580b	radeonsi: remove the cache_flush atom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-09 22:45:06 +02:00
Marek Olšák	fe40a65fb6	radeonsi: skip redundant INDEX_TYPE writes Ported from Vulkan. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-07 11:13:13 +02:00
Marek Olšák	911202817d	radeonsi: don't emit CS_PARTIAL_FLUSH if compute is not used for less noise in the HUD Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-09-05 18:01:15 +02:00
Nicolai Hähnle	b6c71d37c7	radeonsi: program the DRAWID SGPR Note that for indirect draws, the new MULTI firmware packets are required. There's also no need to reset last_{start_instance,sh_base_reg}, since resetting last_base_vertex is sufficient. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-09 15:56:04 +02:00
Nicolai Hähnle	96bbb620a5	radeonsi: add has_draw_indirect_multi flag Prefer to use DRAW_(INDEX)_INDIRECT_MULTI when available in the firmware. Versions for SI and CI already added as provided by the firmware team, but keep in mind that they won't currently be used since the radeon kernel module has no interface to query the firmware version. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-08-08 12:53:06 +02:00
Marek Olšák	a6bfafa083	gallium/radeon: move last_gfx_fence from radeonsi to common code Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Marek Olšák	c15a9dec29	radeonsi: skip unnecessary si_update_shaders calls Small decrease in draw call overhead. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-08-03 17:46:46 +02:00
Nicolai Hähnle	d938b8c0bf	radeonsi: explicitly choose center locations for 1xAA on Polaris Unlike SC, the small primitive filter does not automatically use center locations in 1xAA mode, so this is needed to avoid artifacts caused by the small primitive filter discarding triangles that it shouldn't. As a side effect of how the effective number of samples is now calculated, this patch also avoids submitting the sample locations for line/poly smoothing when they're not really needed. Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-07-08 10:52:50 +02:00
Marek Olšák	5c92c21369	radeonsi: do compilation from si_create_shader_selector asynchronously Main shader parts and geometry shaders are compiled asynchronously by util_queue. si_create_shader_selector doesn't wait and returns. si_draw_vbo(si_shader_select) waits for completion. This has the best effect when shaders are compiled at app-loading time. It doesn't help much for shaders compiled on demand, even though VS+PS compilation should take as much time as the bigger one of the two. If an app creates more shaders, at most 4 threads will be used to compile them. Debug output disables this for shader stats to be printed in the correct order. (We could go even further and build variants asynchronously too, then emit draw calls without waiting and emit incomplete shader states, then force IB chaining to give the compiler more time, then sync the compilation at the IB flush and patch the IB with correct shader states. This is great for compilation before draw calls, but there are some difficulties such as scratch and tess states requiring the compiler output, and an on-disk shader cache will likely be a much better and simpler solution.) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	027ad71b57	radeonsi: print LLVM IRs to ddebug logs Getting LLVM IRs of hanging shaders has never been easier. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	28a03be06b	radeonsi: enable string markers and record apitrace call numbers Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-07-05 00:47:13 +02:00
Marek Olšák	eff81cbc81	radeonsi: enable distributed tess on multi-SE parts only ported from Vulkan Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Marek Olšák	dd56d04568	radeonsi: set optimal VGT_HS_OFFCHIP_PARAM ported from Vulkan Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-29 16:34:22 +02:00
Marek Olšák	3eacbc52d5	radeonsi: boolean -> bool, TRUE -> true, FALSE -> false Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Vedran Miletić <vedran@miletic.net> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-25 23:13:42 +02:00
Marek Olšák	28d0d0c5b4	radeonsi: fix fractional odd tessellation spacing for Polaris ported from Vulkan (and no source explains why this is needed) Cc: 12.0 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 17:36:43 +02:00
Marek Olšák	8f3ef4e8b8	radeonsi: optimize rendering to linear color buffers loosely ported from Vulkan Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-24 16:24:53 +02:00
Nicolai Hähnle	d46a9db840	radeon: check VM faults from DMA flush Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:03 +02:00
Nicolai Hähnle	ad8438403b	radeonsi: extract IB and bo list saving into separate functions Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-24 12:36:02 +02:00
Marek Olšák	4140afd04b	gallium/radeon: add driver queries for compute/dma call stats and spills also print the average count per frame Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2016-06-14 20:22:16 +02:00
Nicolai Hähnle	8239da28e8	radeonsi: keep track of dirty descriptor sets Reduces CPU load for draw calls that change none or few of the descriptors. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:18:10 +02:00
Nicolai Hähnle	d152c73712	radeonsi: move si_descriptors into a per-context array Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:18:07 +02:00
Nicolai Hähnle	031b57bc2f	radeonsi: move enabled_mask out of si_descriptors This mask is irrelevant for the generic descriptor set handling, and having it outside simplifies subsequent changes slightly. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2016-06-07 15:17:23 +02:00
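
Several entries above (ac059f1c23, 861d7af1cb) replace a loop over every state slot with a loop driven by a dirty bitmask, so that the common case of "nothing dirty" costs a single comparison. The following is a minimal sketch of that general technique, not the driver's actual code: the state enum, `pop_lowest_bit`, and `emit_dirty_states` are hypothetical names chosen for illustration, and "emitting" a state is modeled as recording its ID.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical state IDs for illustration; not the driver's real enum. */
enum { STATE_BLEND, STATE_RASTERIZER, STATE_DSA, STATE_COUNT };

/* Return the index of the lowest set bit in *mask and clear that bit.
 * The caller must guarantee that *mask is non-zero. */
static int pop_lowest_bit(uint32_t *mask)
{
    assert(*mask != 0);
    int i = 0;
    while (!((*mask >> i) & 1u))
        i++;
    *mask &= *mask - 1; /* clears the lowest set bit */
    return i;
}

/* Visit only the states whose bit is set in *dirty_mask, in ascending
 * bit order. Each visited ID is recorded in emitted[] (room for 32
 * entries, one per possible bit). Returns the number of states visited
 * and leaves *dirty_mask at 0. */
static int emit_dirty_states(uint32_t *dirty_mask, int emitted[32])
{
    int count = 0;
    while (*dirty_mask) {
        int id = pop_lowest_bit(dirty_mask);
        emitted[count++] = id; /* stand-in for emitting the state */
    }
    return count;
}
```

When the mask is zero the loop body never runs, which is the case the ac059f1c23 message expects to be the most frequent one in draw_vbo.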


215 Commits