KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Sonny Jiang	ce1d72609d	radeonsi:optimizing SET_CONTEXT_REG for shaders Tessellation Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Sonny Jiang	4de328da07	radeonsi:optimizing SET_CONTEXT_REG for shaders PS Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Sonny Jiang	f243980f2c	radeonsi:optimizing SET_CONTEXT_REG for shaders VS Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Sonny Jiang	4052624398	radeonsi:optimizing SET_CONTEXT_REG for shaders GS Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 19:04:13 -04:00
Sonny Jiang	eeb9170599	radeonsi: optimizing SET_CONTEXT_REG for shaders ES Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-10-05 17:53:52 -04:00
Marek Olšák	0b062f0419	radeonsi: don't set the VS prolog key for the blit VS	2018-10-02 12:21:49 -04:00
Marek Olšák	5693ca865d	radeonsi: bump MAX_GS_INVOCATIONS same as the closed driver Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-08-23 16:56:17 -04:00
Marek Olšák	ac72a6bd0b	radeonsi: move internal TGSI shaders into si_shaderlib_tgsi.c Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>	2018-08-14 21:20:31 -04:00
Marek Olšák	86b52d4236	radeonsi: reduce LDS stalls by 40% for tessellation 40% is the decrease in the LGKM counter (which includes SMEM too) for the GFX9 LSHS stage. This will make the LDS size slightly larger, but I wasn't able to increase the patch stride without corruption, so I'm increasing the vertex stride.	2018-07-23 20:23:52 -04:00
Sonny Jiang	c6737756ad	radeonsi: emit_spi_map packets optimization v2: marek: remove an empty line before break; rename reg_val_seq -> spi_ps_input_cntl "type * x" -> "type *x" Signed-off-by: Sonny Jiang <sonny.jiang@amd.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-07-20 13:50:26 -04:00
Dave Airlie	0eb65b4944	radeonsi: rename si_compiler -> ac_llvm_compiler As precursor to moving init to common code, just rename the struct and move it. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-07-04 05:31:32 +10:00
Marek Olšák	1542169a4a	radeonsi: enable shader caching for compute shaders Compute shaders were not using the shader cache.	2018-06-28 22:27:25 -04:00
Marek Olšák	d13f240269	radeonsi: unify duplicated code for initial shader compilation	2018-06-28 22:27:25 -04:00
Marek Olšák	f154555733	radeonsi: clean up passing the is_monolithic flag for compilation Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-06-25 18:33:58 -04:00
Marek Olšák	6703fec58c	amd,radeonsi: rename radeon_winsys_cs -> radeon_cmdbuf Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-06-19 13:08:50 -04:00
Marek Olšák	22e994bb75	radeonsi: assume that rasterizer state is non-NULL in draw_vbo Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:36 -04:00
Marek Olšák	f3b3ee6974	radeonsi: micro-optimize prim checking and fix guardband with lines+adjacency Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:34 -04:00
Marek Olšák	28ee825e19	radeonsi: move VGT_GS_OUT_PRIM_TYPE into si_shader_gs same as amdvlk. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:23 -04:00
Marek Olšák	99e0ba6868	radeonsi: record CLIPVERTEX output usage properly for compatibility profiles This was missed when adding CLIPVERTEX support into GS & tess. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-06-13 22:00:20 -04:00
Marek Olšák	2f65c67043	radeonsi: fix passing gl_ClipVertex for GS and tess Also add the fprintf call. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	a7d61c0753	radeonsi: fix color inputs/outputs for GS and tess GS is tested, tessellation is untested. Have outputs_written_before_ps for HW VS and outputs_written for other stages. The reason is that COLOR and BCOLOR alias for HW VS, which drives elimination of VS outputs based on PS inputs. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	92ea9329e5	radeonsi: fix incorrect parentheses around VS-PS varying elimination I don't know if it caused issues. Cc: 18.0 18.1 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-25 16:46:00 -04:00
Marek Olšák	07e02c8617	radeonsi: round ps_iter_samples in set_min_samples Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-05-24 13:41:57 -04:00
Marek Olšák	87eb597758	radeonsi: add struct si_compiler containing LLVMTargetMachineRef It will contain more variables. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Tested-by: Benedikt Schemmer <ben at besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	6fadfc01c6	radeonsi: use r600_resource() typecast helper Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	3160ee876a	radeonsi: remove unused atom parameter from si_atom::emit Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	e395475096	radeonsi: remove function si_init_atom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	639b673fc3	radeonsi: don't use an indirect table for state atoms Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	9054799b39	radeonsi: rename r600_atom -> si_atom Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-27 17:56:04 -04:00
Marek Olšák	60299e9abe	radeonsi: don't emit partial flushes for internal CS flushes only Tested-by: Benedikt Schemmer <ben@besd.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-04-16 16:58:10 -04:00
Marek Olšák	6a93441295	radeonsi: remove r600_common_context Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5777488406	radeonsi: move r600_cs.h contents into si_pipe.h, si_build_pm4.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	72e9e98076	radeonsi: move and rename R600_ERR out of r600_pipe_common.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	5f1cddde78	radeonsi: move definitions out of r600_pipe_common.h Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	c424f86180	radeonsi: use si_context instead of pipe_context in parameters pt1 Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	4c5efc40f4	radeonsi: update copyrights Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	95bc30275b	radeonsi: switch radeon_add_to_buffer_list parameter to si_context Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	2b70dd8c8a	radeonsi: flatten / remove struct r600_ring Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	17e8f1608e	radeonsi: call CS flush functions directly whenever possible Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2018-04-05 15:34:58 -04:00
Marek Olšák	0669dca9c0	radeonsi: skip DCC render feedback checking if color writes are disabled	2018-04-05 15:34:58 -04:00
Marek Olšák	2be6143032	radeonsi: implement GL_KHR_blend_equation_advanced MSAA is supported using sample shading. Layered rendering and all texture targets are also supported. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2018-04-02 13:55:25 -04:00
Marek Olšák	9b7db12815	radeonsi: remove chip_class parameter from si_lower_nir We can get it from si_screen. Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com> Acked-by: Alex Deucher <alexander.deucher@amd.com>	2018-03-08 14:58:16 -05:00
Marek Olšák	2e30268877	radeonsi: mask out high VM address bits in registers where needed	2018-03-07 13:55:35 -05:00
Timothy Arceri	70190a6567	radeonsi/nir: call ac_lower_indirect_derefs() Fixes piglit tests: tests/spec/glsl-1.50/execution/variable-indexing/gs-input-array-vec3-index-rd.shader_test tests/spec/glsl-1.50/execution/geometry/max-input-components.shader_test Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-05 14:09:23 +11:00
Timothy Arceri	561503e3bd	radeonsi: add chip class to compiler_ctx_state This will be used in the following patch. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2018-03-05 14:09:23 +11:00
Marek Olšák	8799eaed99	radeonsi: remove 2 unused user SGPRs from merged TES-GS with 32-bit pointers The effect of the last 13 commits on user SGPR counts: Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:19 +01:00
Marek Olšák	3fa7a59d69	radeonsi: make SI_SGPR_VERTEX_BUFFERS the last user SGPR input so that it can be removed and replaced with inline VBO descriptors, and the pointer can be packed in unused bits of VBO descriptors. This also removes the pointer from merged TES-GS where it's useless. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-26 12:01:08 +01:00
Marek Olšák	2d03c4cac8	radeonsi: move tess ring address into TCS_OUT_LAYOUT, removes 2 TCS user SGPRs TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address aligned to 512KB. Hey, it's a 13-bit pointer! Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:29 +01:00
Marek Olšák	fca7dee9c6	radeonsi: put both tessellation rings into 1 buffer Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Marek Olšák	d2963d8b5f	radeonsi: move tessellation ring info into si_screen Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-02-24 23:08:28 +01:00
Timothy Arceri	691c320de0	radeonsi: add nir shader cache support In future we might want to try avoid calling nir_serialize() but this works for now. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Timothy Arceri	2b431808ab	radeonsi: rename variables tgsi_binary -> ir_binary This better represents that the ir could be either tgsi or nir. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-02-20 13:15:02 +11:00
Marek Olšák	fdf01d0244	radeonsi: remove DBG_PRECOMPILE it's useless and shader-db stats only report the main shader part. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-31 03:21:20 +01:00
Marek Olšák	148b48646b	radeonsi: print shader-db stats for main parts, not final binaries This is needed to get shader-db stats for LS,HS,ES,GS stages on gfx9. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2018-01-31 03:21:20 +01:00
Timothy Arceri	452586b56a	radeonsi: add dummy implementation of si_nir_scan_tess_ctrl() Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2018-01-05 11:58:55 +11:00
Józef Kucia	f222cf3c6d	radeonsi: fix alpha-to-coverage if color writes are disabled If alpha-to-coverage is enabled, we have to compute alpha even if color writes are disabled. Signed-off-by: Józef Kucia <joseph.kucia@gmail.com> Signed-off-by: Marek Olšák <marek.olsak@amd.com>	2018-01-04 01:58:33 +01:00
Samuel Pitoiset	79b34d0832	amd/common: add ac_vgt_gs_mode() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-18 11:50:50 +01:00
Samuel Pitoiset	55f8431c76	amd/common: add ac_get_cb_shader_mask() helper Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-18 11:50:48 +01:00
Samuel Pitoiset	45872a0a6d	radeonsi: make use of ac_get_spi_shader_z_format() Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>	2017-12-14 22:23:25 +01:00
Marek Olšák	2c5f2936af	r300,r600,radeonsi: replace RADEON_FLUSH_* with PIPE_FLUSH_* and handle PIPE_FLUSH_HINT_FINISH in r300. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	950221f923	radeonsi: remove r600_common_screen Most files in gallium/radeon now include si_pipe.h. chip_class and family are now here: sscreen->info.family sscreen->info.chip_class Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	2208b760f3	radeonsi: move shader debug helpers out of r600_pipe_common.c Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Marek Olšák	c63e225bff	radeonsi: remove some definitions and helpers from r600_pipe_common.h Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-29 18:21:30 +01:00
Nicolai Hähnle	f76a6cb337	radeonsi: always use async compiles when creating shader/compute states With Gallium threaded contexts, creating shader/compute states is effectively a screen operation, so we should not use context state. In particular, this allows us to avoid using the context's LLVM TargetMachine. This isn't an issue yet because u_threaded_context filters out non-async debug callbacks, and we disable threaded contexts for debug contexts. However, we may want to change that in the future. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:53:20 +01:00
Nicolai Hähnle	dd7c273e87	radeonsi: move pipe debug callback to si_context Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:53:19 +01:00
Nicolai Hähnle	0f54ee6072	radeonsi: reduce the scope of sel->mutex in si_shader_select_with_key We only need the lock to guard changes in the variant linked list. The actual compilation can happen outside the lock, since we use the ready fence as a guard. v2: fix double-unlock Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:51 +01:00
Nicolai Hähnle	4f493c79ee	radeonsi: use ready fences on all shaders, not just optimized ones There's a race condition between si_shader_select_with_key and si_bind_XX_shader: Thread 1 Thread 2 -------- -------- si_shader_select_with_key begin compiling the first variant (guarded by sel->mutex) si_bind_XX_shader select first_variant by default as state->current si_shader_select_with_key match state->current and early-out Since thread 2 never takes sel->mutex, it may go on rendering without a PM4 for that shader, for example. The solution taken by this patch is to broaden the scope of shader->optimized_ready to a fence shader->ready that applies to all shaders. This does not hurt the fast path (if anything it makes it faster, because we don't explicitly check is_optimized). It will also allow reducing the scope of sel->mutex locks, but this is deferred to a later commit for better bisectability. Fixes dEQP-EGL.functional.sharing.gles2.multithread.simple.buffers.bufferdata_render Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-11-09 11:37:51 +01:00
Marek Olšák	529cdce799	radeonsi: remove 'Authors:' comments It's inaccurate. Instead, see the copyright and use "git log" and "git blame" to know the authorship. Acked-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-11-02 18:19:03 +01:00
Marek Olšák	da0083f123	radeonsi: use postponed KILL only when derivatives are used Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-24 14:56:34 +02:00
Marek Olšák	65f2e33500	radeonsi: import r600_streamout from drivers/radeon Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-09 16:26:55 +02:00
Marek Olšák	3784ce9782	radeonsi: enumerize DBG flags Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-09 16:20:16 +02:00
Marek Olšák	5a47abb63e	radeonsi: don't change viewport for blits, use window-space positions The viewport state was an identity anyway. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-07 18:26:35 +02:00
Marek Olšák	13b6c1c031	radeonsi: minor cleanup of si_update_vs_writes_viewport_index Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-07 18:26:35 +02:00
Marek Olšák	69ccb9dae7	radeonsi: use new VS blit shaders (VS inputs in SGPRs) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-07 18:26:35 +02:00
Marek Olšák	6a8401a94e	radeonsi: add VS blit shader creation no users yet Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-10-07 18:26:35 +02:00
Nicolai Hähnle	12f3155e28	radeonsi: simplify the signature of si_update_vs_writes_viewport_index Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-02 15:07:45 +02:00
Nicolai Hähnle	7bbcb6ac6c	radeonsi: move current_rast_prim into si_context v2: rebase fixes Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-02 15:07:45 +02:00
Nicolai Hähnle	6b416ec3d6	radeonsi: move and rename scissor and viewport state and functions v2: change GET_MAX_SCISSOR to SI_MAX_SCISSOR Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-02 15:07:45 +02:00
Nicolai Hähnle	f86a112b07	radeonsi: move current_rast_prim to r600_common_context We'll use it in the scissors / clip / guardband state. v2: avoid a performance regression on r600 when applied to (pre-fork) stable branches Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-10-02 15:07:43 +02:00
Nicolai Hähnle	7dfa891f32	radeonsi/gfx9: fix geometry shaders without output vertices Not that those are super common or useful, but hey! Fun corner cases of the API... Fixes dEQP-GLES31.functional.geometry_shading.emit.* Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-29 11:43:09 +02:00
Marek Olšák	06bfb2d28f	r600: fork and import gallium/radeon This marks the end of code sharing between r600 and radeonsi. It's getting difficult to work on radeonsi without breaking r600. A lot of functions had to be renamed to prevent linker conflicts. There are also minor cleanups. Acked-by: Dave Airlie <airlied@redhat.com> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-26 04:21:14 +02:00
Nicolai Hähnle	aab134cfa5	radeonsi: enable out-of-order rasterization when possible on VI and GFX9 dGPUs This does not take commutative blending into account yet. R600_DEBUG=nooutoforder disables it. Reviewed-by: Marek Olšák <marek.olsak@amd.com> Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de>	2017-09-18 11:25:19 +02:00
Nicolai Hähnle	e4af4433fc	radeonsi: hard-code pixel center for interpolateAtSample without multisample buffers The GLSL rules for interpolateAtSample are unfortunate: "Returns the value of the input interpolant variable at the location of sample number sample. If multisample buffers are not available, the input variable will be evaluated at the center of the pixel. If sample sample does not exist, the position used to interpolate the input variable is undefined." This fix will fallback to monolithic shader compilation when interpolateAtSample is used without multisampling. One alternative would be to always upload 16 sample positions, filling the buffer up with repetition when the actual number of samples is less, and then ANDing the sample ID with 0xf. However, that punishes all well-behaving users of interpolateAtSample, when in reality, only conformance tests should be affected by the issue. Fixes dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at_sample.non_multisample_buffer.* Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-13 18:25:45 +02:00
Nicolai Hähnle	92c4277990	radeonsi: apply a mask to gl_SampleMaskIn in the PS prolog gl_SampleMaskIn is supposed to contain set bits only for the samples that are covered by the current fragment shader invocation, but the VGPR initialization hardware loads the set of all bits that are covered at the current pixel. Fixes various tests in dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.* Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-13 18:25:41 +02:00
Nicolai Hähnle	48b3364b5b	radeonsi: make si_init_shader_selector_async static Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-13 18:24:18 +02:00
Marek Olšák	6eade342eb	radeonsi: optimize TCS epilog when invocation 0 writes tess factors This removes the barrier and LDS stores and loads for tess factors when it's possible. The removal of the barrier seems more important to me though. In one shader, it removes 17 * 4 bytes from the shader binary. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-11 19:02:02 +02:00
Marek Olšák	89bf8668c2	radeonsi/gfx9: don't read LS out vertex stride from an SGPR in monolithic HS -44 bytes in a monolithic LS-HS binary. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-07 13:00:07 +02:00
Nicolai Hähnle	45c5c44451	radeonsi/gfx9: proper workaround for LS/HS VGPR initialization bug When the HS wave is empty, the hardware writes the LS VGPRs starting at v0 instead of v2. Workaround by shifting them back into place when necessary. For simplicity, this is always done in the LS prolog. According to the hardware team, this will be fixed in future chips, so take that into account already. Note that this is not a bug fix, as the bug was already worked around by commit `166823bfd2` ("radeonsi/gfx9: add a temporary workaround for a tessellation driver bug"). This change merely replaces the workaround by one that should be better. v2: add workaround code to shader only when necessary v3: clarify the prefer_mono comment Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-09-06 10:02:49 +02:00
Marek Olšák	c3ebac6890	radeonsi/gfx9: implement primitive binning This increases performance, but it was tuned for Raven, not Vega. We don't know yet how Vega will perform, hopefully not worse. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-05 12:09:02 +02:00
Marek Olšák	fb7ba68f6c	radeonsi: eliminate PS color outputs when colormask kills them Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-09-04 15:10:39 +02:00
Timothy Arceri	0168d1f449	radeonsi: stop leaking nir Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-29 09:46:29 +10:00
Timothy Arceri	ea2515d780	glsl: pass shader source keys to the disk cache We don't actually write them to disk here. That will happen in the following commit. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-25 13:20:29 +10:00
Marek Olšák	8dadb07790	radeonsi: emit VGT_REUSE_OFF in the right place clip_regs aren't marked dirty when writes_viewport_index is changed. Cc: 17.2 <mesa-stable@lists.freedesktop.org> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	54c2c771bd	radeonsi/gfx9: don't use GS scenario A for VS writing ViewportIndex Vulkan doesn't do it anymore. Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:29:47 +02:00
Marek Olšák	a65afda768	radeonsi/gfx9: prevent shader-db crashes - don't precompile LS and ES (they don't exist on GFX9), compile as VS instead - don't precompile HS and GS (we don't have LS and ES parts) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-22 13:29:47 +02:00
Nicolai Hähnle	40697e8678	radeonsi: make si_shader_selector_reference globally visible Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-08-22 09:50:55 +02:00
Marek Olšák	e887c68bd2	radeonsi: add a separate dirty mask for prefetches so that we don't rely on si_pm4_state_enabled_and_changed, allowing us to move prefetches after draw calls. v2: clear the dirty mask after unbinding shaders Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> (v1) Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com> (v1)	2017-08-07 21:12:24 +02:00
Marek Olšák	a7b0014d1a	radeonsi: add and use si_pm4_state_enabled_and_changed Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-07 21:12:24 +02:00
Marek Olšák	58d062b87d	radeonsi: de-atomize L2 prefetch I'd like to be able to move the prefetch call site around. Tested-by: Dieter Nützel <Dieter@nuetzel-hh.de> Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>	2017-08-07 21:12:24 +02:00
Nicolai Hähnle	25ff22e390	radeonsi: tweak next-shader assumptions when streamout is used VS with streamout is always a HW VS. Reviewed-by: Marek Olšák <marek.olsak@amd.com>	2017-07-31 14:55:43 +02:00

469 Commits