KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Brian Paul	b29d827f09	st/mesa: move utility functions, macros into new st_util.h file To de-clutter st_context.h. Clean up remaining function prototypes in st_context.h. The st_vp_uses_current_values() helper is only used in st_context.c so move it there. The st_get_active_states() function is only used in st_context.c so remove its prototype in st_context.h Reviewed-by: Neha Bhende <bhenden@vmware.com>	2019-03-11 20:08:16 -06:00
Juan A. Suarez Romero	775aabdd01	anv: destroy descriptor sets when pool gets reset As stated in Vulkan spec: "Resetting a descriptor pool recycles all of the resources from all of the descriptor sets allocated from the descriptor pool back to the descriptor pool, and the descriptor sets are implicitly freed." This fixes dEQP-VK.api.descriptor_pool.* Fixes: `14f6275c92` "anv/descriptor_set: add reference counting for..." Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Tested-by: Clayton Craft <clayton.a.craft@intel.com>	2019-03-11 20:40:31 -05:00
Timothy Arceri	3235a942c1	nir: find induction/limit vars in iand instructions This will be used to help find the trip count of loops that look like the following: while (a < x && i < 8) { ... i++; } Where the NIR will end up looking something like this: vec1 32 ssa_1 = load_const (0x00000004 /* 0.000000 */) loop { ... vec1 1 ssa_12 = ilt ssa_225, ssa_11 vec1 1 ssa_17 = ilt ssa_226, ssa_1 vec1 1 ssa_18 = iand ssa_12, ssa_17 vec1 1 ssa_19 = inot ssa_18 if ssa_19 { ... break } else { ... } } On RADV this unrolls a bunch of loops in F1-2017 shaders. Totals from affected shaders: SGPRS: 4112 -> 4136 (0.58 %) VGPRS: 4132 -> 4052 (-1.94 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 515444 -> 587720 (14.02 %) bytes LDS: 2 -> 2 (0.00 %) blocks Max Waves: 194 -> 196 (1.03 %) Wait states: 0 -> 0 (0.00 %) It also unrolls a couple of loops in shader-db on radeonsi. Totals from affected shaders: SGPRS: 128 -> 128 (0.00 %) VGPRS: 64 -> 64 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 6880 -> 9504 (38.14 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 16 -> 16 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	67c3478482	nir: pass nir_op to calculate_iterations() Rather than getting this from the alu instruction this allows us some flexibility. In the following pass we instead pass the inverse op. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	11e8f8a166	nir: add get_induction_and_limit_vars() helper to loop analysis This helps make find_trip_count() a little easier to follow but will also be used by a following patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	f219f6114d	nir: add helper to return inversion op of a comparison This will be used to help find the trip count of loops that look like the following: while (a < x && i < 8) { ... i++; } Where the NIR will end up looking something like this: vec1 32 ssa_1 = load_const (0x00000004 /* 0.000000 */) loop { ... vec1 1 ssa_12 = ilt ssa_225, ssa_11 vec1 1 ssa_17 = ilt ssa_226, ssa_1 vec1 1 ssa_18 = iand ssa_12, ssa_17 vec1 1 ssa_19 = inot ssa_18 if ssa_19 { ... break } else { ... } } So in order to find the trip count we need to find the inverse of ilt. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	090feaacdc	nir: simplify the loop analysis trip count code a little Here we create a helper is_supported_terminator_condition() and use that rather than embedding all the trip count code inside a switch. The new helper will also be used in a following patch. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	7571de8eaa	nir: unroll some loops with a variable limit Some loops can have a single terminator but the exact trip count is still unknown. For example: for (int i = 0; i < imin(x, 4); i++) ... Shader-db results radeonsi (all affected are from Tropico 5): Totals from affected shaders: SGPRS: 144 -> 152 (5.56 %) VGPRS: 124 -> 108 (-12.90 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 5180 -> 6640 (28.19 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 17 -> 21 (23.53 %) Wait states: 0 -> 0 (0.00 %) Shader-db results i965 (SKL): total loops in shared programs: 3808 -> 3802 (-0.16%) loops in affected programs: 6 -> 0 helped: 6 HURT: 0 vkpipeline-db results RADV (Unrolls some Skyrim VR shaders): Totals from affected shaders: SGPRS: 304 -> 304 (0.00 %) VGPRS: 296 -> 292 (-1.35 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 15756 -> 25884 (64.28 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 29 -> 29 (0.00 %) Wait states: 0 -> 0 (0.00 %) v2: fix bug where last iteration would get optimised away by mistake. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	68ce0ec222	nir: calculate trip count for more loops This adds support to loop analysis for loops where the induction variable is compared to the result of min(variable, constant). For example: for (int i = 0; i < imin(x, 4); i++) ... We add a new bool to the loop terminator struct in order to differentiate terminators with this exit condition. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	e8a8937a04	nir: add partial loop unrolling support This adds partial loop unrolling support and makes use of a guessed trip count based on array access. The code is written so that we could use partial unrolling more generally, but for now it's only used when we have guessed the trip count. We use partial unrolling for this guessed trip count because it's possible any out of bounds array access doesn't otherwise affect the shader e.g. the stores/loads to/from the array are unused. So we insert a copy of the loop in the innermost continue branch of the unrolled loop. Later on it's possible for nir_opt_dead_cf() to then remove the loop in some cases. A Renderdoc capture from the Rise of the Tomb Raider benchmark, reports the following change in an affected compute shader: GPU duration: 350 -> 325 microseconds shader-db results radeonsi VEGA (NIR backend): SGPRS: 1008 -> 816 (-19.05 %) VGPRS: 684 -> 432 (-36.84 %) Spilled SGPRs: 539 -> 0 (-100.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 39708 -> 45812 (15.37 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 105 -> 144 (37.14 %) Wait states: 0 -> 0 (0.00 %) shader-db results i965 SKL: total instructions in shared programs: 13098265 -> 13103359 (0.04%) instructions in affected programs: 5126 -> 10220 (99.38%) helped: 0 HURT: 21 total cycles in shared programs: 332039949 -> 331985622 (-0.02%) cycles in affected programs: 289252 -> 234925 (-18.78%) helped: 12 HURT: 9 vkpipeline-db results VEGA: Totals from affected shaders: SGPRS: 184 -> 184 (0.00 %) VGPRS: 448 -> 448 (0.00 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 26076 -> 24428 (-6.32 %) bytes LDS: 6 -> 6 (0.00 %) blocks Max Waves: 5 -> 5 (0.00 %) Wait states: 0 -> 0 (0.00 %) Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	fba5d275db	nir: add new partially_unrolled bool to nir_loop In order to stop continuously partially unrolling the same loop we add the bool partially_unrolled to nir_loop, we add it here rather than in nir_loop_info because nir_loop_info is only set via loop analysis and is intended to be cleared before each analysis. Also nir_loop_info is never cloned. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Timothy Arceri	03a452b7d0	nir: add guess trip count support to loop analysis This detects an induction variable used as an array index to guess the trip count of the loop. This enables us to do a partial unroll of the loop, which can eventually result in the loop being eliminated. v2: check if the induction var is used to index more than a single array and if so get the size of the smallest array. Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>	2019-03-12 00:52:30 +00:00
Tomeu Vizoso	97f2d04d5e	panfrost: Add support for PAN_MESA_DEBUG Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-12 00:30:27 +00:00
Tomeu Vizoso	f0b1bbebdd	panfrost/midgard: Add support for MIDGARD_MESA_DEBUG Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Reviewed-by: Alyssa Rosenzweig <alyssa@rosenzweig.io>	2019-03-12 00:30:27 +00:00
Xavier Bouchoux	c5236fc6e2	nir/spirv: Fix assert when unsampled OpTypeImage has unknown 'Depth' 'dxc' hlsl-to-spirv compiler appears to emit 2 (Unknown) in the depth field, when the image is not sampled and the value is not needed. Previously, shaders failed with: SPIR-V parsing FAILED: In file ../src/compiler/spirv/spirv_to_nir.c:1412 !is_shadow 632 bytes into the SPIR-V binary Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2019-03-11 23:28:39 +01:00
Kenneth Graunke	d75f84cb65	iris: Fix write enable in pinning of depth/stencil resources We may bind new Z/S buffers (which come via the framebuffer CSO, triggering IRIS_DIRTY_DEPTH_BUFFER), but with writes disabled. The next draw may enable Z or S writes (which come via the ZSA CSO, triggering IRIS_DIRTY_WM_DEPTH_STENCIL), which requires us to update our pin to have the write flag. So, update pinning if either dirty flag changes. To clarify, pass cso_zsa to the pinning function rather than pulling the random values out of ice->state, which unfortunately have to exist for the resolve code since iris_depth_stencil_alpha_state only exists in iris_state.c.	2019-03-11 15:04:08 -07:00
Kenneth Graunke	863e810a19	iris: Refactor depth/stencil buffer pinning into a helper. This avoids the code duplication that caused me to put things in the wrong place in the previous commit. One used to have extra flushes, but we moved those out so now these are identical and can be easily shared.	2019-03-11 15:04:08 -07:00
Kenneth Graunke	9302414f8b	iris: Move depth/stencil flushes so they actually do something Commit `d6dd57d43c` (iris: Add missing depth cache flushes) added the depth/stencil flushes to the wrong place. I meant to add them to the iris_upload_dirty_render_state code that emits the packets, but I accidentally added them to the nearly identical looking code in iris_restore_render_saved_bos. This meant we missed the actual flushing at draw time, but instead did pointless flushing on the first draw in a batch where things are already flushed anyway. This commit moves them to iris_resolve.c, next to the depth prepares, similar to what we do for color buffers. i965 does them elsewhere, but I'm not sure why - this seems like the most consistent place.	2019-03-11 15:04:08 -07:00
Christian Gmeiner	076a7095bb	st/dri: allow direct UYVY import Push this format to the pipe driver unchanged. Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-11 22:19:11 +01:00
Kenneth Graunke	04ff2e3fbb	iris: Fix TES gl_PatchVerticesIn handling. 1. If we switch the TCS for one with a different number of output vertices, then the TES's gl_PatchVerticesIn value will change. We need to re-upload in this case. For now, re-emit constants whenever the TCS/TES are swapped out. 2. If there is no TCS, then we can't grab gl_PatchVerticesIn from the TCS info. Since it's a passthrough, we can just use the primitive's patch count (like the TCS gl_PatchVerticesIn does). Fixes KHR-GL45.tessellation_shader.single.max_patch_vertices and KHR-GL45.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_PatchVerticesIn. Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-11 14:07:16 -07:00
Kenneth Graunke	2f51cb5e67	iris: Rework default tessellation level uploads Now that we've added a system value uploading mechanism, we may as well reuse the same system for default tessellation levels. This simplifies the state upload code a bit. Also fixes: KHR-GL45.tessellation_shader.tessellation_control_to_tessellation_evaluation.gl_tessLevel Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>	2019-03-11 14:07:12 -07:00
Timur Kristóf	fd5075e059	iris: Face should be a system value. This patch adds PIPE_CAP_TGSI_FS_FACE_IS_INTEGER_SYSVAL which despite its name is not a TGSI-specific capability, just lets the state tracker know that it should generate a system value for FACE. This is needed if we want to run tgsi_to_nir on iris. Signed-off-by: Timur Kristóf <timur.kristof@gmail.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2019-03-11 14:02:40 -07:00
Eric Anholt	3a9e2d6085	vc4: Switch the post-RA scheduler over to the DAG datastructure. Just a small code reduction from shared infrastructure.	2019-03-11 13:14:37 -07:00
Eric Anholt	33886474d6	v3d: Use the DAG datastructure for QPU instruction scheduling. Just a small code reduction from shared infrastructure.	2019-03-11 13:14:32 -07:00
Eric Anholt	d6d83b34ee	vc4: Reuse list_for_each_entry_rev().	2019-03-11 13:14:32 -07:00
Eric Anholt	7c01ddbf7f	v3d: Reuse list_for_each_entry_rev().	2019-03-11 13:14:32 -07:00
Eric Anholt	7a727c1a12	vc4: Switch over to using the DAG datastructure for QIR scheduling. Just a small code reduction from shared infrastructure.	2019-03-11 13:14:18 -07:00
Eric Anholt	0533d2d95c	util: Add a DAG datastructure. I keep writing this for various schedulers. Acked-by: Timothy Arceri <tarceri@itsqueeze.com>	2019-03-11 13:13:52 -07:00
Kristian H. Kristensen	5f0a922c27	freedreno/a6xx: Remove extra parens There's a warning about this now. Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-11 11:37:53 -07:00
Kristian H. Kristensen	08c452bef7	freedreno: Use c_vis_args and no_override_init_args Signed-off-by: Kristian H. Kristensen <hoegsberg@chromium.org>	2019-03-11 11:37:53 -07:00
Chia-I Wu	24af64baa5	turnip: preliminary support for Wayland WSI	2019-03-11 10:02:13 -07:00
Chia-I Wu	ae82b5df88	turnip: preliminary support for tu_GetImageSubresourceLayout	2019-03-11 10:02:13 -07:00
Chad Versace	6cb5fd0d71	turnip: Use Vulkan 1.1 names instead of KHR That is, drop KHR from all tokens that were promoted to Vulkan 1.1. The consistency makes ctags more useful (it now jumps directly to the real definitions in vulkan_core.h instead of the typedefs); and it makes the code slightly less verbose.	2019-03-11 10:02:13 -07:00
Chia-I Wu	4f863dc0f7	turnip: guard -Dvulkan-driver=freedreno Require -DI-love-half-baked-turnips=true as well to enable freedreno vulkan driver.	2019-03-11 10:02:13 -07:00
Chia-I Wu	949ce2745d	turnip: preliminary support for tu_CmdDraw	2019-03-11 10:02:13 -07:00
Chia-I Wu	f9b34622cd	turnip: preliminary support for draw state binding This adds support for tu_CmdBindPipeline, tu_CmdBindVertexBuffers, etc.	2019-03-11 10:02:13 -07:00
Chia-I Wu	54b7a57c22	turnip: add draw_cs to tu_cmd_buffer It will hold draw commands.	2019-03-11 10:02:13 -07:00
Chia-I Wu	1cdbab016e	turnip: parse VkPipelineVertexInputStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	d17096b9b1	turnip: parse VkPipelineShaderStageCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	a7d842c97c	turnip: compile VkPipelineShaderStageCreateInfo Compile all shaders and upload the binaries to a BO.	2019-03-11 10:02:13 -07:00
Chia-I Wu	970a8fec96	turnip: preliminary support for shader modules Save SPIR-V in tu_shader_module. Translation to NIR happens in tu_shader_create, and compilation to binary code happens in tu_shader_compile. Both will be called during pipeline creation.	2019-03-11 10:02:13 -07:00
Chia-I Wu	9e0d878787	turnip: parse VkPipeline{Multisample,ColorBlend}StateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	bec0abf294	turnip: parse VkPipelineDepthStencilStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	9496b377ff	turnip: parse VkPipelineRasterizationStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	b4884761e8	turnip: parse VkPipelineViewportStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	1bea6a91cb	turnip: parse VkPipelineInputAssemblyStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	c584c2e86c	turnip: parse VkPipelineDynamicStateCreateInfo	2019-03-11 10:02:13 -07:00
Chia-I Wu	df48cb7b3e	turnip: create a less dummy pipeline Still dummy, but at least it is created from tu_pipeline_builder.	2019-03-11 10:02:13 -07:00
Chia-I Wu	57327626dc	turnip: simplify tu_cs sub-streams usage Let tu_cs_begin_sub_stream imply tu_cs_reserve_space, and tu_cs_end_sub_stream imply tu_cs_sanity_check. Callers are no longer required to call them (but can still do if they choose to).	2019-03-11 10:02:13 -07:00
Chia-I Wu	59419bb691	turnip: fix tu_cs sub-streams Update cs->start in tu_cs_end_sub_stream. Otherwise, the entry would include commands from all prior sub-streams.	2019-03-11 10:02:13 -07:00
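
The nir loop-analysis commits above (f219f6114d, 3235a942c1) rely on inverting a comparison: a loop that breaks on `!(a < x && i < 8)` keeps running only while `i < 8` holds, so the constant limit bounds the trip count. The following is a minimal sketch of that idea in plain C, not Mesa code. The `cmp_op` enum, `invert_cmp()` and `max_trip_count()` are hypothetical names made up for illustration and do not correspond to NIR's actual API.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical comparison ops for illustration; this is not NIR's nir_op. */
enum cmp_op { CMP_LT, CMP_GE, CMP_EQ, CMP_NE };

/* Return the comparison that is true exactly when `op` is false. */
static enum cmp_op invert_cmp(enum cmp_op op)
{
   switch (op) {
   case CMP_LT: return CMP_GE;
   case CMP_GE: return CMP_LT;
   case CMP_EQ: return CMP_NE;
   case CMP_NE: return CMP_EQ;
   }
   return op; /* not reached for valid input */
}

/* Evaluate a comparison on two integers. */
static bool eval_cmp(enum cmp_op op, int a, int b)
{
   switch (op) {
   case CMP_LT: return a < b;
   case CMP_GE: return a >= b;
   case CMP_EQ: return a == b;
   case CMP_NE: return a != b;
   }
   return false; /* not reached for valid input */
}

/* Upper bound on the iterations of
 *    for (i = init; i < limit && <other condition>; i += step)
 * assuming step > 0 and no integer overflow. The other condition can
 * only end the loop earlier, never later. */
static int max_trip_count(int init, int limit, int step)
{
   assert(step > 0);
   if (limit <= init)
      return 0;
   return (limit - init + step - 1) / step; /* ceiling division */
}
```

For the `while (a < x && i < 8)` example with `i` starting at 0 and incremented by 1, `max_trip_count(0, 8, 1)` gives a bound of 8 iterations. The bound is only an upper limit, which is why an unroller working from it still has to keep the exit checks in each unrolled iteration.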
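
The partial unrolling described in e8a8937a04 and 03a452b7d0 can be pictured at the source level. This is a hand-written sketch under stated assumptions, not output from the NIR pass: the guessed trip count of 4 and the function names `sum_rolled()` and `sum_partially_unrolled()` are hypothetical. The loop body is copied once per guessed iteration, each copy keeps its exit check, and a copy of the original loop sits in the innermost branch in case the guess was too low.

```c
/* Original loop: the trip count n is not known at compile time. */
static int sum_rolled(const int *data, int n)
{
   int sum = 0;
   for (int i = 0; i < n; i++)
      sum += data[i];
   return sum;
}

/* Same loop, partially unrolled for a guessed trip count of 4 (assumed
 * here; the commits derive the guess from the size of an indexed array). */
static int sum_partially_unrolled(const int *data, int n)
{
   int sum = 0;
   int i = 0;

   if (i < n) {
      sum += data[i]; i++;
      if (i < n) {
         sum += data[i]; i++;
         if (i < n) {
            sum += data[i]; i++;
            if (i < n) {
               sum += data[i]; i++;
               /* Fallback copy of the original loop. It only runs when
                * the guess was too low; if the compiler can prove it
                * never runs, dead-code elimination can drop it. */
               for (; i < n; i++)
                  sum += data[i];
            }
         }
      }
   }
   return sum;
}
```

Both functions return the same result for any `n` within the bounds of `data`, including values above the guess, because the fallback loop handles the remaining iterations.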

109130 Commits

All Branches