KonstantinSeurer/mesa

Commit Graph

Author	SHA1	Message	Date
Jason Ekstrand	82695c32b6	anv: Add helpers for converting access flags to pipe bits Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-04-07 19:24:14 -07:00
Jason Ekstrand	f3673db3d6	anv/cmd_buffer: Refactor flush_pipeline_select_* While having the _3d and _gpgpu versions is nice, there's no reason why we need to have duplicated logic for tracking the current pipeline. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>	2017-03-28 14:57:09 -07:00
Jason Ekstrand	6baae9625d	anv: Flush caches prior to PIPELINE_SELECT on all gens The programming note that says we need to do this still exists in the SkyLake PRM and, from looking at the bspec, seems like it may apply to all hardware generations SNB+. Unfortunately, this isn't particularly clear cut since there is also language in the bspec that says you can skip the flushing and stall to get better throughput. Experimentation with the "Car Chase" benchmark in GL seems to indicate that some form of flushing is still needed. This commit makes us do the full set of flushes regardless of hardware generation. We can always reduce the flushing later. Reported-by: Topi Pohjolainen <topi.pohjolainen@intel.com> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-28 14:57:08 -07:00
Jason Ekstrand	0fe3dcce4c	anv/cmd_buffer: Fix bad indentation A bunch of code was indented in such a way that it looked like it went with the if statement above but it definitely didn't. Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Cc: "17.0 13.0" <mesa-stable@lists.freedesktop.org>	2017-03-28 14:57:06 -07:00
Jason Ekstrand	01a65dc43b	anv/cmd_buffer: Apply flush operations prior to executing secondaries This fixes rendering issues in the Vulkan port of skia on some hardware. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-03-28 14:56:55 -07:00
Samuel Iglesias Gonsálvez	c4c02471f4	anv: enable sampling from fast-cleared images on SKL A resolve is not needed on Skylake in this case. We were forcing a resolve because we set the input_aux_usage to ISL_AUX_USAGE_NONE. Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-03-27 06:32:24 +02:00
Iago Toral Quiroga	1d7468311d	anv: handle errors in emit_binding_table() and emit_samplers() These can fail to allocate device memory, however, the driver can recover from this error by allocating a new binding table block and trying again. v2: - Instead of tracking the errors in these functions and making callers reset the batch's status before attempting to allocate a new block for the binding table, simply make callers responsible for setting the error status if they fail to allocate memory during the second attempt (Jason). Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	dd8348c8be	anv: handle errors while allocating new binding table blocks Also, we had a couple of instances in flush_descriptor_sets() were we were returning a VkResult directly upon error, but the return value of this function is not a VkResult but a uint32_t dirty mask, so simply return 0 in these cases which reduces the amount of work the driver will do after the error has been raised. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	a752c4ecda	anv/cmd_buffer: skip vkCmdExecuteCommands() on broken command buffers v2: Assert on secondary commands, applications should've called vkEndCommandBuffer() and received an error for them before (Jason) Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	801493051e	anv/cmd_buffer: skip vkCmdDispatch() on broken command buffers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	18ec3fa2a9	anv/cmd_buffer: skip vkCmdDraw*() on broken command buffers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	94a4f0c255	anv/cmd_buffer: handle allocation errors during vkCmdBeginRenderPass() Fixes: dEQP-VK.api.out_of_host_memory.cmd_begin_render_pass Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	d823f381a5	anv/cmd_buffer: skip vkCmdEndRenderPass() for broken command buffers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	6743456699	anv/cmd_buffer: skip vkCmdNextSubpass() for broken command buffers Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	8174e63869	anv/cmd_buffer: report tracked errors in vkEndCommandBuffer() Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	68d88f0237	anv: handle failures when growing reloc lists Growing the reloc list happens through calling anv_reloc_list_add() or anv_reloc_list_append(). Make sure that we call these through helpers that check the result and set the batch error status if needed. v2: - Handling the crashes is not good enough, we need to keep track of the error, for that, keep track of the errors in the batch instead (Jason). - Make reloc list growth go through helpers so we can have a central place where we can do error tracking (Jason). v3: - Callers that need the offset returned by anv_reloc_list_add() can compute it themselves since it is extracted from the inputs to the function, so change the function to return a VkResult, make anv_batch_emit_reloc() also return a VkResult and let their callers do the error management (Topi) v4: - Let anv_batch_emit_reloc() return an uint64_t as it originally did, there is no real benefit in having it return a VkResult. - Do not add an is_aux parameter to add_surface_state_reloc(), instead do error checking for aux in add_image_view_relocs() separately. Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Iago Toral Quiroga	6dd06f54eb	anv/cmd_buffer: report errors in vkBeginCommandBuffer() Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2017-03-16 11:40:05 +01:00
Jason Ekstrand	dd4db84640	anv: Use on-the-fly surface states for dynamic buffer descriptors We have a performance problem with dynamic buffer descriptors. Because we are currently implementing them by pushing an offset into the shader and adding that offset onto the already existing offset for the UBO/SSBO operation, all UBO/SSBO operations on dynamic descriptors are indirect. The back-end compiler implements indirect pull constant loads using what basically amounts to a texelFetch instruction. For pull constant loads with constant offsets, however, we use an oword block read message which goes through the constant cache and reads a whole cache line at a time. Because of these two things, direct pull constant loads are much faster than indirect pull constant loads. Because all loads from dynamically bound buffers are indirect, the user takes a substantial performance penalty when using this "performance" feature. There are two potential solutions I have seen for this problem. The alternate solution is to continue pushing offsets into the shader but wire things up in the back-end compiler so that we use the oword block read messages anyway. The only reason we can do this because we know a priori that the dynamic offsets are uniform and 16-byte aligned. Unfortunately, thanks to the 16-byte alignment requirement of the oword messages, we can't do some general "if the indirect offset is uniform, use an oword message" sort of thing. This solution, however, is recommended for a few of reasons: 1. Surface states are relatively cheap. We've been using on-the-fly surface state setup for some time in GL and it works well. Also, dynamic offsets with on-the-fly surface state should still be cheaper than allocating new descriptor sets every time you want to change a buffer offset which is really the only requirement of the dynamic offsets feature. 2. This requires substantially less compiler plumbing. Not only can we delete the entire apply_dynamic_offsets pass but we can also avoid having to add architecture for passing dynamic offsets to the back- end compiler in such a way that it can continue using oword messages. 3. We get robust buffer access range-checking for free. Because the offset and range are baked into the surface state, we no longer need to pass ranges around and do bounds-checking in the shader. 4. Once we finally get UBO pushing implemented, it will be much easier to handle pushing chunks of dynamic descriptors if the compiler remains blissfully unaware of dynamic descriptors. This commit improves performance of The Talos Principle on ULTRA settings by around 50% and brings it nicely into line with OpenGL performance. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-13 07:58:00 -07:00
Jason Ekstrand	0421813588	anv: Make the framebuffer-renderpass format assert non-fatal This should let Dota 2 run on debug builds though it will spew errors like mad. Hopefully, Valve will get this fixed sooner rather than later. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-03-07 15:22:16 -08:00
Nanley Chery	76b8cc2a1c	anv/cmd_buffer: Centralize automatic layout transitions Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	0a72b5f3cb	anv/cmd_buffer: Add attachment transitioning functions This is needed to transition input attachments. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	c78a959bcf	anv/cmd_buffer: Enable render pass awareness v2: Update cmd_state_reset (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	608d17b80e	anv: Store the user's VkAttachmentReference We will be using the image layout. Store the full struct directly from the user. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	6326f0f4be	anv/cmd_buffer: Remove extra resolve for certain depth buffers Due to recent commits, the sampler now bypasses the auxiliary HiZ buffer when reading from a depth image subresource that is in the general layout. Remove this unneeded resolve. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	ea744912b3	anv/cmd_buffer: Conditionally choose the sampled image surface state Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:55 -08:00
Nanley Chery	54d29ee65f	anv: Update the HiZ sampling helper Validate the inputs, verify that this image has a depth buffer, use gen_device_info instead of v2: - Add parenthesis (Jason Ekstrand) - Make parameters const - Use gen_device_info instead of gen - Pass aspect to missed function in transition_depth_buffer Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:54 -08:00
Nanley Chery	172747a963	anv/cmd_buffer: Replace layout_to_hiz_usage() Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 13:17:54 -08:00
Iago Toral Quiroga	7ad692d8e2	anv: do not subtract the base layer to compute depth in 3DSTATE_DEPTH_BUFFER According to the PRM description of the Depth field: "This field specifies the total number of levels for a volume texture or the number of array elements allowed to be accessed starting at the Minimum Array Element for arrayed surfaces" However, ISL defines array_len as the length of the range [base_array_layer, base_array_layer + array_len], so it already represents a value relative to the base array layer like the hardware expects. v2: Depth is defined as a U11-1 field, so subtract 1 from the actual value (Jason) This fixes a number of new CTS tests that would crash otherwise: dEQP-VK.pipeline.render_to_image.* Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-03-02 09:04:03 +01:00
Jason Ekstrand	261092f7d4	anv: Enable MSAA compression This just enables basic MSAA compression (no fast clears) for all multisampled surfaces. This improves the framerate of the Sascha "multisampling" demo by 76% on my Sky Lake laptop. Running Talos on medium settings with 8x MSAA, this improves the framerate in the benchmark by 80%. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org>	2017-02-23 12:10:42 -08:00
Jason Ekstrand	f31ed6d0cd	anv: Take a device parameter in anv_state_flush This allows the helper to check for llc instead of having to do it manually at all the call sites. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	f9d7d27d6d	anv: Rename clflush_range and state_clflush It's a bit shorter and easier to work with. Also, we're about to add a helper called clflush which does the clflush but without any memory fencing. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	b6b03329af	anv: Put everything about queries in genX_query.c Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	40087bcb51	anv/query: Perform CmdResetQueryPool on the GPU This fixes a some rendering corruption in The Talos Principle Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-02-21 12:26:35 -08:00
Jason Ekstrand	e8d52dab48	anv: Add support for the PMA fix on Broadwell This helps Dota 2 on Broadwell by 8-9%. I also hacked up the driver and used the Sascha "shadowmapping" demo to get some results. Setting uses_kill to true dropped the framerate on the demo by 25-30%. Enabling the PMA fix brought it back up to around 90% of the original framerate. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-02-14 14:18:55 -08:00
Alex Smith	924a8cbb40	anv: Add support for shaderStorageImageWriteWithoutFormat This allows shaders to write to storage images declared with unknown format if they are decorated with NonReadable ("writeonly" in GLSL). Previously an image view would always use a lowered format for its surface state, however when a shader declares a write-only image, we should use the real format. Since we don't know at view creation time whether it will be used with only write-only images in shaders, create two surface states using both the original format and the lowered format. When emitting the binding table, choose between the states based on whether the image is declared write-only in the shader. Tested on both Sascha Willems' computeshader sample (with the original shaders and ones modified to declare images writeonly and omit their format qualifiers) and on our own shaders for which we need support for this. Signed-off-by: Alex Smith <asmith@feralinteractive.com> Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-14 08:16:52 -08:00
Jason Ekstrand	0ef14cdc98	anv/cmd_buffer: Return a VkResult from verify_cmd_parser This fixes a "statement with no effect" compiler warning Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2017-02-14 07:50:13 -08:00
Nanley Chery	93b819154f	anv/cmd_buffer: Don't temporarily enable CCS_E within a render pass Compressing a render target and decompressing it in the same single-subpass render pass may waste bandwidth. While this may be beneficial in some circumstances, it does not help in all. Reclaims about 1.95% FPS for Dota 2 on some configurations. v2 (Jason Ekstrand): - Provide a more thorough comment - Enable CCS_D for input attachments v3 (Jason Ekstrand): - Provide performance numbers Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-03 09:23:13 -08:00
Jason Ekstrand	ab06fc6684	intel/isl: Rename supports_lossless_compression to supports_ccs_e The term "lossless compression" could potentially mean multisample color compression, single-sample color compression or HiZ because they are all lossless. The term CCS_E, however, has a very precise meaning; in ISL and is only used to refer to single-sample color compression. It's also much shorter which is nice. Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2017-02-02 13:33:43 -08:00
Lionel Landwerlin	9413e11869	anv: emit DrawID if needed v2: use define for buffer ID (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-02 01:33:06 +00:00
Lionel Landwerlin	289aef771d	anv: move BaseVertexID/BaseInstanceID vertex buffer index to 31 v2: use define for buffer ID (Jason) Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-02-02 01:32:48 +00:00
Jason Ekstrand	ccdd5b3738	anv: Don't use bogus alpha swizzles For RGB formats in Vulkan, we use the corresponding RGBA format with a swizzle of RGB1. While this swizzle is exactly what we want for texturing, it's not allowed for rendering according to the docs. While we haven't been getting hangs or anything, we should probably obey the docs. This commit just sanitizes all render swizzles so that the alpha channel maps to ALPHA. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2017-02-01 14:41:06 -08:00
Jason Ekstrand	92128590bc	anv: Improve flushing around STATE_BASE_ADDRESS It is not clear from the docs exactly how pipelined STATE_BASE_ADDRESS actually is. We know from experimentation that we need to flush the render cache prior to emitting STATE_BASE_ADDRESS and invalidate the texture cache afterwards. The only thing the PRM says is that, on gen8+ we're supposed to invalidate the state cache after STATE_BASE_ADDRESS but experimentation has indicated that doing so does nothing whatsoever. Since we don't really know, let's do just a bit more flushing in the hopes that this won't be a problem again. In particular: 1) Do a CS stall before we emit STATE_BASE_ADDRESS since we don't really know whether or not it's pipelined. 2) Do a data cache flush in case what runs before STATE_BASE_ADDRESS is a compute shader. 3) Invalidate the state and constant caches after STATE_BASE_ADDRESS because the state may be getting cached there (we don't really know). Reported-by: Mark Janes <mark.a.janes@intel.com> Tested-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-31 18:49:44 -08:00
Jason Ekstrand	f1f9794118	anv: Flush render cache before STATE_BASE_ADDRESS on gen7 We had no good reason for not doing this on gen7 before but we didn't know it was needed. Recently, when trying update to Vulkan CTS version 1.0.2 in our CI system, Mark discovered GPU hangs on Haswell that appear to be STATE_BASE_ADDRESS related. This commit fixes them. Reported-by: Mark Janes <mark.a.janes@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>	2017-01-31 18:49:44 -08:00
Nanley Chery	33e0c5d003	anv/cmd_buffer: Use the proper depth input attachment surface state Commit `2852efcda4` moved the location of the depth input attachment surface state from the render pass to the image view, but failed to update the surface state location used when emitting the binding table. Fix this by loading the surface state from the correct location. Fixes: dEQP-VK.renderpass.formats.d16_unorm.input.* dEQP-VK.renderpass.formats.d24_unorm_s8_uint.input.* dEQP-VK.renderpass.formats.d32_sfloat.input.* dEQP-VK.renderpass.formats.x8_d24_unorm_pack32.input.* dEQP-VK.renderpass.attachment_allocation.input_output.93 dEQP-VK.renderpass.attachment_allocation.input_output.92 dEQP-VK.renderpass.attachment_allocation.input_output.82 dEQP-VK.renderpass.attachment_allocation.input_output.46 Cc: "17.0" <mesa-stable@lists.freedesktop.org> Reviewed-by: Iago Toral Quiroga <itoral@igalia.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>	2017-01-31 09:00:50 -08:00
Nanley Chery	64272d4f1b	anv: Avoid some resolves for samplable HiZ buffers v2: Simplify nested ifs (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:21 -08:00
Nanley Chery	58af615636	anv: Perform HiZ resolves only on layout transitions This is a better mapping to the Vulkan API and improves performance in all tested workloads. v2: Remove unnecessary image view aspect checks (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	2852efcda4	anv: Disable HiZ for input attachments v2 (Jason Ekstrand): - Add spec citation - Drop conditional Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	b62d8ad2ae	anv: Avoid resolves incurred by fast depth clears Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	104ce1dbab	anv: Store depth stencil layouts Store the current and requested depth stencil layouts so that we can perform the appropriate HiZ resolves for a given transition while recording a render pass. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	2e2cf78a51	anv: Add helpers to handle depth buffer layout transitions Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	462a4c9648	anv: Use the gen8 BLORP HiZ resolving function Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	3b7106c181	anv: Use gen8 BLORP HiZ clearing functions Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:20 -08:00
Nanley Chery	64fb5b0d51	anv: Enable HiZ support for multiple subpasses We'll be using layout transitions later on in the series which can occur within and between subpasses. Turn this on now to simplify the change later. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:19 -08:00
Nanley Chery	168985fca1	anv: Use ::anv_attachment_state for toggling HiZ per subpass We're about to enable HiZ support for multiple subpasses. Use this field to keep track of whether or not subpass operations should treat the depth buffer as having an auxiliary HiZ buffer. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:19 -08:00
Nanley Chery	055ff2ec52	anv: Replace anv_image_has_hiz() with ISL_AUX_USAGE_HIZ The helper doesn't provide additional functionality over the current infrastructure. v2: Add comment to anv_image::aux_usage (Jason Ekstrand) v3: Clarify comment for aux_usage (Jason Ekstrand) Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-12 20:52:19 -08:00
Nanley Chery	9f1d3a0c97	anv/cmd_buffer: Fix programmed HiZ qpitch Match the comment above the field by using units of pixels and not HiZ blocks. Cc: mesa-stable@lists.freedesktop.org Suggested-by: Jason Ekstrand <jason@jlekstrand.net> Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-11 17:35:59 -08:00
Nanley Chery	61992e0afe	anv/cmd_buffer: Fix arrayed depth/stencil attachments Enable multiple layers of the depth/stencil buffers to be accessible. Fixes the crucible test, func.depthstencil.arrayed_clear. Cc: mesa-stable@lists.freedesktop.org Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-11 17:35:59 -08:00
Kenneth Graunke	dcca706b4e	anv: Clamp depth buffer dimensions to be at least 1. When there are no framebuffer attachments, fb->width and fb->height will be 0. Subtracting 1 results in 4294967295 which is too large for the field, causing genxml assertions when trying to create the packet. In this case, we can just program it to 1. Caught by dEQP-VK.tessellation.tesscoord.triangles_equal_spacing. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2017-01-10 13:27:31 -08:00
Grazvydas Ignotas	3a1b15c392	anv: fix release build unused variable warnings Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Reviewed-by: Eduardo Lima Mitev <elima@igalia.com>	2016-12-11 20:03:14 +01:00
Jason Ekstrand	27433b26b1	anv/cmd_buffer: Actually use the stencil dimension In an attempt to fix 3DSTATE_DEPTH_BUFFER for stencil-only cases, I accidentally kept setting the SurfaceType to 2D in the stencil-only case thanks to a copy+paste error. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-11-30 17:42:42 -08:00
Jason Ekstrand	f469235a6e	anv/cmd_buffer: Remove the 1-D case from the HiZ QPitch calculation The 1-D special case doesn't actually apply to depth or HiZ. I discovered this while converting BLORP over to genxml and ISL. The reason is that the 1-D special case only applies to the new Sky Lake 1-D layout which is only used for LINEAR 1-D images. For tiled 1-D images, such as depth buffers, the old gen4 2-D layout is used and the QPitch should be in rows. Reviewed-by: Nanley Chery <nanley.g.chery@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-28 20:17:29 -08:00
Jason Ekstrand	d4ef87c1bb	anv/cmd_buffer: Set the correct surface type for depth/stencil Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-11-28 20:17:16 -08:00
Jason Ekstrand	3fd79558be	anv: Enable fast clears on gen7-8 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:24:29 -08:00
Jason Ekstrand	5e8069a572	anv: Add support for fast clears on gen9 Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:24:29 -08:00
Jason Ekstrand	8d1ccd6729	anv/cmd_buffer: Apply remaining flushes in EndCommandBuffer Otherwise, some pipe flushes may just never happen. This is unlikely to cause problems depending on how the kernel schedules batches, but we shouldn't count on it. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:24:29 -08:00
Jason Ekstrand	d1d6b78898	anv/cmd_buffer: Make setup_attachments take a RenderPassBeginInfo Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 14:13:53 -08:00
Jason Ekstrand	1d5ac0a462	anv: Set up binding tables and surface states for input attachments This commit adds the last remaining bits to support input attachments in the Intel Vulkan driver. For color and depth attachments, we allocate an input attachment surface state during vkCmdBeginRenderPass like we do for the render target surface states. This is so that we can incorporate the clear color and aux information as used in rendering. For stencil, we just treat it like a regular texture because we don't there is no aux. Also, only having to worry about at most one input attachment surface for each attachment makes some of the vkCmdBeginRenderPass code simpler. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:44:55 -08:00
Jason Ekstrand	57174d6042	anv/cmd_buffer: Fix pipeline barriers for input attachments We were using VK_IMAGE_ACCESS_COLOR_ATTACHMENT_READ_BIT to detect an input attachment read. We should use VK_IMAGE_ACCESS_INPUT_ATTACHMENT_READ_BIT instead. Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>	2016-11-22 13:44:55 -08:00
Jason Ekstrand	7a2cfd4adb	anv/cmd_buffer: Emit CS push constants after binding tables Emitting binding tables can cause push constants to be dirtied if the shader uses images so we need to handle push constants later.	2016-11-22 10:10:38 -08:00
Jason Ekstrand	3ef8dff865	anv/cmd_buffer: Add an assert on emit_binding_table failure The != VK_SUCCESS case is really only capable of handling the one error. This assert makes things a bit safer if something else goes wrong. Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>	2016-11-22 08:50:27 -08:00
Jason Ekstrand	f680a01ad4	anv/cmd_buffer: Emit a CS stall before setting a CS pipeline Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-22 08:06:33 -08:00
Jason Ekstrand	054e48ee0e	anv/cmd_buffer: Re-emit MEDIA_CURBE_LOAD when CS push constants are dirty This can happen even if the binding table isn't changed. For instance, you could have dynamic offsets with your descriptor set. This fixes the new stress.lots-of-surface-state.cs.dynamic cricible test. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-22 08:06:33 -08:00
Jason Ekstrand	722ab3de9f	anv/cmd_buffer: Handle running out of binding tables in compute shaders If we try to allocate a binding table and fail, we have to get a new binding table block, re-emit STATE_BASE_ADDRESS, and then try again. We already handle this correctly for 3D and blorp but it never got handled for CS. This fixes the new stress.lots-of-surface-state.cs.static crucible test. Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-22 08:06:33 -08:00
Jason Ekstrand	a8b85f1f77	anv: Implement a depth stall restriction on gen7 Fixes around 60 Vulkan CTS tests on Haswell Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-20 20:40:40 -08:00
Jason Ekstrand	edb7f67bd9	anv/image: Add an aux_usage field for "default" aux Initially, the field is set to ISL_AUX_USAGE_NONE so this commit shouldn't bring any functional changes. Setting this field to something else will cause all sampled and storage image views to be created with AUX and blorp will start trying to respect it so set with care.	2016-11-17 12:03:24 -08:00
Jason Ekstrand	338cdc172a	anv: Add initial support for Sky Lake color compression This commit adds basic support for color compression. For the moment, color compression is only enabled within a render pass and a full resolve is done before the render pass finishes. All texturing operations still happen with CCS disabled.	2016-11-17 12:03:24 -08:00
Jason Ekstrand	c3eb58664e	anv/image: Rename hiz_surface to aux_surface	2016-11-17 12:03:24 -08:00
Jason Ekstrand	818c7bfb31	anv/cmd_buffer: Refactor surface state relocation handling Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	9be9f5f1c7	anv/cmd_buffer: Pull add_surface_state_reloc into genX_cmd_buffer.c Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-17 12:03:24 -08:00
Jason Ekstrand	633677194f	Allocate a null state whenever there is depth/stencil Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:32:20 -08:00
Jason Ekstrand	a380f95461	anv: Set framebuffer to NULL in secondary command buffers Nothing that is allowed to be called within a secondary now relies on the framebuffer. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-11-16 10:32:15 -08:00
Jason Ekstrand	d2b4a9da03	anv: Rework the way render target surfaces are allocated This commit moves the allocation and filling out of surface state from CreateImageView time to BeginRenderPass time. Instead of allocating the render target surface state as part of the image view, we allocate it in the command buffer state at the same time that we set up clears. For secondary command buffers, we allocate memory for the surface states in BeginCommandBuffer but don't fill them out; instead, we use our new SOL-based memcpy function to copy the surface states from the primary command buffer. This allows us to handle secondary command buffers without the user specifying the framebuffer ahead-of-time. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:07 -08:00
Jason Ekstrand	e283cd549c	anv/cmd_buffer: Expose add_surface_state_reloc as an inline helper Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:07 -08:00
Jason Ekstrand	858b75563f	anv/cmd_buffer: Use the surface state alloc helper in null_surface_state Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:11:07 -08:00
Jason Ekstrand	b3bc806855	intel/isl: Add some basic info about RENDER_SURFACE_STATE to isl_device Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-11-16 10:10:26 -08:00
Jason Ekstrand	d33e2ad67c	anv: Move INTERFACE_DESCRIPTOR_DATA setup to the pipeline There are a few dynamic bits, namely binding table and sampler addresses, but most of it is static and really belongs in the pipeline. It certainly doesn't belong in flush_compute_descriptor_set. We'll use the same state merging trick we use for gen7 DEPTH_STENCIL. Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>	2016-11-16 10:09:16 -08:00
Jason Ekstrand	623e1e06d8	anv/pipeline: Get rid of the kernel pointer fields Now that we have anv_shader_bin, they're completely redundant with other information we have in the pipeline. For vertex shaders, we also go through way too much work to put the offset in one or the other field and then look at which one we put it in later. Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>	2016-11-16 10:08:38 -08:00
Jason Ekstrand	a6c3d0f92b	anv/cmd_buffer: Enable a CS stall workaround for Sky Lake gt4 This fixes hangs in Dota2 Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>	2016-11-10 15:21:18 -08:00
Jason Ekstrand	1e3e347fd5	anv/cmd_buffer: Take a command buffer instead of a batch in two helpers Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> Cc: "12.0 13.0" <mesa-stable@lists.freedesktop.org>	2016-11-10 15:21:18 -08:00
Jason Ekstrand	8b61c57049	anv: Move relocation handling from EndCommandBuffer to QueueSubmit Ever since the early days of the Vulkan driver, we've been setting up the lists of relocations at EndCommandBuffer time. The idea behind this was to move some of the CPU load out of QueueSubmit which the client is required to lock around and into command buffer building which could be done in parallel. Then QueueSubmit basically just becomes a bunch of execbuf2 calls. Technically, this works. However, when you start to do more in QueueSubmit than just execbuf2, you start to run into problems. In particular, if a block pool is resized between EndCommandBuffer and QueueSubmit, the list of anv_bo's and the execbuf2 object list can get out of sync. This can cause problems if, for instance, you wanted to do relocations in userspace. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com> Cc: "13.0" <mesa-stable@lists.freedesktop.org>	2016-11-09 11:31:12 -08:00
Jason Ekstrand	7998e37774	anv/cmd_buffer: Move descriptor flushing into genX_cmd_buffer.c It really should have gone here all along. We were trying a bit too hard to make it gen-agnostic just because it didn't have any #if's. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	1f3e6468d2	anv/cmd_buffer: Unify flush_compute_state across gens With one small genxml change, the two versions were basically identical. The only differences were one #define for HSW+ and a field that is missing on Haswell but exists everywhere else. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Jason Ekstrand	2314c9ed2e	anv/cmd_buffer: Move Begin/End/Execute to genX_cmd_buffer.c vkBeginCommandBuffer and vkCmdExecuteCommands both call into the gen-specific emit_state_base_address function and vkEndCommandBuffer belongs with begin. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>	2016-10-17 17:41:35 -07:00
Lionel Landwerlin	696f5c1853	anv: replace , with ; in anv_batch_emit() Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>	2016-10-17 18:16:38 +01:00
Jason Ekstrand	29e289fa65	anv/image: Add an isl_view to anv_image_view Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>	2016-10-14 15:39:41 -07:00
Nanley Chery	d8aacc24cc	anv: Enable fast depth clears Provides an FPS increase of ~30% on the Sascha triangle and multisampling demos. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Chad Versace <chadversary@chromium.org>	2016-10-07 12:54:18 -07:00
Chad Versace	78d074b87a	anv/cmd_buffer: Enable rendering to HiZ Nanley Chery: (rebase) - Resolve conflicts with new anv_batch_emit macro (amend) - Handle a QPitch TODO - Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems - Only use HiZ for single-subpass renderpasses - Emit the HiZ instruction before the stencil instruction to follow the optimized clear sequence specified in the PRMs - Don't modify clear params - Enable resolves when a HiZ buffer is used to ensure depth buffer validity Provides an FPS increase of ~15% on the Sascha triangle and multisampling demos. Signed-off-by: Nanley Chery <nanley.g.chery@intel.com> Reviewed-by: Chad Versace <chadversary@chromium.org> Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-07 12:54:18 -07:00
Jason Ekstrand	c81ec84c1e	anv/cmd_buffer: Move the clear_subpasses calls to set_subpass Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-06 16:52:31 -07:00
Jason Ekstrand	b548fdbed5	anv/cmd_buffer: Don't call set_subpass in a secondary Initially, we had intended set_subpass to be an interesting function that did whatever (presumably a lot) setup we needed for a subpass. In reality, it just sets a pointer and a dirty bit and then emits depth and stencil state. When we call BeginCommandBuffer on a secondary, there's no point in setting depth and stencil state since it will already be set by the primary. Instead, the only thing we need to do at the start of a secondary is set the subpass pointer and the dirty bit. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net> Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>	2016-10-06 16:52:31 -07:00
Jason Ekstrand	fe4e276b02	anv/cmd_buffer: Rework descriptor dirtying in set_subpass We have a DIRTY_RENDER_TARGETS flag and that makes a lot more sense than just dirtying fragment descriptors. We're checking for it in some of the gen7 code but unfortunately, nothing was setting it and it didn't do what it was supposed to do in cmd_buffer_flush_state. Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>	2016-10-06 16:52:31 -07:00

1 2 3 4

196 Commits