Store the current and requested depth stencil layouts so that we can
perform the appropriate HiZ resolves for a given transition while
recording a render pass.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We'll be using layout transitions later on in the series which can occur
within and between subpasses. Turn this on now to simplify the change
later.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
We're about to enable HiZ support for multiple subpasses. Use this field
to keep track of whether or not subpass operations should treat the
depth buffer as having an auxiliary HiZ buffer.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The helper doesn't provide additional functionality over the current
infrastructure.
v2: Add comment to anv_image::aux_usage (Jason Ekstrand)
v3: Clarify comment for aux_usage (Jason Ekstrand)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Match the comment above the field by using units of pixels and not HiZ
blocks.
Cc: mesa-stable@lists.freedesktop.org
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Enable multiple layers of the depth/stencil buffers to be accessible.
Fixes the crucible test, func.depthstencil.arrayed_clear.
Cc: mesa-stable@lists.freedesktop.org
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
When there are no framebuffer attachments, fb->width and fb->height will
be 0. Subtracting 1 results in 4294967295 which is too large for the
field, causing genxml assertions when trying to create the packet.
In this case, we can just program it to 1.
Caught by dEQP-VK.tessellation.tesscoord.triangles_equal_spacing.
Signed-off-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
In an attempt to fix 3DSTATE_DEPTH_BUFFER for stencil-only cases, I
accidentally kept setting the SurfaceType to 2D in the stencil-only case
thanks to a copy+paste error.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
The 1-D special case doesn't actually apply to depth or HiZ. I discovered
this while converting BLORP over to genxml and ISL. The reason is that the
1-D special case only applies to the new Sky Lake 1-D layout which is only
used for LINEAR 1-D images. For tiled 1-D images, such as depth buffers,
the old gen4 2-D layout is used and the QPitch should be in rows.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Otherwise, some pipe flushes may just never happen. This is unlikely to
cause problems depending on how the kernel schedules batches, but we
shouldn't count on it.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
This commit adds the last remaining bits to support input attachments in
the Intel Vulkan driver. For color and depth attachments, we allocate an
input attachment surface state during vkCmdBeginRenderPass like we do for
the render target surface states. This is so that we can incorporate the
clear color and aux information as used in rendering. For stencil, we just
treat it like a regular texture because we don't there is no aux. Also,
only having to worry about at most one input attachment surface for each
attachment makes some of the vkCmdBeginRenderPass code simpler.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
We were using VK_IMAGE_ACCESS_COLOR_ATTACHMENT_READ_BIT to detect an input
attachment read. We should use VK_IMAGE_ACCESS_INPUT_ATTACHMENT_READ_BIT
instead.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
The != VK_SUCCESS case is really only capable of handling the one error.
This assert makes things a bit safer if something else goes wrong.
Suggested-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
This can happen even if the binding table isn't changed. For instance, you
could have dynamic offsets with your descriptor set. This fixes the new
stress.lots-of-surface-state.cs.dynamic cricible test.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
If we try to allocate a binding table and fail, we have to get a new
binding table block, re-emit STATE_BASE_ADDRESS, and then try again. We
already handle this correctly for 3D and blorp but it never got handled for
CS. This fixes the new stress.lots-of-surface-state.cs.static crucible test.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
Initially, the field is set to ISL_AUX_USAGE_NONE so this commit shouldn't
bring any functional changes. Setting this field to something else will
cause all sampled and storage image views to be created with AUX and blorp
will start trying to respect it so set with care.
This commit adds basic support for color compression. For the moment,
color compression is only enabled within a render pass and a full resolve
is done before the render pass finishes. All texturing operations still
happen with CCS disabled.
Nothing that is allowed to be called within a secondary now relies on the
framebuffer.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
This commit moves the allocation and filling out of surface state from
CreateImageView time to BeginRenderPass time. Instead of allocating the
render target surface state as part of the image view, we allocate it in
the command buffer state at the same time that we set up clears. For
secondary command buffers, we allocate memory for the surface states in
BeginCommandBuffer but don't fill them out; instead, we use our new
SOL-based memcpy function to copy the surface states from the primary
command buffer. This allows us to handle secondary command buffers without
the user specifying the framebuffer ahead-of-time.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
There are a few dynamic bits, namely binding table and sampler addresses,
but most of it is static and really belongs in the pipeline. It certainly
doesn't belong in flush_compute_descriptor_set. We'll use the same state
merging trick we use for gen7 DEPTH_STENCIL.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Now that we have anv_shader_bin, they're completely redundant with other
information we have in the pipeline. For vertex shaders, we also go
through way too much work to put the offset in one or the other field and
then look at which one we put it in later.
Reviewed-by: Timothy Arceri <timothy.arceri@collabora.com>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Ever since the early days of the Vulkan driver, we've been setting up the
lists of relocations at EndCommandBuffer time. The idea behind this was to
move some of the CPU load out of QueueSubmit which the client is required
to lock around and into command buffer building which could be done in
parallel. Then QueueSubmit basically just becomes a bunch of execbuf2
calls.
Technically, this works. However, when you start to do more in QueueSubmit
than just execbuf2, you start to run into problems. In particular, if a
block pool is resized between EndCommandBuffer and QueueSubmit, the list of
anv_bo's and the execbuf2 object list can get out of sync. This can cause
problems if, for instance, you wanted to do relocations in userspace.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Cc: "13.0" <mesa-stable@lists.freedesktop.org>
It really should have gone here all along. We were trying a bit too hard
to make it gen-agnostic just because it didn't have any #if's.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
With one small genxml change, the two versions were basically identical.
The only differences were one #define for HSW+ and a field that is missing
on Haswell but exists everywhere else.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
vkBeginCommandBuffer and vkCmdExecuteCommands both call into the
gen-specific emit_state_base_address function and vkEndCommandBuffer
belongs with begin.
Signed-off-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Provides an FPS increase of ~30% on the Sascha triangle and multisampling
demos.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Nanley Chery:
(rebase)
- Resolve conflicts with new anv_batch_emit macro
(amend)
- Handle a QPitch TODO
- Emit 3DSTATE_HIER_DEPTH_BUFFER on pre-BDW systems
- Only use HiZ for single-subpass renderpasses
- Emit the HiZ instruction before the stencil instruction to follow the
optimized clear sequence specified in the PRMs
- Don't modify clear params
- Enable resolves when a HiZ buffer is used to ensure depth buffer validity
Provides an FPS increase of ~15% on the Sascha triangle and multisampling
demos.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Chad Versace <chadversary@chromium.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>