Most of the code in anv_meta_blit2d() is borrowed from do_buffer_copy().
Create an image and image view for each rectangle.
Note: For tiled RGB images, ISL will align the image's row_pitch up to
the nearest tile width.
v2 (Jason):
Keep pitch in units of bytes
Make src_format and dst_format variables
s/dest/dst/ in every usage
v3: Fix dst_image width
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
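For illustration, the row_pitch note above can be sketched as a plain
align-up. This is a minimal sketch, not the ISL code: tiled_row_pitch()
and its parameters are hypothetical names, and the tile width is passed
in by the caller rather than looked up from a tiling format.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to the next multiple of a; a must be a power of two. */
static inline uint32_t
align_u32(uint32_t v, uint32_t a)
{
   assert(a != 0 && (a & (a - 1)) == 0);
   return (v + a - 1) & ~(a - 1);
}

/* Hypothetical helper: row pitch in bytes for a tiled image whose
 * rows are padded to a whole number of tiles.
 *   width_px     - image width in pixels
 *   bs           - block size (bytes per pixel here)
 *   tile_width_B - tile width in bytes (assumed to be a power of two)
 */
static inline uint32_t
tiled_row_pitch(uint32_t width_px, uint32_t bs, uint32_t tile_width_B)
{
   return align_u32(width_px * bs, tile_width_B);
}
```

With a 3-byte RGB format, a 100-pixel row is 300 bytes, so a caller
that sizes its destination image from the unaligned pitch would get
the wrong width once the pitch is padded.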
Some fields are unnecessary. The variables "pitch" and "bs" are used
for consistency with ISL.
v2: Keep pitch in units of bytes (Jason)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
This API is designed to be an abstraction that sits between the VkCmdCopy
commands and the hardware. The idea is that it is simple enough that it
*should* be implementable using the blitter but with enough extra data that
we can implement it with the 3-D pipeline efficiently. One design
objective is to allow the user to supply enough information that we can
handle most blit operations with a single draw call even if they require
copying multiple rectangles.
This is a preparatory commit that will simplify the future usage of
this function.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
If a linear image is requested, the only possible result should be a
linearly-tiled surface.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
If a specific bit is set, the intention to create a surface with a
specific tiling format should be respected.
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
For OpenGL, see commit 9a939ebb47.
Fixes:
* dEQP-VK.compute.indirect_dispatch.upload_buffer.empty_command
* dEQP-VK.compute.indirect_dispatch.gen_in_compute.empty_command
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
While Broadwell is very good about UINT formats, HSW is more restrictive.
Neither R8G8B8_UINT nor R16G16B16_UINT really exists on HSW. It should be
safe to just use the unorm formats.
v2: Don't cast the enum to a boolean (Jason)
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
After commit 3ecd357d81, it may be possible for the VS to be assigned all
of the URB space.
On Ivy Bridge, this will cause the offset for the other stages to be
16, which cannot be packed into the ConstantBufferOffset field of
3DSTATE_PUSH_CONSTANT_ALLOC_*.
Instead we can set the offset to zero if the stage size is zero.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
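The offset rule above can be sketched as follows. This is an
illustration under assumptions, not the driver code: the stage order,
the struct and the function name are hypothetical, and sizes are
tracked in kB.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_STAGES 5  /* assumed order: VS, HS, DS, GS, PS */

struct push_alloc {
   uint32_t offset_kb;
   uint32_t size_kb;
};

/* Assign consecutive offsets to each stage. A stage with no space
 * gets offset 0 instead of the running offset, so a value such as 16
 * never has to be packed for a stage that is not using the range.
 */
static void
compute_push_alloc(const uint32_t size_kb[NUM_STAGES],
                   struct push_alloc out[NUM_STAGES])
{
   assert(size_kb != 0 && out != 0);

   uint32_t next_kb = 0;
   for (int s = 0; s < NUM_STAGES; s++) {
      out[s].size_kb = size_kb[s];
      out[s].offset_kb = size_kb[s] ? next_kb : 0;
      next_kb += size_kb[s];
   }
}
```

If the VS takes all 16kB, the running offset after it is 16, but every
later stage has size zero and so reports offset 0.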
This allows us to avoid doing some unneeded work on the meta paths where we
know that the image view will be used for exactly one thing. The meta
paths also sometimes do things that aren't quite valid, like setting the
array slice on a 3-D texture, and we want to limit the number of paths that
need to be able to sensibly handle the lies.
If you have an out-of-tree build, gen8_pack.h and friends will not be in
the same folder as genX_pack.h so this will be a problem. We fixed
out-of-tree earlier by adding the genxml folder to the includes for the
vulkan driver. However, this is not a good long-term solution because we
want to use it in ISL as well.
Consecutive tiles are separated by the size of the tile, not by the
logical tile width.
v2: Remove extra subtraction (Ville)
Add parentheses (Jason)
v3: Update the unit tests for the function
Signed-off-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
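To illustrate the tile-stride fix above: stepping one tile to the right
advances by the full tile size in bytes, not by the tile width. The
sketch below is not the ISL function. tile_info and
tile_start_offset() are hypothetical names, and the layout assumes each
row of tiles is stored contiguously with a row pitch that is a multiple
of the tile width.

```c
#include <assert.h>
#include <stdint.h>

struct tile_info {
   uint32_t width_B;   /* logical tile width in bytes */
   uint32_t height;    /* tile height in rows */
};

/* Byte offset of the start of the tile containing (x_B, y), where
 * x_B is a byte offset within the row and y is a row index.
 */
static uint32_t
tile_start_offset(struct tile_info t, uint32_t row_pitch_B,
                  uint32_t x_B, uint32_t y)
{
   assert(row_pitch_B % t.width_B == 0);

   const uint32_t tile_size_B = t.width_B * t.height;
   const uint32_t tile_col = x_B / t.width_B;
   const uint32_t tile_row = y / t.height;

   /* A full row of tiles covers row_pitch_B * height bytes, and
    * consecutive tiles within that row are tile_size_B bytes apart.
    */
   return tile_row * row_pitch_B * t.height + tile_col * tile_size_B;
}
```

For a 128-byte by 32-row tile, the second tile in a row starts at byte
4096, whereas multiplying by the logical width would give 128.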
Both logic and indentation suggest that the semicolons were not intended here.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Previously we allocated 4kB of push constant space for VS, GS, and PS
(for a total of 12kB) no matter what. This works, but doesn't fully
utilize the space - we have 16kB or 32kB of space.
This makes anv use the same method as brw - divide up the space evenly
among all active shader stages. This means HS and DS would get space,
if those shader stages existed.
In the future, we can probably do better by inspecting how many push
constants each shader stage uses, and weight things accordingly. But
this is strictly better than the old code, and ideally we'd justify
a fancier solution with actual performance data.
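A minimal sketch of the even split described above, under assumptions:
the stage enum, the bitmask parameter and split_push_space() are
hypothetical, and any remainder from the integer division is simply
left unallocated here, which may differ from what brw and anv do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stage indices, for illustration only */
enum stage { STAGE_VS, STAGE_HS, STAGE_DS, STAGE_GS, STAGE_PS, STAGE_COUNT };

/* Divide total_kb evenly among the stages set in active_mask.
 * Inactive stages get zero; the division remainder is not handed out.
 */
static void
split_push_space(uint32_t total_kb, uint32_t active_mask,
                 uint32_t size_kb[STAGE_COUNT])
{
   assert(active_mask < (1u << STAGE_COUNT));

   uint32_t n = 0;
   for (int s = 0; s < STAGE_COUNT; s++)
      n += (active_mask >> s) & 1;

   for (int s = 0; s < STAGE_COUNT; s++) {
      const int active = (active_mask >> s) & 1;
      size_kb[s] = (n && active) ? total_kb / n : 0;
   }
}
```

With 16kB and only VS and PS active, each gets 8kB instead of the old
fixed 4kB.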
Rather than keeping separate {vs,hs,ds,gs}_start fields, we now store an
array indexed by the shader stage (MESA_SHADER_*). The 3DSTATE_URB_*
commands are also sequentially numbered. This makes it easy to just
emit them in a loop.
This simplifies the code a little, and also will make it easier to add
more credible HS and DS code later.
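The loop described above might look roughly like this. It is a sketch
only: the enum stands in for MESA_SHADER_*, urb_cmd is a hypothetical
stand-in for the packed 3DSTATE_URB_* command, and the base sub-opcode
of 48 is an assumption made for the example rather than a value taken
from the generated headers.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the MESA_SHADER_* indices used by the URB commands */
enum { SH_VS, SH_HS, SH_DS, SH_GS, SH_URB_STAGES };

/* Assumed sub-opcode of the first URB command (illustrative value) */
#define URB_SUBOPCODE_BASE 48

struct urb_cmd {
   uint32_t sub_opcode;
   uint32_t start;
};

/* Fill one command per stage. Because the commands are numbered
 * sequentially, the stage index selects both the array entry and the
 * sub-opcode. Returns the number of commands written.
 */
static int
emit_urb_cmds(const uint32_t start[SH_URB_STAGES],
              struct urb_cmd out[SH_URB_STAGES])
{
   assert(start != 0 && out != 0);

   int n = 0;
   for (int s = SH_VS; s < SH_URB_STAGES; s++) {
      out[n].sub_opcode = URB_SUBOPCODE_BASE + s;
      out[n].start = start[s];
      n++;
   }
   return n;
}
```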
The descriptor sizes array gives the total number of each type of
descriptor that will ever be allocated from the pool, not the total amount
that may be in any particular set. In our case, this simply means that we
have to sum the per-type descriptor counts to size the pool.
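As a rough illustration of that summing, assuming a struct shaped like
VkDescriptorPoolSize (pool_size and total_descriptors() are
hypothetical stand-ins so the sketch builds without the Vulkan
headers):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for VkDescriptorPoolSize */
struct pool_size {
   uint32_t type;              /* descriptor type, unused by the sum */
   uint32_t descriptor_count;  /* total of this type across the pool */
};

/* Total number of descriptors that will ever be allocated from the
 * pool, across all descriptor types.
 */
static uint32_t
total_descriptors(const struct pool_size *sizes, size_t count)
{
   assert(sizes != NULL || count == 0);

   uint32_t total = 0;
   for (size_t i = 0; i < count; i++)
      total += sizes[i].descriptor_count;
   return total;
}
```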
We can't use a global descriptor pool like we were because it's not
thread-safe. For now, we'll allocate them on-the-fly and that should work
fine. At some point in the future, we could do something where we
stack-allocate them or allocate them out of one of the state streams.
Descriptor pools are an optimization that lets applications allocate
descriptor sets through an externally synchronized object (that is,
unlocked). In our case it's also plugging a memory leak, since we
didn't track all allocated sets and failed to free them in
vkResetDescriptorPool() and vkDestroyDescriptorPool().
I've had people ask about the design of the pack functions, for example,
why we aren't using bitfields. I wrote up a bit of background on why and
how we ended up with the current design and we might as well keep that
with the code.
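For readers unfamiliar with the term, a pack function in this sense
takes a plain struct of field values and writes the encoded dwords with
explicit shifts, rather than overlaying a C bitfield on the command
buffer. The sketch below only shows that general shape. The command
layout, the field positions and every name in it are made up for the
example and are not taken from the generated headers.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up command with three fields, for illustration only */
struct example_cmd {
   uint32_t opcode;   /* bits 31:24 */
   uint32_t enable;   /* bit 8 */
   uint32_t length;   /* bits 7:0 */
};

/* Place v into bits [start, end] and check that it fits. */
static inline uint32_t
field(uint32_t v, unsigned start, unsigned end)
{
   const unsigned width = end - start + 1;
   assert(width == 32 || v < (1u << width));
   return v << start;
}

/* Encode the struct into one dword at dw. */
static inline void
example_cmd_pack(uint32_t *dw, const struct example_cmd *c)
{
   dw[0] = field(c->opcode, 24, 31) |
           field(c->enable, 8, 8) |
           field(c->length, 0, 7);
}
```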
As with anv_CmdCopyBufferToImage, compressed textures require special
handling during copies.
Reviewed-by: Kristian Høgsberg Kristensen <kristian.h.kristensen@intel.com>