So far we have been caching the first pipeline we produced and always
reusing that, which is obviously incorrect.
This change implements a proper cache and also takes care of releasing
the cached resources when the device is destroyed.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This is achieved by rendering a quad in the clear color for each layer
of each attachment being cleared. Right now we emit each clear in a
separate job with a single attachment framebuffer, but in the future
we may be able to extend the solution to using multiple render targets
and clear multiple attachments with a single job.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Specially after CmdBindDescriptorSets, it is likely that we would need
a new shader variant, like for example if sampler descriptor sets are
bound.
At that moment a new v3d key is populated, using as base the one used
at pipeline creation, so only cmd_buffer depending values are changed.
Then a new variant is requested. Note that internally it is handled
with a cache, so no new compilation will be done if not needed.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
So far, we were doing the compilation to qpu when the pipeline was
created (as part of vkCreateGraphicsPipeline).
But this would not be correct when some specific descriptors are
involved, like textures. For that case some nir lowerings depend on
the texture format, and that info is not available until the specific
descriptors are bound to the command buffer. In the same way, the same
command buffer with a given pipeline could get their descriptor bound
again.
So it would be needed to support compilation variants of the same
shader. So finally, the v3d_key would work as keys, as the variants
would be tracked with a hash table.
This commit introduces the new structures for that. What we were
building as the final qpu shader would become the initial default
variant for the pipeline. We are also saving the keys used at that
point, to avoid needing to fully regenerate them when a new variant is
created. Not just for performance, but also to avoid needing to track
the graphics pipeline create info structure.
The code to handle updating the current variant would be done on
following commits.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This include SAMPLER, COMBINED_IMAGE_SAMPLER and SAMPLED_IMAGE
descriptors.
In order to support them we do the pre-packing of TEXTURE_SHADER_STATE
and SAMPLER_STATE when Images and Samplers (respectively) are
created. Those packets doesn't need to be tweaked later, so we upload
them to an bo.
A possible improvement of this would be that the descriptor pool
manages a bo for all descriptors, that suballocate for each descriptor
allocated. This is what other drivers do (and as far as I understand,
one of the reasons of having a descriptor pool).
Immutable samplers are not supported, will be handled on a following
patch.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
v3dv_descriptor is going to be expanded with more data, so it doesn't
make sense anymore to handle a fake descriptor for the push
constants. Introducing a new struct, that is just a pair
bo/offset. Initially named v3dv_resource, as it could be the base to
reuse bos for different resources (like assembly bo)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Not quite sure why this is required though. Conversion from/to
sRGB happens on tile loads and stores, with the tile buffer
being always linear, so there should be no difference.
Fixes all test failures in:
dEQP-VK.pipeline.blend.format.r8g8b8a8_srgb.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
In this mode, which can be activated with V3D_DEBUG=always_flush like
in the GL driver, we flush every draw call separately. For now this
is useful for debugging, but we can also set the flag internally on
specific jobs when we identify scenarios where we need the same behavior.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
It is valid to submit with an empty list ofcommand buffers, however,
we still need to wait on the pWaitSemaphores provided and only signal
the pSignalSemaphores and fence once we have finished waiting on them
to honor the semantics of the submission.
Because waiting and signaling happens in the kernel, the easiest way
to do this is to submit a trivial no-op job to the GPU. To do this,
we need to refactor some of our code so that code that might have been
operating on a command buffer starts operating on a job instead, so we
can resuse most of our infrastructure to create the no-op job.
Additionally, because no-op jobs are created internally by the driver,
we are responsible for destroying them too. For this, we bind a fence
to each no-op job we submit and we test for completion of in-flight
no-op jobs (and destory them if completed) every time vkQueueSubmit
is called.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
There are a handful of tests that simulate 'out of memory' situations
during swapchain image creation, and these can lead to failed job
allocations when the driver is running on the prime blit path, as that
involves creating a command buffer. The tests expect us to handle this
scenario gracefully and return an appropriate OOM error as a result.
This make sure we don't try to dereference a job if we failed to allocate
it so we don't crash and can return the OOM error gracefully in the
process.
Fixes:
dEQP-VK.wsi.xlib.swapchain.simulate_oom.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Push constant tests were still working without taking this into
account because we can't really fine graine how much we allocate for
them.
For now we only use them to know if we should allocate/fill the ubo
for push constants or not.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This puts all the information required to setup frame tiling into
v3dv_frame_tiling so we no longer need a framebuffer to start a
frame. This makes the code simpler, since frame tiling calculations
happen automatically when we start a new frame and simplifies
the implementation of copy and clear operations that used to
requiere that we setup a fake framebuffer with no actual attachments,
which was a bit of a kludge.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
So far we have been getting away with computing frame tiling information for
the framebuffer object, but that is not correct, since different subpasses
may access different subsets of the framebuffer, with each requiring a
different configuration because the number of render targets and the maximum
bpp can change for each subpass.
This adds a v3dv_frame_tiling struct to keep the frame tiling information and
rewrites the code to compute this for every new job we start.
Fixes a bunch of tests in dEQP-VK.pipeline.render_to_image.*
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
When we create a new job for a new subpass, we might have to finish
the current job for the previous subpass, so it is important that we
we don't get ahead of ourselves and increment the current subpass index
in the command buffer state until we are really done finishing the
previous job.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We were storing the format of the base miplevel in the image view and
we were typically using that instead of the taking the format from
the appropriate image slice. This was a problem when loading or storing
a miplevel other than the base which happened to have a different format.
This also removed the tiling field from the image view to avoid repeating
the same mistake in the future.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
I am not quite certain that this is the way to go though. Here, we are
expecting that the GPU can set/reset the event inside a command buffer
as a 1x1 pixel clear for example, however, there is still the question
of how we get to implement the command buffer wait on an event, since
reading the docs I haven't found any such functionality to be available.
We could think of implementing this by splitting the command buffer
into multiple jobs at the wait command, and then using a separate
thread for job submissions that would poll the event UBO before sending
it to the kernel, but that looks like a bit of a kludge.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
By default they are trivially lowered to load_uniform.
We still need to allocate an UBO for push constants, used for those
that are accessed using a non-const index. This is automatically
handled by the compiler, as it cames back as asking a
QUNIFORM_UBO_ADDR. This is what already does for gallium.
Note that if needing the UBO, we are uploading the full push constant
data. An improvement would be to try to upload only the data that
needs to rely on the UBO (so non-const accesses to uniforms).
Also, the code is not handling getting out of space from the UBO
bo. This would be tackled at a different commit.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Some of these may need additional work to work for real, but we should
be able to support them.
We also include some formats that are not supported for images, but
that we want to support for buffers, such as R32G32B32 for a vertex
buffer. In the future we might want to expand the format table to
specify which formats are supported for buffers.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
From the Vulkan spec:
"When an image view of a depth/stencil image is used as a
depth/stencil framebuffer attachment, the aspectMask is ignored
and both depth and stencil image subresources are used."
So in that scenario, we ignore the aspect mask on the view and go
check the actual format of the underlying image to decide if we
have depth or depth+stencil aspects.
This gets VkRunner's depth-buffer.shader_test to pass.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
For now this only implements a fast path using the tile buffer, so it
can only be used when clearing full images, but this is good enough
for VkRunner.
The implementation is a bit tricky because this command executes
inside a render pass, and yet, since we are using the tile buffer to
clear, this needs to go in its own job. This means that with this, we
need to be able to split a subpass into multiple jobs which creates
some issues.
For example, certain operations, such as the subpass load operation
(particularly if it is a clear) should only happen on the first job of
the subpass and subsequent jobs in the same subpass should always
load.
Similarly, we should not discard the last store on an attachment
unless we know it is the last job for the last subpass that uses the
attachment.
To handle these cases we add two new flags to the job, one to know if
the job is not the first in a subpass (is_subpass_continue) and
another one to know if a job is the last in a subpass
(is_subpass_finish).
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
For that we include the array_index when asking for a ubo/ssbo index
from the descriptor_map.
Until now, array_index was not included, but the descriptor_map took
into account the array_size. This had the advantage that you only need
a entry on the descriptor map, and the index was properly return.
But this make it complex to get back the set, binding and array_index
back from the ubo/ssbo binding. So it was more easy to just add
array_index. Somehow now the "key" on the descriptor map is the
combination of (set, binding, array_index).
Note that this also make sense as the vulkan api identifies each array
index as a descriptor, so for example, from spec,
VkDescriptorSetLayoutBinding:descriptorCount
"descriptorCount is the number of descriptors contained in the
binding, accessed in a shader as an array"
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Focused on getting the basic UBO and SSBO cases implemented. So no
dynamic offset, push contanst, samplers, and so on.
This include a initial implementation for CreatedescriptorPool,
CreateDescriptorSetLayout, AllocateDescriptorSets,
UpdateDescriptorSets, CreatePipelineLayout, and CmdBindDescriptorSets.
Also introduces lowering vulkan intrinsics. For now just
vulkan_resource_index.
We also introduce a descriptor_map, in this case for the ubos and
ssbos, used to assign a index for each set/binding combination, that
would be used when filling back the details of the ubo or ssbo on
other places (like QUNIFORM_UBO_ADDR or QUNIFORM_SSBO_OFFSET).
Note that at this point we don't need a bo for the descriptor pool, so
descriptor sets are not getting a piece of it. That would likely
change as we start to support more descriptor set types.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The general idea is that we always emit from our dynamic state, and when
a particular piece of state is not dynamic, we just set our dynamic state
from the pipeline state, however, the implementation was quite confusing:
the mask of dynamic states flagged states that were not dynamic and some
places woud mix dirty flags and dynamic state flags. We also were not
updating the dynamic state mask in the command buffer, etc.
This patch, hopefully, simplifies all this and makes it less confusing,
starting by making the dynamic state mask flag dynamic states, fixing
the places where we would confuse dirty state flags with dynamic state
flags, making sure that our command buffer state is setup correctly
and that we only emit state when it is actually dirty.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We were not computing viewport transform for static viewports provided with
the pipeline state. Also, we were not copying the transform into the command
buffer state when we bound the pipeline.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Otherwise we would lose updates relevant to subsequent subpasses in
the same renderpass that read or partially write the attachment.
The only scenario where we can safely do this is on the last subpass
that uses the attachment, so long as we don't need to emit the store
for the clear.
This also fixes a bug in the computation of the first subpass that
uses an attachment.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This ignores stencil for now and focuses on depth testing without
support for early depth testing.
To implement this we need to start considering how many of our
framebuffer attachments are color attachments, since some of the
computations we use to determine tile sizes and binning configuration
depend on this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
When running on the real hardware we have two devices: the v3d render
node and the vc4 display node. We need the latter to allocate
winsys BOs for v3d to render into. Since exporting these BOs is
a privileged operation, we need to obtain the fd for this device
through the display server. For now we only support doing this through
the XCB DRI3 platform.
Also, do not duplicate or re-open the DRM devices when creating logical
devices. The simulator checks that the file descriptor is exactly
the same we used to initialize it when we created the physical device
and aborts if it sees a different fd number, even if it points to the
same device.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This includes:
* Implementation for CmdBindVertexBuffers
* Gather vertex input info during CreateGraphicsPipelines
(pipeline_init) and SHADER_STATE_ATTRIBUTE_RECORD prepacking
* Final emission of such packet during CmdDraw
(cmd_buffer_emit_graphics_pipeline)
Default attributes values will be handled on a following patch.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We can't prepack all the record, as addresses need the job, and
uniforms depend on dynamic value.
Also due cl_emit_with_prepacked and v3dv_pack asserting correct
values, we need to define two values twice, that lead to move
vpm_config to the pipeline. In any case, the latter will be useful
when we start to prepack more stuff.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Generally, we can do this when they render to the same collection of
attachments and we only need to emit a single RCL for them.
To implement this, we need to track the first subpass that is included
in the job and rewrite our loads and stores in the RCL to refer to that
subpass instead of the current subpass (which would be the last included
in the RCL).
When we merge jobs we also reuse the tile state/alloc BOs and we only
emit the binning setup once.
The environment variable V3DV_NO_MERGE_JOBS can be set to disable
job merging and have each subpass be in a separate job. This can be
useful for debugging issues spawning from incorrect subpass merges.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
As we make progress towards more complex submissions we will need to split
our command buffers into smaller executable units (jobs) that we can
submit indepdently to the kernel. This will be required to implement
pipeline barriers, split subpasses that have depedencies on previous
subpasses, split render passes that use more than 4 render targets, etc.
For now we keep things simple and we only keep one job as current
recording target in the command buffer, and we generate a new one
with every subpass or with any commands we see outside of a render pass
(only vkCopyImageToBuffer for now). In the future we probably want to
optimize this by merging subpasses into the same job when possible,
etc.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
We will use this when we implement copying images to buffers using the
TLB, where we'll need to setup a framebuffer and tiling configuration
for the TLB store to the destination buffer.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Initial port of the equivalent v3d_write_uniforms, to be used by the
cmd_buffer when emitting the drawing packets.
Initially doesn't include all the quniform types, only those needed by
the initial basic vulkan tests.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Before that commit we were calling get_viewport_xform to get those
values twice (to emit scissor and viewport), and we found that we
would need that info even more times. So let's just compute that info
when setting the viewport, and reuse the values.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Values still doesn't take into account having vertex elements data,
but keeps some of that half-done code in comments. It would be better
to do that when we get an example using it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Starting with Viewport/Scissor data from VkGraphicsPipelineCreateInfo.
Note that initially this can be somewhat counter-intuitive. What we
are really doing it is filling up the structs with the dynamic stuff
from the pipeline, when such is not defined as dynamic. This is what
anv/radv does, and basically means that we treat both in the same way,
so easier after that.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
The basic to get the spirv built to nir, including calling some common
nir passes. Pending deep review if all those are needed or if we miss
some, but for that it would be better to be able to run existing
tests.
Enough to get assembly generated for simple tests.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
So we have a lower level representation of a buffer object that we can
manipulate that is not tied to a Vulkan representation of memory. This
will be useful as we start allocating driver internal buffers, such as
command lists.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
This relies heavily in infrastructure taken from the v3d driver. We should
probably look for ways to share the code between both drivers by creating
a surface layout library that we can use from both, or at least moving
parts of the v3d driver to broadcom/common. Specifically:
We take v3d_tiling.c, which requires gallium's pipe_box type for some
helper functions that we don't quite need yet.
We copied and adapted bits of v3d_resource.c into v3dv_image.c, however,
it should be possible to look for ways to reuse the code instead of
duplicating it.
Pre-compute UIF padding into the slice setup. This is different from
what we do in v3d (we do this at cerate_surface time), but it is
more convenient for us to pre-calculate it here for all mipmap
slices.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
For now we are only interested in being able to include the header
file for format definitions, so this is enough. When we start actually
emitting packets we will need to provide proper hooks.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
Initial commit, mostly a import of the minimum from anv/radv to get a
skeleton to start to work with.
In includes:
* meson files
* Copy & adapt entrypoints ane extensions scripts from anv (that were
later used on radv)
This is a firt approach, but is is likely that we can remove/simplify
some things.
v2: fix copyright character at broadcom/vulkan/meson.build (Eric)
v3: no spaces inside arrays (Dylan)
v4: add gnu_symbol_visibility (detected by CI on first Merge attemp)
Reviewed-by: Eric Anholt <eric@anholt.net>
squash! v3dv: add v3d vulkan driver skeleton
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>