Commit Graph

195 Commits

Author SHA1 Message Date
Alejandro Piñeiro 1ed8252514 v3dv/pipeline_cache: extend pipeline cache envvar
So far V3DV_ENABLE_DEFAULT_PIPELINE_CACHE only allowed disabling all
caching done through a pipeline cache.

With this change we can be more detailed. The envvar is no longer a
boolean. Allowed values:

  * "off": no pipeline cache at all. PipelineCache objects behave as
    no-op objects.

  * "no-default-cache": user PipelineCache caches nir/variants, but we
    don't provide a default cache when the user doesn't provide a
    PipelineCache object, nor for internal pipelines.

  * "full" (default): we provide a default PipelineCache, used when
    the user doesn't provide one when creating a Pipeline, and for
    internal Pipelines.
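
The three values above could be parsed along these lines. This is only a
minimal sketch: the enum, the function names, and the fallback to "full"
for unrecognized strings are assumptions made for illustration, not the
driver's actual code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical names, for illustration only. */
enum cache_mode_sketch {
   CACHE_MODE_OFF,
   CACHE_MODE_NO_DEFAULT_CACHE,
   CACHE_MODE_FULL,
};

/* Map the envvar string to a mode. An unset envvar (NULL) selects the
 * documented default, "full". Treating unrecognized strings as "full"
 * too is an assumption of this sketch. */
enum cache_mode_sketch
parse_cache_mode(const char *value)
{
   if (value == NULL)
      return CACHE_MODE_FULL;
   if (strcmp(value, "off") == 0)
      return CACHE_MODE_OFF;
   if (strcmp(value, "no-default-cache") == 0)
      return CACHE_MODE_NO_DEFAULT_CACHE;
   return CACHE_MODE_FULL;
}

/* Read the mode from the environment variable named in this commit. */
enum cache_mode_sketch
cache_mode_from_env(void)
{
   return parse_cache_mode(getenv("V3DV_ENABLE_DEFAULT_PIPELINE_CACHE"));
}
```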

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:33 +00:00
Iago Toral Quiroga f7af9eb211 v3dv: free noop job if needed when finishing the queue
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:33 +00:00
Iago Toral Quiroga deb0dce1ee v3dv: don't leak dumb BO handles allocated for swapchain images
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:33 +00:00
Iago Toral Quiroga 4acf5985a4 v3dv: hook up robust buffer access
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:33 +00:00
Iago Toral Quiroga 4823313587 v3dv: move meta init/finish to meta implementation files
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:33 +00:00
Iago Toral Quiroga db0bb36ace v3dv: don't cache subpass color clear pipelines
Subpass color clear pipelines are those used to emit partial attachment
clears as draw calls inside the render pass currently bound by the
application in the command buffer, leading to a huge performance improvement
compared to the case where we emit them in their own render pass.

Unfortunately, because the pipeline references the render pass
object in which it is used and the render pass object is owned by the
application (and can be destroyed at any point), we can't cache these
pipelines (unless we implement a refcounting mechanism or other
similar strategy).

Performance impact looks negligible based on experiments with vkQuake3,
probably because the underlying pipeline cache is preventing the
redundant shader recompiles.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:33 +00:00
Alejandro Piñeiro fa7fbdb088 v3dv/pipeline: set 16bit return_size for shadows always
So far we were pre-generating two variants, an all 16-bit return_size
and an all 32-bit return_size, as at pipeline creation time we don't
know the texture format that will finally be used.

But it is possible to override or at least refine the 32bit case, as
we know in advance that all shadow textures can (and in fact should)
use return_size 16bit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:33 +00:00
Alejandro Piñeiro 229cce4056 v3dv/pipeline: track if texture is shadow
To be used to decide the texture return size. We add it to the
descriptor map because it is the easiest place to do so: as we are
lowering the texture accesses we can check instr->is_shadow at that
point. Admittedly this is somewhat odd, as so far the descriptor
map held general descriptor info, while is_shadow only applies to
textures. But it doesn't make sense to rework this now, as we may
well get more descriptor-specific info on the map in the future. We
can revisit that later.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:33 +00:00
Alejandro Piñeiro e8ceb8f56a v3dv/meta: fix hash table insertion
So far we were using the local variable key directly for the
insertion, but the hash table expects a permanent address. We add a
key field to all the meta structures (which are already basically a
wrapper over v3dv_pipeline).
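
The key-lifetime problem can be illustrated with a toy table that stores
the key pointer it is given. Everything below (the table, `meta_item`,
and the function names) is hypothetical and only sketches the idea of
embedding the key in the cached structure. It is not Mesa's hash table
API.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TABLE_SLOTS 16

/* Toy open-addressing table that stores the key POINTER it is given
 * instead of copying the key. */
struct toy_entry {
   const uint64_t *key;
   void *data;
};

struct toy_table {
   struct toy_entry slots[TABLE_SLOTS];
   unsigned count;
};

void
toy_table_init(struct toy_table *t)
{
   memset(t, 0, sizeof(*t));
}

int
toy_table_insert(struct toy_table *t, const uint64_t *key, void *data)
{
   if (t->count == TABLE_SLOTS)
      return -1; /* table is full */
   unsigned i = (unsigned)(*key % TABLE_SLOTS);
   while (t->slots[i].key != NULL)
      i = (i + 1) % TABLE_SLOTS;
   t->slots[i].key = key;
   t->slots[i].data = data;
   t->count++;
   return 0;
}

void *
toy_table_search(const struct toy_table *t, uint64_t key)
{
   unsigned start = (unsigned)(key % TABLE_SLOTS);
   for (unsigned n = 0; n < TABLE_SLOTS; n++) {
      const struct toy_entry *e = &t->slots[(start + n) % TABLE_SLOTS];
      if (e->key == NULL)
         return NULL;
      if (*e->key == key)
         return e->data;
   }
   return NULL;
}

/* Stand-in for a meta structure wrapping a pipeline. */
struct meta_item {
   uint64_t key;  /* lives as long as the item, so &item->key stays valid */
   int pipeline;  /* placeholder for the wrapped pipeline */
};

struct meta_item *
meta_item_create(struct toy_table *t, uint64_t key)
{
   struct meta_item *item = calloc(1, sizeof(*item));
   if (item == NULL)
      return NULL;
   item->key = key;
   /* Passing &key (the parameter) here instead would leave a dangling
    * pointer in the table as soon as this function returns. */
   if (toy_table_insert(t, &item->key, item) != 0) {
      free(item);
      return NULL;
   }
   return item;
}
```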

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:33 +00:00
Iago Toral Quiroga 29ef4ddcf9 v3dv: handle multisample rasterization with empty framebuffers
If the framebuffer has no attachments then multisample rasterization
is enabled based on the rasterizationSamples multisample state of
the pipelines. It should be noted that since we don't support
the variableMultisampleRate feature, all pipelines in the same
subpass must have matching number of samples.

V3D requires that we specifically set up our frames to enable
multisampling or not, and we do this when we create jobs inside
a subpass. Since we create the first job for a subpass as soon as
the subpass starts, this is problematic: if we don't have any
attachments, we won't enable MSAA at this point, but later
on we might bind an MSAA pipeline, since pipelines can be bound
at any point in the lifespan of a command buffer.

Here, we fix this by testing if the first draw call in a job uses
an MSAA pipeline but the job was set up to not use MSAA, and in
that case we re-start the job with MSAA enabled.

We also take care of a corner case that seems to be tested by CTS
where a framebuffer with no attachments doesn't bind any pipelines
with MSAA enabled (so according to the Vulkan spec, multisample
rasterization must be disabled) but the fragment shader in use
reads gl_SampleID (which enables per-sample shading). This would
lead to enabling per-sample shading with single-sample rasterization,
which doesn't make sense and makes the simulator complain, so we just
disable per-sample shading in that case.

Fixes:
dEQP-VK.pipeline.multisample.mixed_count.*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro 5de282b50a v3dv/descriptor: remove v3dv_descriptor_map_get_image_view
Now that we added support for texel_buffers, in all the cases where we
were checking for an image_view we end up checking for an image_view or
a buffer_view, so we stopped using it. Remove it as it has become
superfluous.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Iago Toral Quiroga 4a63b1ae87 v3dv: handle multisample resolves for formats that don't support TLB resolves
The TLB multisample resolve feature is limited to specific format types.
For everything else, including sfloat and integer formats, we need to
fall back to a blit resolve. This needs to be handled both for in-pass
resolves and for vkCmdResolveImage.

Because these blits would happen after the tile store operations, we need
to make sure we store the multisampled buffers so we can then read them for
the blit resolve.

Fixes the remaining test failures in:
dEQP-VK.renderpass.suballocation.multisample_resolve.*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro 947fa7033d v3dv: add v3dv_limits file
There are several definitions for hw limits on v3dv_image that we want
to share, but v3dv_private was already growing bigger and messier.

So let's move them to a specific header. Note that there is already a
broadcom/common/v3d_limits.h. We are not putting them there because
right now they are only used by the Vulkan driver, but are candidates
to be moved.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro 81b713e341 v3dv/descriptor: support for UNIFORM/STORAGE_TEXEL_BUFFER
This gets most uniform/storage_texel buffer tests passing.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Iago Toral Quiroga 3310c1542c v3dv: handle multisampled image copies with the blit path
This should be able to handle partial copies of multisampled images.

This change extends our blit shader interface to also handle multisampled
destinations so that if the blit destination is a multisampled image,
the blit will rely on sample rate shading to copy all samples from
the source image (which must have a matching number of samples).

I have not found any tests in CTS that do partial copies of
multisampled images, so I tested this with a full multisampled image
copy, using this test:
dEQP-VK.api.copy_and_blit.core.resolve_image.whole_copy_before_resolving.4_bit

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Iago Toral Quiroga f219795a26 v3dv: add a blit fallback path for vkCmdResolveImage
This fallback is required when we have to do partial resolves. It
works the same way as other blit fallbacks for copy operations: it
will bind the source image as a source texture and blit the selected
region to the destination image.

The difference in this case is that the source image is multisampled
and the blit shader needs to fetch and average individual samples for
each texel.

This gets us to pass all the remaining test cases in
dEQP-VK.api.copy_and_blit.core.resolve_image.*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Iago Toral Quiroga d953eab5af v3dv: process VkPipelineMultisampleStateCreateInfo properly
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Iago Toral Quiroga d87941cb3a v3dv: consider MSAA when computing frame tiling
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro d64ff26563 v3dv/pipeline: pre-generate more than one shader variant
In order to reduce the number of shader builds after pipeline creation
(that ideally shouldn't happen) we pre-generate two shader variants at
pipeline creation time. In addition to the default one, which sets the
return size for all textures to 16-bit, we build another variant
setting the return size for all textures to 32-bit. The cmd buffer
selects the latter if any of the textures requires 32-bit.

So we are using either an all 16-bit return size or an all 32-bit return
size variant. This could be slightly improved by pre-generating return
size combinations if the texture number is below a threshold. But that
would require more space and a longer pipeline creation time, so it
would need to be evaluated.
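
The selection rule described above (use the 32-bit variant if any
texture needs it, otherwise the 16-bit one) can be sketched as follows.
The enum and function names are hypothetical, and the per-texture
requirement is modelled as a plain boolean array.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names, for illustration only. */
enum variant_choice_sketch {
   VARIANT_ALL_16BIT,
   VARIANT_ALL_32BIT,
};

/* Pick the all-32-bit variant as soon as one bound texture requires a
 * 32-bit return size; otherwise keep the default all-16-bit variant. */
enum variant_choice_sketch
select_return_size_variant(const bool *needs_32bit, size_t texture_count)
{
   for (size_t i = 0; i < texture_count; i++) {
      if (needs_32bit[i])
         return VARIANT_ALL_32BIT;
   }
   return VARIANT_ALL_16BIT;
}
```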

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro a87d2c7806 v3dv/pipeline: when looking for a variant, check first current variant
So far, when checking for a variant fulfilling a specific v3d key, we
were checking the caches, and if that failed, we compiled a new
variant and updated the current variant.

But we could first check whether the current variant fulfills that key.
This was not really problematic so far, as checking the caches was fast,
but now that we could be running without any kind of shader cache using
V3DV_ENABLE_PIPELINE_CACHE, it is far better to check the current
variant first.

Without this vkQuake3 at 720p drops to 1fps when disabling the cache.
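
A minimal sketch of the lookup order, assuming a simplified key with two
fields and a stand-in for the slow path (cache lookup plus compile). All
struct and function names here are hypothetical.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Simplified stand-ins for the real key and variant objects. */
struct key_sketch {
   unsigned return_size_bits;
   unsigned flags;
};

struct cur_variant_sketch {
   struct key_sketch key;
};

struct stage_sketch {
   struct cur_variant_sketch *current;
   unsigned slow_path_count; /* how often we had to look up or compile */
};

static bool
keys_equal(const struct key_sketch *a, const struct key_sketch *b)
{
   return a->return_size_bits == b->return_size_bits &&
          a->flags == b->flags;
}

/* Stand-in for "check the caches, and compile if that fails". */
static struct cur_variant_sketch *
lookup_or_compile(struct stage_sketch *stage, const struct key_sketch *key)
{
   struct cur_variant_sketch *v = malloc(sizeof(*v));
   if (v == NULL)
      return NULL;
   v->key = *key;
   stage->slow_path_count++;
   return v;
}

struct cur_variant_sketch *
get_variant(struct stage_sketch *stage, const struct key_sketch *key)
{
   /* Fast path: the current variant already fulfills the key. */
   if (stage->current != NULL && keys_equal(&stage->current->key, key))
      return stage->current;

   struct cur_variant_sketch *v = lookup_or_compile(stage, key);
   if (v != NULL) {
      free(stage->current); /* this sketch lets the stage own its variant */
      stage->current = v;
   }
   return v;
}
```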

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro 62c32d6ca0 v3dv/pipeline: remove custom variant cache
Now that we have a default pipeline cache, we can rely on it. This
allows us to remove some code, and avoids the need to have a cache for
each pipeline stage.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro 35381a4696 v3dv/pipeline_cache: add default pipeline cache
It would be used as a fallback. Three advantages:

  * Having a cache for user operations even if the user doesn't
    provide it.

  * Having a cache for internal operations. v3dv_meta_copy creates
    pipelines for some copy path, so it is interesting to have them
    cached.

  * Testing: so now the pipeline cache is tested by more CTS tests.

Like any other pipeline cache, it can be disabled with the
V3DV_ENABLE_PIPELINE_CACHE envvar. It was suggested that it would make
sense to have a specific envvar for the default pipeline cache, but for
now just one envvar is enough.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro 48a64f28c2 v3dv/pipeline: provide a shader_sha1 to private ShaderModules
So far for private pipelines we were creating dummy shader modules
where we directly provided the nir shader. But for the pipeline cache
we were using the SPIR-V to generate part of the cache key sha1.

The main use case for private pipelines is meta_copy/clear. Those nir
shaders depend on parameters like the format etc, so we directly use
the serialized form of the NIR shader to generate the sha1.

The other case is the no-op fragment shader that we need to provide
if no fragment shader is defined by the user. For that case we can
just use the default shader name, as the no-op shader is always the
same.

This is required as we plan to add a default pipeline cache, that
would include our private shaders too.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro a00fe4cb0c v3dv/pipeline_cache: cache v3dv_shader_variants
This also includes being able to serialize them as part of
GetPipelineCacheData and to deserialize it as part of
CreatePipelineCache.

So now we can also upload the assembly of the variant as part of the
PipelineCache creation.

Note that the tricky part of all this was the prog_data
serialization. v3d_compile allocates and fills a new prog_data with
rzalloc, among other things because it also allocates the uniform
list internally. So we needed to replicate that when deserializing the
prog_data. Ideally we would like to avoid that, and allocate as many
resources as possible using vk_alloc, but that would mean a somewhat
deep change to the v3d_compiler, which we want to avoid as much as
possible for now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro 63b6b633e9 v3dv/pipeline: add basic ref counting support for variants
As soon as we start to cache variants on pipeline caches, the same
variant could be used by different pipelines and pipeline caches.
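
Basic reference counting for a shared object can be sketched like this.
The struct and function names are hypothetical, and the counter is a
plain integer for brevity. Code shared across threads would need an
atomic counter, which this sketch does not attempt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a shader variant shared by several
 * pipelines and pipeline caches. */
struct refcounted_variant_sketch {
   unsigned ref_cnt;
};

/* A new variant starts with one reference, held by its creator. */
struct refcounted_variant_sketch *
variant_create(void)
{
   struct refcounted_variant_sketch *v = calloc(1, sizeof(*v));
   if (v != NULL)
      v->ref_cnt = 1;
   return v;
}

/* Each additional holder (a pipeline or a cache) takes a reference. */
void
variant_ref(struct refcounted_variant_sketch *v)
{
   assert(v->ref_cnt >= 1);
   v->ref_cnt++;
}

/* Drop one reference; the variant is freed when the last holder lets
 * go. Returns true if the variant was destroyed. */
bool
variant_unref(struct refcounted_variant_sketch *v)
{
   assert(v->ref_cnt >= 1);
   if (--v->ref_cnt == 0) {
      free(v);
      return true;
   }
   return false;
}
```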

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro 2326d5bc04 v3dv/pipeline_cache: cache nir shaders
Heavily based on anv nir caching. One of the biggest differences is that
we don't create the nir shader using a ralloc_context local to the
main compile graphics method. On anv, after compiling the shader, they
discard the nir shader. We need to keep it, as we could need it to build
shader variants later.

Like anv, we introduce an environment variable to disable the cache:
  V3DV_ENABLE_PIPELINE_CACHE

It is enabled by default. The main purpose of this envvar is debugging,
in order to provide an easy way to rule out a bug in the cache.

Serializing/deserializing the NIR shaders as part of
GetPipelineCacheData and PipelineCacheCreate is still pending. We also
plan to cache shader variants. We will do that in following patches.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro 1d2ae8756f v3dv/pipeline_cache: bare basic support for pipeline cache
This means providing a proper cache object, and being able to
load/retrieve cache data with a proper header. Not really caching
anything yet. That will be tackled in following patches.

Note that this no-op cache got all the specific pipeline_cache and
pipeline.cache tests passing on the rpi4.

The following tests are still crashing when using the simulator:
 dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_compute
 dEQP-VK.synchronization.internally_synchronized_objects.pipeline_cache_graphics

But those are an issue with synchronization tests on the simulator, and
not related to the pipeline cache itself. In general synchronization
tests should be run on the rpi4.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Alejandro Piñeiro ffaab5593c v3dv/device: add vendorID/deviceID get helpers
As we would need them for the pipeline cache header.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Iago Toral Quiroga 47e02a2ef1 v3dv: add a fast path for vkCmdClearAttachments
Since vkCmdClearAttachments executes inside a render pass, we would
benefit from converting it to a draw within the current subpass job to
improve batching and avoid expensive tile load/store operations.

This can dramatically improve performance for applications using this
command, however, we can only use this if we are clearing the base
layers of framebuffer attachments, since otherwise we would need to
use layered rendering, which we don't support yet.

This dramatically improves performance in vkQuake3 (almost 100%
improvement at 1080p), which calls this command twice per frame.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Iago Toral Quiroga add8778638 v3dv: ignore stencil load operation if attachment format doesn't have stencil
This gets vkQuake to render correctly, which creates render passes with
stencil load operations even when the depth/stencil attachment format
doesn't have a stencil aspect. While this is a bit weird, it seems to
be allowed by the spec:

   "If the format has depth and/or stencil components, loadOp and storeOp
    apply only to the depth data, while stencilLoadOp and stencilStoreOp
    define how the stencil data is handled."

In our case we were not ignoring it, which caused us to emit a
Z buffer load that seemed to clobber the Z clear, preventing all draw calls
from passing the depth test.

While we are at it, also change the depth/stencil store operation (which
was already handling this scenario correctly) to use the format of the
render pass attachment description rather than the underlying image
format.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:32 +00:00
Iago Toral Quiroga 0db95de577 v3dv: improve pipeline barrier handling
So far we have been getting away with finishing the current job in the
presence of a pipeline barrier and relying on the RCL serialization,
but of course this is not always enough.

This patch addresses synchronization across different GPU units
(i.e. draw indirect after compute), as well as cases where we need to
sync before binning.

Fixes CTS failures in:
dEQP-VK.synchronization.op.single_queue.barrier.*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga a2538b2520 v3dv: make sure we emit vertex attributes in location order
The order in which we emit the attributes is relevant, since
GL_SHADER_STATE_ATTRIBUTE_RECORD packets don't include an explicit
attribute index. This means that we need to emit them in driver
location order, since the compiler uses that location to compute
attribute offsets in the VPM.
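
Ordering attributes by driver location before emitting them could look
like the following. This is a sketch with a hypothetical attribute
struct, using the standard `qsort`. The actual driver may order them in
a different way.

```c
#include <stdlib.h>

/* Hypothetical per-attribute record. */
struct attr_sketch {
   unsigned driver_location;
   unsigned binding;
};

static int
compare_driver_location(const void *a, const void *b)
{
   const struct attr_sketch *x = a;
   const struct attr_sketch *y = b;
   /* Yields -1, 0 or 1 without risking integer overflow. */
   return (x->driver_location > y->driver_location) -
          (x->driver_location < y->driver_location);
}

/* Sort so attribute records can be emitted in driver location order,
 * since the packets carry no explicit attribute index. */
void
sort_attributes_by_location(struct attr_sketch *attrs, size_t count)
{
   qsort(attrs, count, sizeof(attrs[0]), compare_driver_location);
}
```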

Fixes ~1300 CTS tests in:
dEQP-VK.pipeline.vertex_input.multiple_attributes.out_of_order.*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga 3bd02a729a v3dv: fix color border clamping with specific formats
For some reason, CTS expects E5B9G9R9 and B10G11R11 with
transparent black border clamping to produce alpha 1 instead of 0.

Since border color takes precedence over the texture state swizzle,
the only way to fix this is to lower the texture swizzle in the shader
to set alpha to 1.

Fixes:
dEQP-VK.pipeline.sampler.view_type.*b10g11r11*clamp_to_border_transparent_black
dEQP-VK.pipeline.sampler.view_type.*e5b9g9r9*.clamp_to_border_transparent_black

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga f156c5dc4d v3dv: fix regressions for cubemap array load/store
It seems that we only want to set the texture state's depth to the
number of 2D layers divided by 6 when sampling, not when doing
load/store.

This means that we need to generate two different states and choose
the one to use depending on the descriptor.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga cb1e492ee0 v3dv: handle unnormalized coordinates in samplers
In OpenGL, unnormalized coordinates are implicit based on the sampler
type (rectangle textures), so the compiler can set the flag when needed.
In Vulkan, however, this is configured explicitly in the sampler object,
so the compiler won't set it and we need to do it manually when we are
writing the P1 uniform.

Fixes:
dEQP-VK.pipeline.sampler.exact_sampling.*.unnormalized_coords

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga 6053e85ee4 v3dv: fix textureSize() for cube arrays
For these we want to divide the number of layers by 6.

Fixes:
dEQP-VK.glsl.texture_functions.query.texturesize.*samplercubearray*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga 8116d65fbc v3dv: improve handling of too large image sizes
Instead of asserting that users don't try to create images that
would require 4GB+ of memory, error out with the corresponding
OOM error when the user tries to actually allocate the memory
for the image.
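
The deferred check can be sketched as below. The names and result codes
are hypothetical, and the 32-bit limit is an assumption taken from the
"4GB+" wording above. The image size is computed in 64 bits at creation
time, and the error is only reported when memory is actually allocated.

```c
#include <stdint.h>

/* Hypothetical result codes and image record. */
enum alloc_result_sketch {
   ALLOC_SKETCH_SUCCESS,
   ALLOC_SKETCH_ERROR_OUT_OF_DEVICE_MEMORY,
};

struct image_sketch {
   uint64_t size; /* 64 bits wide, so a 4GB+ size does not wrap */
};

/* Creation only records the size; it no longer asserts on it. Inputs
 * are assumed small enough that the 64-bit product cannot overflow. */
void
image_sketch_init(struct image_sketch *image,
                  uint32_t width, uint32_t height, uint32_t bytes_per_pixel)
{
   image->size = (uint64_t)width * height * bytes_per_pixel;
}

/* Allocation is where an oversized image is rejected. */
enum alloc_result_sketch
image_sketch_allocate(const struct image_sketch *image)
{
   if (image->size > UINT32_MAX)
      return ALLOC_SKETCH_ERROR_OUT_OF_DEVICE_MEMORY;
   return ALLOC_SKETCH_SUCCESS;
}
```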

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga 91907560d5 v3dv: implement support for shader spilling
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga e59e706928 v3dv: don't reset descriptor state after a meta operation
If the meta operation did not change descriptor state then we should keep it,
not reset it.

Fixes:
dEQP-VK.fragment_operations.early_fragment.early_fragment_tests_stencil
dEQP-VK.fragment_operations.early_fragment.no_early_fragment_tests_stencil

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga d809d9f3f6 v3dv: don't free BOs from imported memory objects
Only free the underlying BO when the exported memory object is freed
to avoid multiple frees of the same memory.

The only exception is winsys BOs where we import a BO created in the
display device into the render device. In this case, we only have one
memory object referencing the BO and we want to destroy it with that
memory object.

Fixes:
dEQP-VK.api.external.memory.dma_buf.*
dEQP-VK.api.external.memory.opaque_fd.*

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga b356d3de8c v3dv: implement indirect compute dispatch
The hardware can't do this, so we need to record a CPU job that will
map the indirect buffer at queue submission time, read the dispatch
parameters and then submit a regular dispatch.
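
The CPU-side read could look roughly like this, assuming the indirect
buffer is already mapped and holds three consecutive 32-bit group counts
(x, y, z) at the given offset, as in Vulkan's VkDispatchIndirectCommand.
The struct and function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical mirror of the three dispatch group counts. */
struct dispatch_params_sketch {
   uint32_t x;
   uint32_t y;
   uint32_t z;
};

/* Read the dispatch parameters from a mapped indirect buffer. memcpy
 * avoids alignment assumptions about the mapped pointer. */
struct dispatch_params_sketch
read_indirect_dispatch(const void *mapped, size_t offset)
{
   struct dispatch_params_sketch params;
   memcpy(&params, (const uint8_t *)mapped + offset, sizeof(params));
   return params;
}
```

The CPU job would then submit a regular dispatch with the values it
read.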

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga 1d6edcc3e8 v3dv: always emit index buffer state for new jobs
New jobs need to re-emit all state. Typically, this is achieved
by resetting all dirty state flags when we start a new job, but
for index buffers we were not using a dirty bit because we always
emit them immediately. This patch adds the bit and only tries
to skip index buffer state if the bit is not dirty, which will
ensure that we will always emit it for new jobs.

This fixes a regression in the shadowmapping demo from Sascha Willems
introduced with "v3dv: try harder to skip emission of redundant state".

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga 2f1c15116f v3dv: handle unsized arrays in SSBOs
CTS coverage for this was hiding behind compute shaders so
we didn't notice this was not working properly until now.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga 7e990683fd v3dv: implement compute dispatch
For now this only implements regular dispatches, not indirect.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga d0b1bb3032 v3dv: handle separate binding points for compute and graphics
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga 67d5b0c91f v3dv: support compute pipelines
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Alejandro Piñeiro f78c99f357 v3dv/bo: add a maximum size for the bo_cache and a envvar to configure it
V3DV_MAX_BO_CACHE_SIZE can be used to configure it.

So one way to disable the bo cache is setting V3DV_MAX_BO_CACHE_SIZE
to zero. This would still run all the bo_cache size checks, but having
another envvar just to ensure that nothing related to the cache is
run seemed like overkill.
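
A size-capped cache naturally treats a maximum of zero as "never cache",
which the sketch below illustrates. The struct and function are
hypothetical, and sizes are in bytes purely for the sake of the example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cache bookkeeping; invariant: cache_size <= max_cache_size. */
struct bo_cache_sketch {
   uint64_t cache_size;
   uint64_t max_cache_size;
};

/* Try to account for a freed BO in the cache. With max_cache_size set
 * to zero the size check still runs, but no BO ever fits. */
bool
bo_cache_try_store(struct bo_cache_sketch *cache, uint64_t bo_size)
{
   if (bo_size > cache->max_cache_size - cache->cache_size)
      return false; /* caller frees the BO instead of caching it */
   cache->cache_size += bo_size;
   return true;
}
```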

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Alejandro Piñeiro 2adea940f1 v3dv/bo: adding a BO cache
Heavily based on the existing one in the v3d OpenGL driver, but
without references, and with some extra OOM checks (Vulkan CTS has
several OOM tests).

With this commit v3dv_bo_alloc and v3dv_bo_free become frontends to
the bo_cache. The former tries to get a BO from the cache if possible,
and the latter stores the BO in the cache if possible. The former also
gains a new parameter to indicate whether the BO to allocate is private.

Like v3d, we are only caching private BOs, those created by the driver
for internal use (like CLs, tile_alloc, etc). They are the ones with
the highest chance of being reused (for example, CL BOs are always
4KB, so they can always be reused). User-created BOs can have any
size, including some very large ones for buffers and images, which
makes them far less likely to be reused and would add a lot of memory
pressure if we decided to cache them.

In any case, in practice, we found that we could get a performance
improvement by also caching user-created BOs, but that would need more
care and an analysis to decide which ones make sense. It would also
require changing how the cached BOs are stored by size. Right now
there is an array of list_head, which doesn't work well with big
BOs. If done, that would be handled in a separate commit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga 1f8343b875 v3dv: add a CPU path for buffer to image copies
The blit shader path for buffer to image copies is pretty bad,
since it needs to produce a tiled image from the linear buffer
prior to emitting the blit copy.

This patch adds a new preferential path where we implement the
copy using the CPU, similar to what the GL driver does for
texture uploads. This makes vkQuake2 at least 4x faster when
dynamic lights are enabled (which triggers dynamic texture
updates).

We also tested a GPU path where we use a shader that takes the
linear buffer as a UBO and copies directly from it. This also
shows a clear performance gain, but still worse than the CPU
implementation.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00
Iago Toral Quiroga e1c8041cde v3dv: try harder to skip emission of redundant state
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6766>
2020-10-13 21:21:31 +00:00