Move away from using gpu_id as the primary means to identify which
adreno we are running on, as future GPUs (starting with 7c3) stop
providing a gpu_id as a new naming scheme is introduced.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12159>
... and reduce maxDescriptorSetUpdateAfterBindStorageBuffersDynamic from 12 to
8.
MAX_DYNAMIC_BUFFERS is MAX_DYNAMIC_UNIFORM_BUFFERS +
MAX_DYNAMIC_STORAGE_BUFFERS. We set
maxDescriptorSetUniformBuffersDynamic = MAX_DYNAMIC_UNIFORM_BUFFERS
maxDescriptorSetStorageBuffersDynamic = MAX_DYNAMIC_STORAGE_BUFFERS
maxDescriptorSetUpdateAfterBindUniformBuffersDynamic = MAX_DYNAMIC_BUFFERS / 2
maxDescriptorSetUpdateAfterBindStorageBuffersDynamic = MAX_DYNAMIC_BUFFERS / 2
The CTS test checks that
maxDescriptorSetUpdateAfterBindUniformBuffersDynamic
- is at least 8; and
- is at least maxDescriptorSetUniformBuffersDynamic
maxDescriptorSetUpdateAfterBindStorageBuffersDynamic
- is at least 4; and
- and is at least maxDescriptorSetStorageBuffersDynamic
Prior to this patch maxDescriptorSetUpdateAfterBindUniformBuffersDynamic was 12
but maxDescriptorSetUniformBuffersDynamic was 16, thus causing the CTS failure
in
dEQP-VK.api.info.vulkan1p2_limits_validation.ext_descriptor_indexing
By raising maxDescriptorSetUpdateAfterBindUniformBuffersDynamic to the same
value as maxDescriptorSetUniformBuffersDynamic, we bring the limits into the
appropriate ranges. We do the same thing for
maxDescriptorSetUpdateAfterBindStorageBuffersDynamic by assigning it the same
value as maxDescriptorSetStorageBuffersDynamic.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12193>
Blob advertises { 1024, 1024, 64 }, but from tests they all
could be 1024.
Fixes tests:
dEQP-VK.compute.basic.max_local_size_x
dEQP-VK.compute.basic.max_local_size_y
dEQP-VK.compute.basic.max_local_size_z
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9409>
This is much more maintainable, extensible, and easy to read than
hand-rolled structs approximating assembly. This also removes the last
use of the old hand-written packing structs. There are a few minor
differences:
- The shaders are larger because ir3 currently doesn't support (rpt),
which means that some shaders are larger than one instrlen and the
current logic has to be extended to allow for that. This seems a small
price to pay, ir3 will gain support for (rpt) eventually, and we
shouldn't have limitations like this baked in anyway. For example some
GL blob r8g8 <-> r16 copy shaders are apparently quite large.
- Due to the inability to switch inputs/outputs on the fly, we need to
split the VS into two variants. I made the layer-writing variant also
used for other clears, because the old method of overloading c0.z/c1.z
to mean both "src x coordinate" and "z clear value" in the same shader
seemed too clever and I didn't want to add yet another variant. This
means that non-layered clears will also write the layer (to 0), but
that shouldn't be a big deal performance-wise.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12079>
v1. Hyunjun Ko <zzoon@igalia.com>
- Add to hanlde VK_DESCRIPTOR_POOL_CREATE_HOST_ONLY_BIT_VALVE
- Don't support VK_DESCRIPTOR_TYPE_INPUT_ATTACHMENT and
VK_DESCRIPTOR_TYPE_COMBINED_IMAGE_SAMPLER
v2. Hyunjun Ko <zzoon@igalia.com>
- Fix some indentations and nitpicks.
- Add the extension to features.txt
v3. Hyunjun Ko <zzoon@igalia.com>
- Remove unnecessary asserts.
Signed-off-by: Eduardo Lima Mitev <elima@igalia.com>
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9322>
This way we can make the tables const. At the same time, for a6xx, this
introduces a "sub-generation template" to reduce the copy/paste for
parameters which are keyed to the sub-generation. It also explicitly
lists every supported GPU, to get rid of duplicate lists of supported
gpus between the device-info and drivers.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
Replace it with a calculation which works for all current GPUs.
Duplicated the calculation in both drivers because freedreno_dev_info isn't
meant for derived parameters (and drivers might want to just calculate on
the fly instead).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11790>
There are some cases using kgsl backend on linux that is still not usual
setup though, we need to consider too.
Regarding the timeline semaphore feature, we could implement it for
the kgsl backend in the future, and probalby it should be using the
existing code in tu_drm.
See #4738, #4907
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11488>
Move and rename warn_non_conformant_implementation() to common location
of src/vulkan/util/vk_util.c as vk_warn_non_conformant_implementation().
In freedreno/ci, move MESA_VK_IGNORE_CONFORMANCE_WARNING to common
location of .baremetal-deqp-test-freedreno-vk.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11563>
tu_kgsl.c: tu_enumerate_devices closed fd previously closed by
tu_physical_device_init function.
Move out the fd closing from tu_physical_device_init function because
they do not belong to it.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11561>
GPU won't be able to write to such BOs, which would to useful for
cmdstream BOs.
Move "bool dump" to the new flags along the way.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10807>
The spec requires at least 2, but says "No specific guarantees are made
about higher priority queues receiving more processing time or better
quality of service than lower priority queues." So, we can just leave the
priorities as a stub.
Fixes dEQP-VK.info.device_properties
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10470>
No handling of Acquire/Release because at the moment scheduler
works as if any barrier is Acq+Rel.
Instead of removing scoped_barrier with scope/mode that for TCS
corresponds to a control_barrier or a memory_barrier_tcs_patch in
ir3_nir_lower_tess_ctrl - remove them in emit_intrinsic_barrier.
And do the same for memory_barrier_tcs_patch and control_barrier.
While in any case hw fence/barrier shouldn't be emitted for them,
they still affect ordering of stores, and in feature ir3 backend
may want to have that information.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9054>
OpTerminateInvocation provides the behavior required by the GLSL
discard statement, which we already implement.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>
The "demote" intrinsic has the semantics of D3D discard, which means
it doesn't change the control flow, allowing derivatives to work.
On A6xx there is no known way to check whether invocation was demoted,
thus we use nir_lower_is_helper_invocation.
Add "logical" OPC_DEMOTE which is later translated to "kill".
Such separation is necessary to run "kill" specific optimizations
which are invalid for "demote".
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>
This needs to be part of the compiler because it's the only piece that
we always have access to in all the places ir3_optimize_loop() is
called, and it's only enabled for the whole Vulkan device. Right now
it's just used for constraining vectorization, but the next commit adds
another use.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
VK_KHR_spirv_1_4 is trivial because vtn already supports all the added
SPIR-V features that aren't gated behind Vulkan extensions. I've
observed some robustness2 CTS tests requiring this. However there are
a few tests currently failing due to lacking spilling.
VK_EXT_scalar_block_layout should also be trivial, since support for
"straddling" UBO loads was added recently for other reasons. This is
used by every robustness2 CTS test.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8695>
If subpass doesn't have depth/color attachments - samples count is
devised from VkPipelineMultisampleStateCreateInfo::rasterizationSamples.
Without variableMultisampleRate enabled all pipelines in such subpass
should have the same samples count; variableMultisampleRate allows
to have pipelines with different number of samples in one subpass,
given that it doesn't have depth/color attachments.
Blob doesn't have it enabled but there is no known reason for this.
Passes:
dEQP-VK.pipeline.multisample.variable_rate.*
Fixes test:
dEQP-VK.pipeline.framebuffer_attachment.no_attachments_ms
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9556>
A650 can use the same SSBO descriptor for both 32-bit and 16-bit access,
which makes it easy to enable this extension.
Passes tests that run under:
dEQP-VK.spirv_assembly.instruction.*.16bit_storage.*
Rebased and modified commit from Jonathan Marek.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
This matches the blob and doesn't require actually implementing controls
since the supported modes are just what the HW does.
Passes tests under:
dEQP-VK.spirv_assembly.*.float_controls.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
Loosely based on ANV implementation.
For executable's internal representation we output:
- Initial NIR after spirv_to_nir
- Final optimized NIR
- IR3 disassembly
Note, that vkGetPipelineExecutablePropertiesKHR is required to
return executable properties even if pipeline was not created with
CAPTURE_STATISTICS or CAPTURE_INTERNAL_REPRESENTATIONS bits set.
So the executables array is unconditionally populated, however
NIR and IR3 disassemlies are filled only when
CAPTURE_INTERNAL_REPRESENTATIONS is set.
Passes dEQP-VK.pipeline.executable_properties.*
Works with RenderDoc.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8877>
v2: link libvulkan_util with libglsl so it can find the glsl singleton symbols.
v3: link with libcompiler instead of libglsl (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> for the v3dv bits.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> for the turnip bits.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> for the radv bits.
Acked-by: Dave Airlie <airlied@redhat.com> for the lvp bits.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9457>
We support VK_KHR_dedicated_allocation so we must fill
VkMemoryDedicatedRequirements.
Vulkan spec states:
"[...] requiresDedicatedAllocation may be VK_TRUE under one of the
following conditions:
The pNext chain of VkImageCreateInfo for the call to vkCreateImage used
to create the image being queried included a VkExternalMemoryImageCreateInfo
structure, and any of the handle types specified in
VkExternalMemoryImageCreateInfo::handleTypes requires dedicated allocation,
as reported by vkGetPhysicalDeviceImageFormatProperties2 in
VkExternalImageFormatProperties::externalMemoryProperties.externalMemoryFeatures,
the requiresDedicatedAllocation field will be set to VK_TRUE."
All handle types require dedicated allocation at the moment.
Fixes:
dEQP-VK.api.external.memory.opaque_fd.dedicated.image.info
dEQP-VK.memory.requirements.dedicated_allocation.buffer.regular
dEQP-VK.memory.requirements.dedicated_allocation.image.transient_tiling_optimal
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9086>
tu_GetBufferMemoryRequirements() ends up wrapping the UINT64_MAX size
to 0 when aligning.
Fixes:
dEQP-VK.api.buffer.basic.size_max_uint64
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4493>
According to Vulkan spec, "Table 46. Required Limits", as sparse
binding is unsupported, we need to return unsupported limit for
sparseAddressSpaceSize, which is zero.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4493>