No handling of Acquire/Release because at the moment scheduler
works as if any barrier is Acq+Rel.
Instead of removing scoped_barrier with scope/mode that for TCS
corresponds to a control_barrier or a memory_barrier_tcs_patch in
ir3_nir_lower_tess_ctrl - remove them in emit_intrinsic_barrier.
And do the same for memory_barrier_tcs_patch and control_barrier.
While in any case hw fence/barrier shouldn't be emitted for them,
they still affect ordering of stores, and in feature ir3 backend
may want to have that information.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9054>
OpTerminateInvocation provides the behavior required by the GLSL
discard statement, which we already implement.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>
The "demote" intrinsic has the semantics of D3D discard, which means
it doesn't change the control flow, allowing derivatives to work.
On A6xx there is no known way to check whether invocation was demoted,
thus we use nir_lower_is_helper_invocation.
Add "logical" OPC_DEMOTE which is later translated to "kill".
Such separation is necessary to run "kill" specific optimizations
which are invalid for "demote".
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9460>
This needs to be part of the compiler because it's the only piece that
we always have access to in all the places ir3_optimize_loop() is
called, and it's only enabled for the whole Vulkan device. Right now
it's just used for constraining vectorization, but the next commit adds
another use.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7573>
VK_KHR_spirv_1_4 is trivial because vtn already supports all the added
SPIR-V features that aren't gated behind Vulkan extensions. I've
observed some robustness2 CTS tests requiring this. However there are
a few tests currently failing due to lacking spilling.
VK_EXT_scalar_block_layout should also be trivial, since support for
"straddling" UBO loads was added recently for other reasons. This is
used by every robustness2 CTS test.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8695>
If subpass doesn't have depth/color attachments - samples count is
devised from VkPipelineMultisampleStateCreateInfo::rasterizationSamples.
Without variableMultisampleRate enabled all pipelines in such subpass
should have the same samples count; variableMultisampleRate allows
to have pipelines with different number of samples in one subpass,
given that it doesn't have depth/color attachments.
Blob doesn't have it enabled but there is no known reason for this.
Passes:
dEQP-VK.pipeline.multisample.variable_rate.*
Fixes test:
dEQP-VK.pipeline.framebuffer_attachment.no_attachments_ms
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9556>
A650 can use the same SSBO descriptor for both 32-bit and 16-bit access,
which makes it easy to enable this extension.
Passes tests that run under:
dEQP-VK.spirv_assembly.instruction.*.16bit_storage.*
Rebased and modified commit from Jonathan Marek.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
This matches the blob and doesn't require actually implementing controls
since the supported modes are just what the HW does.
Passes tests under:
dEQP-VK.spirv_assembly.*.float_controls.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9840>
Loosely based on ANV implementation.
For executable's internal representation we output:
- Initial NIR after spirv_to_nir
- Final optimized NIR
- IR3 disassembly
Note, that vkGetPipelineExecutablePropertiesKHR is required to
return executable properties even if pipeline was not created with
CAPTURE_STATISTICS or CAPTURE_INTERNAL_REPRESENTATIONS bits set.
So the executables array is unconditionally populated, however
NIR and IR3 disassemlies are filled only when
CAPTURE_INTERNAL_REPRESENTATIONS is set.
Passes dEQP-VK.pipeline.executable_properties.*
Works with RenderDoc.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Connor Abbott <cwabbott0@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8877>
v2: link libvulkan_util with libglsl so it can find the glsl singleton symbols.
v3: link with libcompiler instead of libglsl (Jason)
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Dylan Baker <dylan.c.baker@intel.com>
Reviewed-by: Alejandro Piñeiro <apinheiro@igalia.com> for the v3dv bits.
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com> for the turnip bits.
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> for the radv bits.
Acked-by: Dave Airlie <airlied@redhat.com> for the lvp bits.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9457>
We support VK_KHR_dedicated_allocation so we must fill
VkMemoryDedicatedRequirements.
Vulkan spec states:
"[...] requiresDedicatedAllocation may be VK_TRUE under one of the
following conditions:
The pNext chain of VkImageCreateInfo for the call to vkCreateImage used
to create the image being queried included a VkExternalMemoryImageCreateInfo
structure, and any of the handle types specified in
VkExternalMemoryImageCreateInfo::handleTypes requires dedicated allocation,
as reported by vkGetPhysicalDeviceImageFormatProperties2 in
VkExternalImageFormatProperties::externalMemoryProperties.externalMemoryFeatures,
the requiresDedicatedAllocation field will be set to VK_TRUE."
All handle types require dedicated allocation at the moment.
Fixes:
dEQP-VK.api.external.memory.opaque_fd.dedicated.image.info
dEQP-VK.memory.requirements.dedicated_allocation.buffer.regular
dEQP-VK.memory.requirements.dedicated_allocation.image.transient_tiling_optimal
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9086>
tu_GetBufferMemoryRequirements() ends up wrapping the UINT64_MAX size
to 0 when aligning.
Fixes:
dEQP-VK.api.buffer.basic.size_max_uint64
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4493>
According to Vulkan spec, "Table 46. Required Limits", as sparse
binding is unsupported, we need to return unsupported limit for
sparseAddressSpaceSize, which is zero.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4493>
Now that all drivers are converted over, we can make a few changes.
First off, vk_device_init no longer takes two separate allocators
because we can assume that the parent instance is non-null and it can
pull the instance allocator from that. Second, dispatch tables and the
instance extension table are no longer optional. We leave the device
extension table optional for now because we don't do any verification at
vk_init_physical_device time and some drivers find it more convenient to
set the extensions later in their own physical_device_init for various
reasons.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8676>
To support multipass, querying perf counters happens in several steps
below.
0) There's a scratch reg to set pass indices for perf counters query.
Prepare cmd streams to set each pass index to the reg at device
creation time. See tu_CreateDevice in tu_device.c
1) Emit command streams to read all requested perf counters at all
passes in begin/end query with CP_REG_TEST/CP_COND_REG_EXEC, which
reads the scratch reg where pass index is set.
2) Pick the right cs setting proper pass index to the reg and prepend it
to the command buffer at each submit time.
3) If the pass index in the reg is true, then executes the command
stream below CP_COND_REG_EXEC.
Would need to implement for kgsl in the future.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6808>
There are still some commands unimplemented yet.
- vkGetPhysicalDeviceQueueFamilyPerformanceQueryPassesKHR:
The following patch supports this.
- vkAcquireProfilingLockKHR / vkReleaseProfilingLock
This patch supports only monitoring perf counters for each submit.
To reserve/configure counters across submits we would need a kernel
interface to be able to do that.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6808>
Given that we already link to Android's libsync, use it instead of using a
build-time dependency on libdrm for the KGSL path. This also would help
for older kernel compat with KGSL.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6821>
Move legacy API functions to a separate file, and implement them by calling
the new API functions, like tu_CreateRenderPass was already doing.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6920>
Fences are now just a syncobj, which makes our life easier.
The next step will be to fill out ImportFenceFdKHR()/GetFenceFdKHR().
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6683>
We don't use the format anymore in the backend, except determining the
number of components, and we fallback to 4 there if it's not specified.
So we should be safe to enable this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6800>
Passes at least:
dEQP-VK.dynamic_state.vp_state.viewport_array
dEQP-VK.draw.shader_viewport_index.*
dEQP-VK.draw.shader_layer.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5832>
Avoid having to deal with BO tracking. However, the kernel still requires a
bo list, so keep a global one which can be re-used for every submit.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6686>
This moves the semaphore implementation and tu_QueueSubmit to
tu_drm.c, such that that's the only file including xf86drm.h and
msm_drm.h. This way, the entire kernel interface is contained in
tu_drm.c
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5999>
A630 doesn't have the HW format we use to sample stencil, so it needs a
workaround. It also has a bug around the AS_R8G8B8A8 format, which doesn't
work when UBWC is disabled, so use 8_8_8_8_UNORM instead when UBWC is
disabled (using AS_R8G8B8A8 or 8_8_8_8_UNORM should only matter with UBWC)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5438>
Note that there are some extra tess fails, but they're probably
unrelated to the actual feature. There were also some xfails that were
created as part of an earlier attempt to enable the feature which were
fixed in the meantime, so remove them.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5738>
Now that we clear the PITCHALIGN" field when filling GMEM input attachment
descriptors, we can get rid of the extra tile width alignment on a630/a640.
With the "block_align_shift" value change, this brings down the default
gmem_align from 16k to 4k on a630/a640 and down from 24k to 12k on a650,
to match the gallium driver.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5528>
v2. Define new helper function to avoid duplicated a pair of function calls.
v3. Move new helper functions to vk_object.h and call them.
v4. Merge 2 commits to use commomn base object type and struct into one.
Signed-off-by: Hyunjun Ko <zzoon@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5539>
Fill the global bo will all possible shaders for 3D clear/blit. Note the
global bo size is still <4k (so this doesn't cost any extra memory), this
saves having to allocate shaders in sub_cs everytime the 3D path is used.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5776>
* Remove scratch_bo from cmdbuffer, use a device-global bo instead, which
also includes border color (and eventually shaders for 3D blit path)
* Use CP_SET_BIN_DATA5_OFFSET to allow setting VSC buffer addresses only
once at the start of the cmdstream
* Use scratch bo mechanism for a resizable VSC buffer
* Use feedback from "vsc_draw_overflow" and "vsc_prim_overflow" values to
increase the size of VSC buffer when beginning to record a new cmdbuffer
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5570>
Compute the tiling config at framebuffer creation time. A framebuffer will b
be re-used multiple times, so this will avoid having to re-calculate the
tiling config every time a command buffer is recorded.
The tiling config already couldn't use the render area's x1/y1 because of
hw binning, this move makes it so the render area isn't used at all for the
tiling config.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5570>
tu_ImportFenceFdKHR is used by tu_AcquireImageANDROID, which may or
may not work, but let's at least keep things compiling until somebody
has time to tie up the loose ends on the Android side.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5670>
There is only one queue for now, so for non-shared semaphores, the
implementation is basically a no-op. For shared semaphores, this
always uses syncobjs. This depends on syncobj support in the msm
kernel driver.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/2769>
Use ir3_compiler_destroy() rather than open-coding ralloc_free(). This
will give us a place to add more compiler related cleanup code in the
following patches.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5372>
My attempt to be clever here backfired, it overwrites the pNext and stops
the loop (causing deqp to fail to query extension features after that).
Fixes: 62de79ac44 ("turnip: implement VK_KHR_shader_draw_parameters")
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5654>
Note: going by the blob, VFD_INDEX_OFFSET/FD_INSTANCE_START_OFFSET seem
completely unused by indirect draws, so this changes them to only be set
for non-indirect draws (and moves them to the vs_params draw state).
Passes dEQP-VK.draw.shader_draw_parameters.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5635>
This commit fills out VkPhysicalDeviceSubgroupProperties if present
in a VkPhysicalDeviceProperties2. The values here are simply pulled
from the blob.
Fixes some flakes in dEQP-VK.subgroups.* since dEQP was reading
uninitialized values of VkPhysicalDeviceSubgroupProperties.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5564>
Instead of having these functions sprinkled around the driver (and ending
with a duplicated tu6_compare_func for example), move everything to a
common header (using the previously unused tu_util.h).
Also applied some simplifications: using a cast when the HW enum matches
the VK enum, and using a lookup table when it makes sense (which is IMO
nicer than the switch case way).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5538>
This fixes the newly added cubic blit_image tests for A650, by falling back
to the 3D path and setting the filter correctly.
Note: there are still failures with the texture filtering tests.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5509>
Otherwise A6XX_TEX_SAMP_1_{MIN,MAX}_LOD silently overflows.
This fixes these tests:
dEQP-VK.texture.explicit_lod.2d.derivatives.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5375>
Comparing a blob trace using the feature to one not, the difference was
pretty obvious and in the spot you'd expect compared to alphaToCoverage.
The SP_ reg didn't have a corresponding bit set, though it also has an
alphaToCoverage.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5343>
This makes it a little more explicit that the values line up.
src/freedreno/vulkan/tu_device.c:2209:75: warning: implicit conversion
from enumeration type 'const VkSamplerReductionMode' (aka 'const enum
VkSamplerReductionMode') to different enumeration type 'enum
a6xx_reduction_mode' [-Wenum-conversion]
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5174>
Most changes based on radv, some simplification, since we don't need to
sample multiple planes, 422_UNORM/420_UNORM formats will be supported
directly using the hardware formats for those.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4590>
Device UUID becomes SHA1('freedreno' + gpu_id).
Driver UUID becomes SHA1(mesa-version + git-head-sha1).
v2: Don't use build_id for driver UUID since it generates different
values for vulkan and gl shared objects. (Kristian)
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4847>
The new files are created under a 'common' folder under 'src/freedreno',
where shared functionality between GL and Vulkan drivers (that is not
registers, layout or compiler) will be placed.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4847>
This is simpler than a full-blown memory reuse mechanism, but is good
enough to make sure that repeatedly doing a copy that requires the
linear staging buffer workaround won't use excessive memory or be slowed
down due to repeated allocations.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5007>
New blit code doesn't change this value, and different values seem to be
related to the driver version and not the GPU version.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4608>
Passes tests in:
dEQP-VK.pipeline.multisample.sample_locations_ext.*
Note that these tests fail because of gl_PrimitiveID not working correctly:
dEQP-VK.pipeline.multisample.sample_locations_ext.verify_location.*
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4665>
This appears to be limited by VPC_CNTL_0::NUMNONPOSVAR, which is an
8-bit bitfield with no possibility for expansion. Also, in practice
we'll be limited by the vertex shader output maximum, which includes
gl_Position, of 128, so that users won't be able to use more than 124
components anyways. Lower it to match the GL blob.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4641>
Under the bindless model, there are 5 "base" registers programmed with a
64-bit address, and sam/ldib/ldc and so on each specify a base register
and an offset, in units of 16 dwords. The base registers correspond to
descriptor sets in Vulkan. We allocate a buffer at descriptor set
creation time, hopefully outside the main rendering loop, and then
switching descriptor sets is just a matter of programming the base
registers differently. Note, however, that some kinds of descriptors
need to be patched at command recording time, in particular dynamic
UBO's and SSBO's, which need to be patched at CmdBindDescriptorSets
time, and input attachments which need to be patched at draw time based
on the the pipeline that's bound. We reserve the fifth base register
(which seems to be unused by the blob driver) for these, creating a
descriptor set on-the-fly and combining all the dynamic descriptors from
all the different descriptor sets. This way, we never have to copy the
rest of the descriptor set at draw time like the blob seems to do. I
mostly chose to do this because the infrastructure was already there in
the form of dynamic_descriptors, and other drivers (at least radv) don't
cheat either when implementing this.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4358>
* Correct bypass value for a618
* Bypass value for blitter
* Don't set RB_CCU_CNTL again unnecessarily in tu6_emit_binning_pass
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3783>
Geometry shaders support an invocations parameter up to a limit
defined by maxGeometryShaderInvocations. This was set to 127, but
executing with invocations > 32 causes a crash. As it turns out, the
blob only advertises a max of 32 invocations, so we set that in
turnip as well.
Fixes dEQP-VK.geometry.instanced.draw_*_instances_{127, 64}_geometry_invocations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4436>
Previously, turnip advertised 4-bit subpixel precision when in
practice, a6xx seems to render with 8-bit precision. This caused
dEQP-VK.renderpass2.suballocation.subpass_dependencies.late_fragment_tests.*
to fail because they compare images rendered with turnip against
ones rendered via a software reference implementation parameterized
by turnip's VkPhysicalDeviceLimits.subPixelPrecisionBits value.
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4172>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4172>
Emit a single table of all possible Vulkan border colors up front, and
then index into it using the Vulkan enum directly. In fact this seems to
be the entire point of separating out border colors in the first place.
In addition to being simpler and having less CPU overhead, and fixing
cases where more than one sampler uses border color, this paves the way
for bindless samplers because the existing approach isn't great for
bindless.
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4200>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4200>
Newer a6xx no longer has programmable GMEM base, so we can't rely on the
kernel driver setting it to 0x100000 (GMEM base is 0 on such GPUs).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3979>
There's still a TODO related to key->sample_shading, but it doesn't look
like it changes anything in ir3, so it works without that.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3923>
This has only lightly been tested. It passes dEQP-VK.api.smoke.triangle,
so at least we're able to show a triangle. For now, it's just enabled
under a debug flag. In the future we'll probably want some heuristics
like what freedreno has and another debug flag to disable it except when
it's forced.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3713>
I merely ported a freedreno patch to turnip which
updates some magic regsiter values.
commit ff6e148a3d
Author: Rob Clark <robdclark@chromium.org>
CommitDate: Tue Oct 29 09:19:34 2019 -0700
Subject: freedreno/a6xx: add a618 support
That's all that Rob did for gallium for a618, so I assume that's we need
for turnip also.
Tested manually with:
dEQP-VK.api.image_clearing.core.clear_color_image.2d.linear.single_layer.*
pass 300/555
fail 0/555
skip 255/555
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3743>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3743>
The value of some magic regsiters differ across chipsets. fd6_context
manages the differences by initializing them at runtime. Let's do the
same.
Add to tu_physical_device a subset of those found in fd6_context:
RB_UNKNOWN_8E04_blit
RB_CCU_CNTL_gmem
PC_UNKNOWN_9805
SP_UNKNOWN_A0F8
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3743>
In addition to preparing us for dynamically resizing them, which has to
be controlled by the device, this greatly reduces the memory usage when
allocating large numbers of command buffers, making
dEQP-VK.api.object_management.max_concurrent.command_buffer_primary go
from crash -> pass.
Reviewed-by: Kristian H. Kristensen <hoegsberg@gmail.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3621>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3621>
Pretty straightforward: Port texture descriptor code from freedreno, fill
in alignment limits from closed vk, and tu_cmd_buffer.c was already
uploading the texture descriptor.
This doesn't implement storage texel buffers (required in the compute
pipeline) yet, since those will need an IBO descriptor for the store path.
Still, making the load path be connected to the texture descriptor won't
hurt.
Part of #2237
Fixes dEQP-VK.binding_model.shader_access.primary_cmd_buf.uniform_texel_buffer.*
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3522>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3522>
It looks like the actual tile alignment requirement is less than 32x32, but
in some cases input attachment texture needs 64 alignment.
Reduced the h alignment to 16 to compensate and it seems to work fine.
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Eric Anholt <eric@anholt.net>
This gets us shared non-UBWC layout code between gallium and turnip.
Until I fix up the rest of gallium to handle UBWC mipmapping, we do the
single-level UBWC setup in gallium as a fixup after layout.
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Fixes the deqp fails in:
dEQP-VK.pipeline.sampler.*border*
(minus 1d array/d24 cases which fail for other reasons)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
This is enough to pass
dEQP-VK.binding_model.shader_access.primary_cmd_buf.storage_buffer.fragment.single_descriptor.*
with fragmentStoresAndAtomics set, and thus to be able to start working on
compute. I haven't enabled that flag yet, because it also implies image
load/store support, which I haven't filled in.
Reviewed-by: Jonathan Marek <jonathan@marek.ca>
They're not implemented, and not critical to bring up immediately. Avoids
failures in the CTS when nothing gets written to the query.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
A640 seems to work without any other changes (glmark and vkcube).
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
The resulting locale is not used for Vulkan, and it is not reference
counted, giving issues when multiple instances are created.
CC: 19.2 19.3 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
../src/freedreno/vulkan/tu_device.c:900:4: error: initializer element is not constant
.minImageTransferGranularity = (VkExtent3D) { 1, 1, 1 },
^
Suggested-by: Kristian Høgsberg <krh@bitplanet.net>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110698
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Rob Clark <robdclark@gmail.com>
These were updated in version 1.1.106 of vulkan.h to make more sense
with the extension names. We may as well keep with the times.
See also: 90108deb27 "anv: Update to use the new features struct names"
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
The fd is -1, thus the block of if (fd != -1) close(fd) is dead code.
Cc: Chad Versace <chadversary@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
That is, drop KHR from all tokens that were promoted to Vulkan 1.1.
The consistency makes ctags more useful (it now jumps directly to the
real definitions in vulkan_core.h instead of the typedefs); and it makes
the code slightly less verbose.
Save SPIR-V in tu_shader_module. Tranlation to NIR happens in
tu_shader_create, and compilation to binary code happens in
tu_shader_compile. Both will be called during pipeline creation.
This should be quite complete feature-wise. External fences are
still missing. We probably also want to add a simpler path to
tu_WaitForFences for when fenceCount == 1.
Build drm_msm_gem_submit_bo array directly in tu_bo_list. We might
change this again, but this is good enough for now.
There are other issues as well, such as not using
VkAllocationCallbacks and sloppy error checking. We should revisit
this in the near future. Same to tu_cs.
This creates a new fd on each queue submit. I do not go with
DRM_IOCTL_MSM_WAIT_FENCE solely because the path is marked legacy.
Otherwise, we can use the fence id rather than requesting a fence
fd until external fences are supported and enabled.