Engines based contexts operate somewhat different for executing
batches. Previously, we would specify a bitmask value such as
I915_EXEC_RENDER to specify to run the batch on the render ring.
With engines contexts, instead this becomes an array of "engines", and
when the context is created we specify the class and instance of the
engine.
Each index in the array has a separate hardware-context. Previously we
had to create separate kernel level contexts to create multiple
hardware contexts, but now a single kernel context can own multiple
hardware contexts.
Another forward looking advantage to using the engines based contexts
is that the kernel does not plan to add new supported I915_EXEC_FOO
masks, whereas they instead plan to add new I915_ENGINE_CLASS_FOO
engine classes. Therefore some rings may only be usable with an engine
based class.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12692>
Instead of having a bunch of const vk_sync_type for each permutation of
vk_drm_syncobj capabilities, have a vk_drm_syncobj_get_type helper which
auto-detects features. If a driver can't support a feature for some
reason (i915 got timeline support very late, for instance), they can
always mask off feature bits they don't want.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
BO waits aren't going away any time soon so using a syncobj here doesn't
really gain us anything. It just makes it more complicated.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
These are entirely unused except for the syncobj in submit_simple_batch.
We can use the libdrm wrappers for that as they're basically equivalent
and the core Vulkan sync code already depends on them.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This is, unfortunately, a large flag-day mega-commit. However, any
other approach would likely be fragile and involve a lot more churn as
we try to plumb the new vk_fence and vk_semaphore primitives into ANV's
submit code before we delete it all. Instead, we do it all in one go
and accept the consequences.
While this should be mostly functionally equivalent to the previous
code, there is one potential perf-affecting change. The command buffer
chaining optimization no longer works across VkSubmitInfo structs.
Within a single VkSubmitInfo, we will attempt to chain all the command
buffers together but we no longer try to chain across a VkSubmitInfo
boundary. Hopefully, this isn't a significant perf problem. If it ever
is, we'll have to teach the core runtime code how to combine two or more
VkSubmitInfos into a single vk_queue_submit.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
These are about to be the only use of anv_gettime_ns() so lets switch
them over so the next patch can delete the helper outright.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This gets rid of the bespoke vfunc interface in the WSI code.
Eventually, I'd love to get rid of wsi_display_fence entirely but
this at least makes it a lot more palatable.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13427>
This effectively partially reverts 13fe43714c ("anv: Add helpers in
anv_allocator for mapping BOs") where we both added helpers and reworked
memory mapping to stash the maps on the BO. The problem comes with
external memory. Due to GEM rules, if a memory object is exported and
then imported or imported twice, we have to deduplicate the anv_bo
struct but, according to Vulkan rules, they are separate VkDeviceMemory
objects. This means we either need to always map whole objects and
reference-count the map or we need to handle maps separately for
separate VkDeviceMemory objects. For now, take the later path.
Fixes: 13fe43714c ("anv: Add helpers in anv_allocator for mapping BOs")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5612
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13795>
We need to guarantee that when vkQueueSubmit() returns the application
can actually wait on a signaled semaphore/syncobj.
When using a thread to do the submission to i915, this gets a bit
tricky in the following case :
A syncobj is used both as a wait & signal semaphore and has been
signaled once already. It contains a fence before entering
vkQueueSubmit().
This means we need to reset the syncobj to ensure when we return
from vkQueueSubmit(), the syncobj contains no stale fence.
Currently in the Anv, the submission thread is in charge of putting
the new fence in the syncobj and also picks up the wait fence
directly from the syncobj. This means we can't reset the syncobj
from vkQueueSubmit().
The solution to this has been pointed by Bas & Jason :
In vkQueueSubmit(), clone the wait syncobj fence into a new
temporary syncobj that will be destroy after submission and use
this temporary syncobj as a wait fence for i915. This allows us to
reset the original syncobj in vkQueueSubmit().
For this to work with wait_before_signal behavior, we also need to
do a wait-on-materialize on binary semaphores from vkQueueSubmit().
Otherwise the application thread calling vkQueueSubmit() could race
the submission thread and pick up the wrong fence when cloing.
v2: Use copy semantic for clone_syncobj_dma_fence() (Jason)
Do the cloning prior to adding the syncobj to anv_queue_submit so
that if the cloning fails don't have an invalid syncobj in
anv_queue_submit (Jason)
v3: Fix another syncobj leak (Jason)
v4: Fix invalid argument order (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4945
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11474>
From VUID-VkAttachmentReference2-attachment-04755:
"If attachment is not VK_ATTACHMENT_UNUSED, and the format of the
referenced attachment is a depth/stencil format which includes both
depth and stencil aspects, and layout is
VK_IMAGE_LAYOUT_DEPTH_ATTACHMENT_OPTIMAL or
VK_IMAGE_LAYOUT_DEPTH_READ_ONLY_OPTIMAL, the pNext chain must include
a VkAttachmentReferenceStencilLayout structure."
We did not check that there even could be a stencil layout
before fetching it.
Fixes a few tests from:
dEQP-VK.image.depth_stencil_descriptor.depth_read_only_optimal.*
Fixes: 979ea394e5 "vulkan/util: Move
helper functions for depth/stencil images to vk_iamge"
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13740>
We reference the scratch BO using a bindless index in the command
streamer instructions, but we forgot to add them to the BO list.
v2: Make use of pipeline reloc list (Jason)
v3: Don't add NULL BOs to the reloc list (Lionel)
v4: Don't add BOs twice to reloc list when dealing with addresses
(Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: eeeea5cb87 ("anv: Add support for scratch on XeHP")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13544>
If we ever want to stop depending on the EXEC_OBJECT_PINNED to detect
when something is pinned (like for VM_BIND), having a helper will reduce
the code churn. This also gives us the opportunity to make it compile
away to true/false when we can figure it out just based on compile-time
GFX_VERx10.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
Starting with 3b363d5b55 ("anv: Assume syncobj support"), we assume
syncobj support and no longer use the execbuf sync_file API directly so
there's no point in checking for it. For the one physical device check
this deletes, we can assume has_exec_fence is always true because every
kernel with syncobj support also has sync_file.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
Soft-pin is but one possible mechanism for pinning buffers. We're
working on another called VM_BIND. Most of the time, the real question
we're asking isn't "are we using soft-pin?" but rather "are we using
relocations?" because it's relocations, and not soft-pin, that cause us
all the extra pain we have to write code to handle. This commit flips
the majority of those checks around. The new helper is currently just
the exact inverse of the old use_softpin helper.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
When the client calls vkMapMemory(), we have to align the requested
offset down to the nearest page or else the map will fail. On platforms
where we have DRM_IOCTL_I915_GEM_MMAP_OFFSET, we always map the whole
buffer. In either case, the original map may start before the requested
offset and we need to take that into account when we clflush.
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
This is left-over from the days where we didn't use BO pointers and had
them embedded and we didn't have a nice allocation API. Now that block
pools get their memory via anv_device_alloc_bo(), they really should be
using anv_device_release_bo() to tear it down. This is equivalent in
this case because anv_device_release_bo() does the following:
1. Decrements refcount. The pool holds the only reference so it will
proceed onto the clean-up steps
2. Unmaps it, if needed. This is the same as the tear-down code today
except anv_device_release_bo() will avoid doing an unmap if it's
been created via userptr. The current code probably unmaps the
userptr which is wrong but pretty harmless since it happens on a
tear-down path.
3. Unmaps the CCS range from the AUX-TT, if any. There is none, so
this is a no-op.
4. Frees the VA range if it's not pinned and doesn't have a fixed VA
assignment. These BOs always either have a fixed VA range (softpin)
or aren't pinned (relocations).
5. Closes the GEM handle. Same as the current code.
In short, anything created using the BO API should probably be destroyed
that way. We were getting away with hand-rolling it because this is a
simple case. Why did we do it this way? It dates back to before the
more formlized BO cache and BO API in ANV.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
The anv_bo_finish helper is explicitly supposed to be capable of tearing
down partially completed BOs as long as the flag and metadata parameters
are set correctly.
Reviwed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13610>
We don't current enable post sync operations, but it is probably
better to set them to "internal" MOCS than to remove the non-zero
checking for this genxml field.
Reworks:
* Fix COMPUTE_WALKER in cmd_buffer_trace_rays (s-b Jason)
Fixes: 7b78b2fcac ("intel/genxml: Assert that all MOCS fields are non-zero on Gfx7+")
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13624>
On GFX version 12.5+ with COMPUTE_WALKER, this is the limit based on the
size of the HW packet. On older HW, we can technically go a bit bigger
but there's not much point. Technically, some hardware can support a
scalar workgroup size up to 2048 but most apps don't go any bigger than
1024.
As discussed on the merge request page, the current limit assumes
SIMD32, but it is unclear if we want to encourage applications to use
SIMD32 if it may lead to additional register spilling in shader
programs. Many applications have likely tuned for a limit of 1024
based on the OpenGL minimum limit, so it might not gain much by
advertising more than 1024.
Reworks:
* Jordan: Use MIN2 and limit total invocations as well.
* Jordan: Add second paragraph to commit message based on merge
request discussion.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13538>
Patch moves initialization of variable so that we have fd when calling
wsi initialization.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12305>
Enum values for MESA_SHADER_TASK and MESA_SHADER_MESH are larger than
MESA_SHADER_FRAGMENT, so can't rely on the them for ordering anymore.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13637>
We could decouple the locations in the array from the gl_shader_stage
enum values, but for now this is convenient.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13637>
The code is correct, but compiler can't see it. Initialize the value
to NULL and assert on it if the function succeeds. It both helps the
compiler and make the code slightly more robust.
```
../src/intel/vulkan/anv_queue.c: In function ‘anv_QueueSubmit2KHR’:
../src/intel/vulkan/anv_queue.c:932:16: warning: ‘impl’ may be used uninitialized in this function [-Wmaybe-uninitialized]
932 | result = anv_queue_submit_add_timeline_wait(queue, submit,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
933 | &impl->timeline,
| ~~~~~~~~~~~~~~~~
934 | value);
| ~~~~~~
../src/intel/vulkan/anv_queue.c:899:31: note: ‘impl’ was declared here
899 | struct anv_semaphore_impl *impl;
| ^~~~
```
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13638>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we use
MOCS of 0 is disabled stream output targets, MOCS shouldn't matter, as
there's no actual buffer to be cached.
That said, it should be harmless to set MOCS for these null stream
output buffers; we can just assume a MOCS for generic internal buffers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we use
MOCS of 0 is 3DSTATE_VERTEX_BUFFERS where we set NullVertexBuffer.
It shouldn't matter here, as there's no actual buffer to be cached.
That said, it should be harmless to set MOCS for null vertex buffers.
We can assume an internal buffer and request isl's vertex buffer MOCS.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
This avoids MOCS != 0 assertions in later patches. iris also does this,
and we do it for the 3DSTATE_CONSTANT_ALL packet path as well. It's a
bit pointless, but it should hopefully be harmless also.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We were only setting this on Gfx9+. It's MBZ on Gfx8, but it exists
on Gfx7.x and doesn't have those restrictions there; we should set it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
isl now uses info->mocs regardless of whether there's any actual
depth/stencil/HiZ buffers involved, so pass it a legitimate one,
rather than zero. When we have entirely NULL surfaces, we just
default to isl's MOCS value for an internal depth buffer.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
While filling out surface state, pass correct aux usage for storage
images as we support compression on XeHPG.
v2: (Jason Ekstrand)
- Move assertion down a bit
- Use general layout aux usage
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3606>
Tigerlake revision 0 is an early stepping that should not be used in
production anywhere, so this code was only used for hardware bringup.
We can drop the unnecessary workarounds. This also keeps them from
triggering on early steppings of other Gfx12 parts.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13266>
Fix defect reported by Coverity Scan.
Assign instead of compare (PW.ASSIGN_WHERE_COMPARE_MEANT)
assign_where_compare_meant: use of "=" where "==" may have been intended
Fixes: 35315c68a5 ("anv: Use the common wrapper for GetPhysicalDeviceFormatProperties")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13395>
INTEL_DEBUG is defined (since 4015e1876a) as:
#define INTEL_DEBUG __builtin_expect(intel_debug, 0)
which unfortunately chops off upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.
Fix this by changing the definition of INTEL_DEBUG to be function-like
macro with "flags" argument. New definition returns 0 or 1 when
any of the flags match.
Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>
This sets the conformance version to 0.0.0.0 for GPUs that have
incomplete support for vulkan, so that it's easier to check if vulkan is
fully supported by a GPU at runtime for applications/libraries.
$ vulkaninfo|grep conf
MESA-INTEL: warning: Ivy Bridge Vulkan support is incomplete
conformanceVersion = 0.0.0.0
Signed-off-by: Clayton Craft <clayton@craftyguy.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13275>
This patch adds Tessellation Distribution on top of Geometry
Distribution. Using recommended values based on performance studies
across a range of workloads.
Rework:
- Add comment for new packet bits (Sagar)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12091>
Using recommended values based on performance studies across a range
of workloads.
Rework:
* Always enable geometry distribution
* Set ListCutIndexEnable if primitive restart is enabled
* Set distribution mode based on TEEnable
* Add comment explaining the 3DSTATE_VFG bits (Sagar)
v2:
- Emit 3DSTATE_VFG dynamically based on primitive restart (Ken)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12091>
Small typo/copy-paste.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c0093c4668 ("anv: Flip around the way we reason about storage image lowering")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13332>
Otherwise we hit assert in vk_object_base_assert_valid when attemping to
create handle from anv_fence with unknown base type.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13330>
Common queue submit expects pWaitDstStageMask to be set per each
semaphore (as per Vulkan spec) and crashes if these are not given
properly.
This fixes crashes seen when running vulkan apps on Android.
v2: change the VkPipelineStageFlags given (Lionel)
Fixes: b996fa8efa ("anv: implement VK_KHR_synchronization2")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13305>
Instead of doing a tile cache flush after slow clears, resolves, and
ambiguates, do it before fast clears of HIZ_CCS_WT surfaces. This agrees
with the Bspec.
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11539>
There are roughly two cases when it comes to storage images. In the
easy case, we have full hardware support and we can just emit a typed
read/write message in the shader and we're done. In the more complex
cases, we may need to fall back to a typed read with a different format
or even to a raw (SSBO) read.
The hardware has always had basically full support for typed writes all
the way back to Ivy Bridge but typed reads have been harder to come by.
Starting with Skylake, we finally have enough that we at least have a
format of the right bit size but not necessarily the right format so we
can use a typed read but may still have to do an int->unorm or similar
cast in the shader.
Previously, in ANV, we treated lowered images as the default and write-
only as a special case that we can optimize. This flips everything
around and treats the cases where we don't need to do any lowering as
the default "vanilla" case and treats the lowered case as special.
Importantly, this means that read-write access to surfaces where the
native format handles typed writes now use the same surface state as
write-only access and the only thing that uses the lowered surface state
is access read-write access with a format that doesn't support typed
reads. This has the added benefit that now, if someone does a read
without specifying a format, we can default to the vanilla surface and
it will work as long as it's a format that supports typed reads.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13198>
Description by Coverity:
"Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression 1 << b with type int
(32 bits, signed) is evaluated using 32-bit arithmetic, and then used in
a context that expects an expression of type VkAccessFlags2KHR (64 bits,
unsigned)"
CID: 1492745
CID: 1492748
Fixes: b996fa8efa ("anv: implement VK_KHR_synchronization2")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13284>
AHARDWAREBUFFER_USAGE_CAMERA_MASK enum is defined later and gets
included in the stub headers.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13255>
Make u_vector_init a wrapper to u_vector_init_pot. Let both take
(element_count, element_size) as parameters.
Motivated by eed0fc4caf ("vulkan/wsi/wayland: fix an invalid
u_vector_init call")
v2: rename u_vector_init_pot to u_vector_init_pow2
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13201>
The VK_ERROR_FRAGMENTED_POOL and VK_ERROR_OUT_OF_POOL_MEMORY errors are
not as exceptional cases as most. These are expected to be hit by
applications in the normal course of doing their thing. Probably best
not to spam stderr and the debug logs with them.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13045>
Now that all spirv_to_nir() users take care of converting sysvals to
varyings, we can unconditionally declare FragCoord as a sysval.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13017>
This is an attempt at simplifying the spirv_to_nir() backend when it
comes to choosing between system values and input varyings. Let's patch
drivers to do the sysval to input varying conversion on their own so we
can get rid of the frag_coord_is_varying field in spirv_to_nir_options
and unconditionally create create sysvals for FragCoord, FrontFacing and
PointCoord inputs instead of adding new xxx_is_{sysval,varying} flags.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13017>
v2: Use u_foreach_bit64() (Samuel)
v3: Add missing handling of VkMemoryBarrier2KHR in pNext of
VkSubpassDependency2KHR (Samuel)
v4: Remove unused ANV_PIPELINE_STAGE_PIPELINED_BITS (Ivan)
v5: fix missing anv_measure_submit() (Jason)
constify anv_pipeline_stage_pipelined_bits (Jason)
v6: Split flushes & invalidation emissions on
vkCmdSetEvent2KHR()/vkCmdWaitEvents2KHR() (Jason)
v7: Only apply flushes once on events (Jason)
v8: Drop split flushes for this patch
v9: Add comment about ignore some fields of VkMemoryBarrier2 in
VkSubpassDependency2KHR (Jason)
Drop spurious PIPE_CONTROL change s/,/;/ (Jason)
v10: Fix build issue on Android (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9045>
New access flags & pipeline stages got added for transform feedback
and we missed handling them.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9045>
When enabling a new feature we made the mistake of initializing some fields
of the device object conditionally, which leads to crashes later. Initializing
those fields would be a trivial fix, but it's probably better to just zero
everything at allocation time and prevent any future screwups. Device objects
are allocated rarely enough for this additional memset to not matter for
performance.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13221>
We do, however, leave a nice tombstone comment in case anyone comes
looking. Given the age and scarcity of Cherry View hardware and ASTC
apps that run on desktop Linux, it's unlikely we'll ever bother to
implement it.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13206>
these were duplicated all over the place, and it's annoying to have to keep
duplicating them any time a new component includes the vulkan header
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13141>
v2 (Jason Ekstrand):
- Switch the order of arguments to be device, image, other stuff
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
This is similar to a patch from Lionel except works in terms of aspects
rather than bindings. This makes it easy to use from the Android code.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
We're setting the same field a dozen lines below to the exact same
value.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13169>
In preparation for adding support for the graphics mesh pipeline,
identify all the paths that are specific the primitive pipeline.
This shouldn't change any behavior since the code currently only
supports the primitive pipeline.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13047>
So that we can use the active_stages in copy_non_dynamic_state() later.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13047>
This is already handled by vk_device_init(); drivers no longer
need to do it themselves.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12867>
There are exceptions in spec where the framebuffer image format and
format given for render pass attachment may differ. This happens in
particular when subpass has resolve attachment that might resolve
only depth from a combined depth+stencil format. There the formats do
not need to match but be 'compatible' with each other.
As example using VK_FORMAT_D32_SFLOAT format is considered compatible
when actual framebuffer format is VK_FORMAT_D32_SFLOAT_S8_UINT.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13082>
emit_slice_hashing_state gets called once per queue, meaning
device->slice_hash can get allocated multiple times. This can be
reproduced by setting the env-var ANV_QUEUE_OVERRIDE=gc=2.
Reworks:
* Only pack the struct once (s-b Lionel, Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13114>
Reworks:
* Set BLORP_BATCH_USE_COMPUTE in anv_blorp_batch_init (s-b Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
Reworks:
* Let blorp_clear handle DEBUG_BLOCS
* Old subject was: "Use compute blorp for vkCmdFillBuffer with
INTEL_DEBUG=blocs"
* Old subject was: "anv/blorp: Support params.cs_prog_data being set"
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
This knowledge was repeated in multiple places so move the values to
intel_device_info struct.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13014>
Since only Anv uses the value, I'm only enabling this on anv.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 518693c3ec ("spirv: Handle the SubgroupSize execution mode")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13034>
The populate_base_prog_key will set VARYING depending if the pipeline
flag is used. Later, when full subgroups flag is set, it will flip to
UNIFORM -- which for compute shaders is effectively the same, so don't
bother setting it again.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12946>
The lowering passes will soon be moved to another function, so there
won't be any choice.
As a side benefit, this allows eliminating the uses_atomic_load_store
**pointer** parameter from brw_nir_lower_storage_image. For some reason
crocus was passing false instead of NULL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
This consolidates several duplicated pieces of code into devinfo.
max_scratch_ids is an array that provides the max number of threads
for the rendering and compute stages.
This fixes some exceptions missed by crocus for scratch ids on haswell
and cherryview.
It also fills out devinfo->max_scratch_ids properly for stages VS
through CS on Gfx12.5. But, functionally this should not make a
difference as Gfx12.5 already uses COMPUTE for all stages.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12799>
By default, Mesa's X11 Vulkan WSI will wait for buffers to be ready
before submitting them to Xwayland when the swapchain is created
with the IMMEDIATE mode.
This is undesirable when the Wayland compositor already monitors
fences. A Wayland compositor may want to know the delay between
the buffer submition and the end of the GPU work, this is impossible
to measure if the WSI waits for the buffer to be ready before
submission.
Since most compositors don't monitor fences, let's introduce a driconf
option for this for now. We can reconsider once more compositors
have better support for fences.
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11290>
With LSC support, we can do 64-bit atomics with A32/64 messages.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12566>
For now, only mark the 4x8BitPacked variants as accelerated.
Applications are unlikely to use the "add with saturate" opcodes from
VK_INTEL_shader_integer_functions2, so, technically, all of the
AccumulatingSaturating variants "[provide] a performance advantage over
user-provided code composed from elementary instructions..." on all
Intel platforms. If we encounter an application that cares, we can do
things differently then. Ditto for the non-packed 8Bit, 4-element
vector variants.
v2: Don't memset props as this also zeros sType and pNext. Noticed by
Georg Lehmann in !12617.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12624>
This commit does several things:
* Unify code common to several drivers by evaluating INTEL_NO_HW within
intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
code simplification, but we may find reason to undo this later on.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
The prior method of checking the result of getenv() for NULL would cause
the feature to be enabled for INTEL_NO_HW=0.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
Implement the workarounds in anv and iris instead.
Before this commit, ISL unconditionally modified workaround registers
while filling out depth stencil state. To account for this, drivers
unconditionally stalled prior to emitting depth stencil packets. This
hurt performance.
By having the drivers perform the workarounds, they can choose when to
modify the relevant registers. The drivers now avoid emitting the
workaround for NULL depth buffers. This reduces stalls and leads to
better performance.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (the ISL/Anv bits)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (the Iris bits)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>
Instead of making LMEM the special case, unify the two paths by setting
up a fake drm_i915_query_memory_regions struct and filling it out based
on OS queries. The important functional change here is that we now pass
system memory through the same GTT size and 3/4 filter that we were
using with the OS queries. This should make behavior consistent on
integrated GPUs regardless of whether or not we have the memory region
query API.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12433>
Even though we can't really do the parsing on behalf of the driver (it's
too complicated), storing it in the vk_image lets us provide a common
implementation of vkGetImageDrmFormatModifierPropertiesEXT(). It'll
also be useful in the next few commits for swapchain images.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12023>
In theory, with linear vs. tiled differences, it could be different
(RGBA vs. RGB etc.) but it won't matter for the two checks we do with
it. Also, we probably want to be checking the real format here anyway.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12023>
This is mostly a bit of future-proofing. We never end up with offsets
that don't fit in 32 bits today because, thanks to driver limitations
caused by relocations, we don't allocate buffers bigger than 2GB today.
However, if we ever did, it's possible to create a surface on modern
platforms that consumes more than 4GB and we would end up with wrapping
in our offset calculations.
Acked-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11765>
When creating an image out of a swapchain on Android, the android
layer call will detect a VkBindImageMemorySwapchainInfoKHR in the
pNext chain of the vkBindImageMemory2() call and add a
VkNativeBufferANDROID in the chain. This is what we should use as
backing memory for that image.
v2: Fix a couple of obvious mistakes (Tapani)
v3: Silence build warning (Lionel)
Fix invalid object argument to vk_error() (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: bc3c71b87a ("anv: don't try to access Android swapchains")
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5180
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12244>
It's called anv_image_* so it really should take an anv_image. For the
couple of cases where we really want to pass in a set of aspects, we
leave an anv_aspect_to_plane() helper. anv_image_aspect_to_plane() is
then just a wrapper around it which grabs the aspects from the image.
While we're in the area, sprinkle some const around.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
The new versions should have identical output, just a simpler (and
probably faster) implementation and more/better asserts.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
The Vulkan 1.2.184 spec says:
"When creating a VkImageView, if sampler Y′CBCR conversion is
enabled in the sampler, the aspectMask of a subresourceRange used by
the VkImageView must be VK_IMAGE_ASPECT_COLOR_BIT.
When creating a VkImageView, if sampler Y′CBCR conversion is not
enabled in the sampler and the image format is multi-planar, the
image must have been created with
VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT, and the aspectMask of the
VkImageView’s subresourceRange must be VK_IMAGE_ASPECT_PLANE_0_BIT,
VK_IMAGE_ASPECT_PLANE_1_BIT or VK_IMAGE_ASPECT_PLANE_2_BIT."
Previously, for YCbCr images, we were flipping this around. For single-
plane views where VK_IMAGE_ASPECT_PLANE_N_BIT would be passed in by the
app, we would store VK_IMAGE_ASPECT_COLOR_BIT. For multi-plane views
where the client says VK_IMAGE_ASPECT_COLOR_BIT, we would store a all of
the planes. (There was also an extra bit of remapping that would
compact the planes in the non-existent case of a format with a non-
contiguous set of planes.) The idea behind this was that for things
like rendering or single-plane sampling, storage, or compute, we want it
to look as much like a single-plane image as possible but we wanted the
multi-plane case to be the awkward one.
This commit changes it around so that iview->aspects is always exactly
the subset of image->vk.aspects represented by the view. This is
identical to how aspects work for depth/stencil so it gains us some
consistency.
This commit also changes anv_image_view::aspect_mask to aspects to force
a full audit of the field. As can be seen, there are only a few uses of
this field and they're all mostly fine:
- A bunch of them are used to check for depth/stencil. That hasn't
changed.
- Most of the checks for color already used ANY_COLOR_BIT, only one
needed fixing.
- There's a check that both src/depth are color for MSAA resolves.
However, we don't support MSAA on YCbCr so there's no point in
checking for ANY_COLOR_BIT.
There is a hidden usage of planes in anv_descriptor_set_write_image_view
that's not as obvious. However, this function simply looks at
anv_image_view::n_planes and blindly fills out the descriptor
accordingly. As long as image views with a single plane continue to
claim n_planes == 1, this will be fine.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
Previously, we initialized vplane in anv_CreateImageView to 0 and
incremented it every iteration of the aspect loop. This only works
because planes are guaranteed to be in aspect-bit-order which wasn't
documented anywhere. Instead, drop this assumption and burn a couple
CPU cycles properly calculating vplane.
While we're here, make iplane const as well.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
When creating a single-plane view of a multi-plane image, we were
relying on vplane_aspect to be VK_IMAGE_ASPECT_COLOR_BIT so that
anv_get_format_plane of the single-plane view format would work.
Instead of relying on this quirk, we can drop vplane_aspect and rely
entirely on vplane to only be 0 in this case. In the case of depth or
stencil images, we still need to grab the format aspect but we can use
the actual aspect and don't need the vplane_aspect trickery.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
Once we get past depth/stencil, what we really want is plane 0 not the
color aspect. A bunch of those formats don't have a single color
aspect.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
Unlike anv_get_format_aspect, this takes a plane number which is
relative to the set of aspects on the format. There are a number of
cases where we already have the plane and so re-fetching it is useless.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
The comment about modifiers is bogus because we check the modifier
before this check and return early. Also, there's no reason why we need
to check the requested aspect when we could check the format itself.
anv_image_aspect_to_plane will ensure that the requested aspect is one
that actually exists.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
Vulkan allows us to, in theory, support ycbcr on single-plane formats if
the client really wants it. Also, these functions should work on a
multi-plane color image as long as the client specifies the right
aspect. This gets rid of our usage of can_ycbcr outside of anv_image.c.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
This commit reverts 58e9371141 as now iris driver can import stencil.
This makes ext_external_objects-vk-stencil-display pass and X-Plane 11
vulkan rendering backend to work with anv + iris.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eleni Maria Stea <elene.mst@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10609>
This makes import easier on different gfx generations and we don't
have to lock down on a certain aux layout just yet.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10609>
Add support for driconf overrides on a per-device level, for cases
where we don't want to override behavior for all devices supported
by a particular driver.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12135>
There are two problems with the current architecture.
In OpenGL, the id is supposed to be a unique identifier for a particular
log source. This is done so that applications can (theoretically)
filter particular log messages. The debug callback infrastructure in
Mesa assigns a uniqe value when a value of 0 is passed in. This causes
the id to get set once to a unique value for each message.
By passing a stack variable that is initialized to 0 on every call,
every time the same message is logged, it will have a different id.
This isn't great, but it's also not catastrophic.
When threaded shader compiles are used, the id *pointer* is saved and
dereferenced at a possibly much later time on a possibly different
thread. This causes one thread to access the stack from a different
thread... and that stack frame might not be valid any more. :(
I have not observed any crashes related to this particular issue.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12136>
There are two problems with the current architecture.
In OpenGL, the id is supposed to be a unique identifier for a particular
log source. This is done so that applications can (theoretically)
filter particular log messages. The debug callback infrastructure in
Mesa assigns a uniqe value when a value of 0 is passed in. This causes
the id to get set once to a unique value for each message.
By passing a stack variable that is initialized to 0 on every call,
every time the same message is logged, it will have a different id.
This isn't great, but it's also not catastrophic.
When threaded shader compiles are used, the id *pointer* is saved and
dereferenced at a possibly much later time on a possibly different
thread. This causes one thread to access the stack from a different
thread... and that stack frame might not be valid any more. :(
This fixes shader-db crashes of various kinds on Iris with threaded
shader compiles enabled.
Fixes: 42c34e1ac8 ("iris: Enable threaded shader compilation")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12136>
If the passed VkPipelineRasterizationLineStateCreateInfoEXT wasn't zero
initialized, we copy garbage values that are later on used to set the
state and may end up crashing when they are beyond the limits of the HW.
v2 (Lionel): Simplify if condition
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12121>
If we have 2 command buffers back to back, one with a query pool, one
without, we don't want to retain the second query pool value (NULL).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0a7224f3ff ("anv: group as many command buffers into a single execbuf")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12107>
All vulkan drivers have been copying anv's code to convert
VkSpecializationInfo into nir_spirv_specialization.
Recently there was a Vulkan spec change on allowed values for
VkSpecializationInfo, and all drivers got affected.
This commits creates a new helper, and uses it on all Vulkan Mesa
drivers.
v2: use (uint8_t*) castings, instead of void*, to avoid C2036 with
MSVC (detected by the CI, inspired on what radv was doing)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12047>
The current code we have for this is a bit of a mess, likely due to
trying too hard to put it in anv_android.c. The external_format bit in
anv_image, for instance, really means "quit creation early" which is
something we want to do for AHardwareBuffer imports regardless of
whether or not they use a native format. It gets set both by declaring
an AHardwareBuffer external handle type and by VkExternalFormatANDROID.
However, VkExternalFormatANDROID is only allowed for AHardwareBuffer
imports. If we ever did get an external format outside the context of
an AHardwareBuffer import, we would end up with a useless partially
created image.
When we detect an AHardwareBuffer import, we punt off to a function in
anv_android.c that does nothing interesting but call anv_create_image
with AUX disabled and external_format = true. The aux disable here is
useless because the actual isl_surf layout is done by resolve_ahw_image
which also sets ISL_SURF_USAGE_DISABLE_AUX_BIT. As far as external
formats go, anv_image_from_external() sets it regardless of whether or
not there is actually an external format.
This commit replaces anv_image::external_format with anv_image::from_ahb
which is the thing we actually want to track for this. We delete
anv_image_from_external and a bunch of the external_format handling
because it's all useless. The end result is massively simpler and,
while it appears to blur the boundary between Android code and the rest
of the driver, it makes the whole flow more obvious.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12040>
The only reason we had to refcount semaphores was for the ancient
sync_file semaphores which we used for pre-syncobj kernels. Now that we
assume syncobj and that code is gone, we don't need reference counting
anymore either.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9777>
Sync object for i915 support has been in upstream Linux since 4.14 which
is 3.5 years old at this point and, as far as we can tell, it also
exists in all the ChromeOS kernels. Assuming it allows us to drop some
of our more gnarly synchronization fall-back paths.
At the time of merge, ChromeOS was on the following kernels:
- kernel 3.18: SKL
- kernel 4.4: BYT, KBL, APL
- Kernel 4.14: BDW, GLK
All of the pre-4.14 kernels have had syncobj support back-ported.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9777>
Error handling with DRM_IOCTL_I915_QUERY is tricky and we got it wrong
in one of the two calls here. Use the common helper instead. This also
fixes a theoretical bug where calloc() fails. While we're here, inline
anv_track_meminfo because we're not really benefiting from having it
separate anymore.
Fixes: 65e8d72bc1 "anv: Query memory region info"
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
We also add a helper which contains the standard query+alloc+query
pattern used by anv_gem_get_engine_info(). The caller is required to
free the pointer.
These are declared static inline not because we care about the
performance of these helpers but because we're going to use them in the
intel_device_info code and we don't want a link dependency.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
DRM_IOCTL_I915_QUERY is a multi-query. The most egregious errors are
returned via the usual ioctl error mechanism but there are also
per-query errors that are indicated by item.length < 0. We need to
handle those as well. While we're at it, scrape errno so we can return
a proper integer error.
Fixes: c0d07c838a "anv: Support i915 query (DRM_IOCTL_I915_QUERY)..."
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
We can use a better algorithm from ICL and onward by setting a chicken
bit, but prior to that we need to resort to disabling rectangular lines.
Since we don't support strictLines anyway, this shouldn't be a major
issue.
Closes#2833
Fixes dEQP-VK.rasterization.interpolation_multisample_*_bit.*lines_wide
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11672>
This is distinct form max_cs_threads because it also encodes
restrictions about the way we use GPGPU/COMPUTE_WALKER. This gets rid
of the MIN2(64, devinfo->max_cs_threads) we have scattered all over the
driver and puts it in a central place.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11861>
Cherryview is weird in that the actual limits we can expose through GL
are dependent on fusing information which is only obtainable at runtime.
The same PCI ID may have different configurations with different maximum
CS thread counts. We currently handle this in i965 and ANV by doing the
calculation in the driver.
This dates back to when intel_device_info was computed from the PCI ID.
Now that we have get_device_info_from_fd, we can move the CHV stuff
there and get it out of the driver. This fixes CHV thread counts on
crocus as well.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11861>
We can actually create array surfaces instead of requiring single-slice
in a few cases. This does require us to be very careful about our
checks, though.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11647>
This adds a helper isl_surf_get_uncompressed_surf for creating a surface
which provides an uncompressed view into a compressed surface. The code
is basically a direct port of the uncompressed surface code from the
Vulkan driver which, in turn, was a port from BLORP.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11647>
The unused bo_flags here is a leftover from the past. A similar
setup of bo_flags is now performed within anv_device_alloc_bo
via a call to anv_bo_alloc_flags_to_bo_flags.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11645>
When vkCmdPushDescriptorSetKHR is used, the descriptor set is allocated
internally without belonging to any pool. Such descriptor set will be
visible on the GPU side because it's a part of the dynamic state stream,
but we still have to store its address in the array of descriptor sets.
Complements: 379b9bb7b0 ("anv: Support fetching descriptor addresses from push constants")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11577>
iris, i965, and anv define the PAGE_SIZE in anv_allocator and bufmgr
files. As on FreeBSD the page size is defined in machine/param.h that is
indirectly included by those files, we'd rather define it only when the
system is not FreeBSD to avoid compile errors.
v2: Changed the path in the comment to make clear that machine/params.h
is a FreeBSD system file.
Signed-off-by: Eleni Maria Stea <elene.mst@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11203>
In all cases both variables has a type of uint32_t, so multiplying
them will also generate uint32_t. The results of those multiplications
are used as uint64_t's, so Coverity thinks there might be integer
overflows here.
I don't think it's possible to hit them (query BOs should be relatively
small), but let's avoid those overflows.
CID: 1472820
CID: 1472821
CID: 1472822
CID: 1472824
CID: 1475934
CID: 1475927
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11574>
This is the first time we see an application running out of mmap().
We essentially allocate too many batches (+65k) and end up not being
able to mmap them, at which point we can't mmap anything anymore and
things go sideways.
This change allocates bigger batch BOs as we grow an existing command
buffer. This drastically reduces the number of BOs we need to allocate
(the benchmark that reported the issue now reaches a max of ~630 BOs,
instead of reaching 65k and failing previously).
v2: Track the total batch size of command buffers (Jason)
Just give 0 for batch_len to i915 (Jason)
v3: Fix indentation (Jason)
v4: Drop uncessary reshuffling of error labels (Jason)
v5: Remove empty lines (Marcin)
v6: Limit BO growing to chunks of 16Mb (Jason)
v7: Add assert on initial size (Jason)
v8: Add define for max size (Jason)
v9: Fixup v7 assert for non softpin platforms (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4956
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11482>
VkGraphicsPipelineCreateInfo.pMultisampleState is a pointer to a
VkPipelineMultisampleStateCreateInfo structure, and is ignored if the
pipeline has rasterization disabled.
Fixes a crash in one CTS tests that checks this.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11601>
Rework:
* Jordan: Handle per_thread_scratch==0 in anv_scratch_pool_get_surf
* Jordan: Update subslices in anv_scratch_pool_alloc
* Jason: Clean up the patch a bit
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>
v2 (Jordan Justin):
- add anv_gem_stubs.c impl
v3 (Jason Ekstrand):
- Use the upstream uAPI
- Rework the interface a bit
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>
Create additional memory type with DEVICE_LOCAL_BIT if we have local
memory region aviable.
v2 (Jason Ekstrand):
- Don't leak mem_regions if the second ioctl fails
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>