The code is correct, but compiler can't see it. Initialize the value
to NULL and assert on it if the function succeeds. It both helps the
compiler and make the code slightly more robust.
```
../src/intel/vulkan/anv_queue.c: In function ‘anv_QueueSubmit2KHR’:
../src/intel/vulkan/anv_queue.c:932:16: warning: ‘impl’ may be used uninitialized in this function [-Wmaybe-uninitialized]
932 | result = anv_queue_submit_add_timeline_wait(queue, submit,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
933 | &impl->timeline,
| ~~~~~~~~~~~~~~~~
934 | value);
| ~~~~~~
../src/intel/vulkan/anv_queue.c:899:31: note: ‘impl’ was declared here
899 | struct anv_semaphore_impl *impl;
| ^~~~
```
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13638>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we use
MOCS of 0 is disabled stream output targets, MOCS shouldn't matter, as
there's no actual buffer to be cached.
That said, it should be harmless to set MOCS for these null stream
output buffers; we can just assume a MOCS for generic internal buffers.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We'd like to add safeguards against accidental use of MOCS 0 (uncached),
which can have large performance implications. One case where we use
MOCS of 0 is 3DSTATE_VERTEX_BUFFERS where we set NullVertexBuffer.
It shouldn't matter here, as there's no actual buffer to be cached.
That said, it should be harmless to set MOCS for null vertex buffers.
We can assume an internal buffer and request isl's vertex buffer MOCS.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
This avoids MOCS != 0 assertions in later patches. iris also does this,
and we do it for the 3DSTATE_CONSTANT_ALL packet path as well. It's a
bit pointless, but it should hopefully be harmless also.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
We were only setting this on Gfx9+. It's MBZ on Gfx8, but it exists
on Gfx7.x and doesn't have those restrictions there; we should set it.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
isl now uses info->mocs regardless of whether there's any actual
depth/stencil/HiZ buffers involved, so pass it a legitimate one,
rather than zero. When we have entirely NULL surfaces, we just
default to isl's MOCS value for an internal depth buffer.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13480>
While filling out surface state, pass correct aux usage for storage
images as we support compression on XeHPG.
v2: (Jason Ekstrand)
- Move assertion down a bit
- Use general layout aux usage
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/3606>
Tigerlake revision 0 is an early stepping that should not be used in
production anywhere, so this code was only used for hardware bringup.
We can drop the unnecessary workarounds. This also keeps them from
triggering on early steppings of other Gfx12 parts.
Reviewed-by: Francisco Jerez <currojerez@riseup.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13266>
Fix defect reported by Coverity Scan.
Assign instead of compare (PW.ASSIGN_WHERE_COMPARE_MEANT)
assign_where_compare_meant: use of "=" where "==" may have been intended
Fixes: 35315c68a5 ("anv: Use the common wrapper for GetPhysicalDeviceFormatProperties")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13395>
INTEL_DEBUG is defined (since 4015e1876a) as:
#define INTEL_DEBUG __builtin_expect(intel_debug, 0)
which unfortunately chops off upper 32 bits from intel_debug
on platforms where sizeof(long) != sizeof(uint64_t) because
__builtin_expect is defined only for the long type.
Fix this by changing the definition of INTEL_DEBUG to be function-like
macro with "flags" argument. New definition returns 0 or 1 when
any of the flags match.
Most of the changes in this commit were generated using:
for c in `git grep INTEL_DEBUG | grep "&" | grep -v i915 | awk -F: '{print $1}' | sort | uniq`; do
perl -pi -e "s/INTEL_DEBUG & ([A-Z0-9a-z_]+)/INTEL_DBG(\1)/" $c
perl -pi -e "s/INTEL_DEBUG & (\([A-Z0-9_ |]+\))/INTEL_DBG\1/" $c
done
but it didn't handle all cases and required minor cleanups (like removal
of round brackets which were not needed anymore).
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13334>
This sets the conformance version to 0.0.0.0 for GPUs that have
incomplete support for vulkan, so that it's easier to check if vulkan is
fully supported by a GPU at runtime for applications/libraries.
$ vulkaninfo|grep conf
MESA-INTEL: warning: Ivy Bridge Vulkan support is incomplete
conformanceVersion = 0.0.0.0
Signed-off-by: Clayton Craft <clayton@craftyguy.net>
Reviewed-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13275>
This patch adds Tessellation Distribution on top of Geometry
Distribution. Using recommended values based on performance studies
across a range of workloads.
Rework:
- Add comment for new packet bits (Sagar)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12091>
Using recommended values based on performance studies across a range
of workloads.
Rework:
* Always enable geometry distribution
* Set ListCutIndexEnable if primitive restart is enabled
* Set distribution mode based on TEEnable
* Add comment explaining the 3DSTATE_VFG bits (Sagar)
v2:
- Emit 3DSTATE_VFG dynamically based on primitive restart (Ken)
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12091>
Small typo/copy-paste.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: c0093c4668 ("anv: Flip around the way we reason about storage image lowering")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13332>
Otherwise we hit assert in vk_object_base_assert_valid when attemping to
create handle from anv_fence with unknown base type.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13330>
Common queue submit expects pWaitDstStageMask to be set per each
semaphore (as per Vulkan spec) and crashes if these are not given
properly.
This fixes crashes seen when running vulkan apps on Android.
v2: change the VkPipelineStageFlags given (Lionel)
Fixes: b996fa8efa ("anv: implement VK_KHR_synchronization2")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13305>
Instead of doing a tile cache flush after slow clears, resolves, and
ambiguates, do it before fast clears of HIZ_CCS_WT surfaces. This agrees
with the Bspec.
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11539>
There are roughly two cases when it comes to storage images. In the
easy case, we have full hardware support and we can just emit a typed
read/write message in the shader and we're done. In the more complex
cases, we may need to fall back to a typed read with a different format
or even to a raw (SSBO) read.
The hardware has always had basically full support for typed writes all
the way back to Ivy Bridge but typed reads have been harder to come by.
Starting with Skylake, we finally have enough that we at least have a
format of the right bit size but not necessarily the right format so we
can use a typed read but may still have to do an int->unorm or similar
cast in the shader.
Previously, in ANV, we treated lowered images as the default and write-
only as a special case that we can optimize. This flips everything
around and treats the cases where we don't need to do any lowering as
the default "vanilla" case and treats the lowered case as special.
Importantly, this means that read-write access to surfaces where the
native format handles typed writes now use the same surface state as
write-only access and the only thing that uses the lowered surface state
is access read-write access with a format that doesn't support typed
reads. This has the added benefit that now, if someone does a read
without specifying a format, we can default to the vanilla surface and
it will work as long as it's a format that supports typed reads.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13198>
Description by Coverity:
"Unintentional integer overflow (OVERFLOW_BEFORE_WIDEN)
overflow_before_widen: Potentially overflowing expression 1 << b with type int
(32 bits, signed) is evaluated using 32-bit arithmetic, and then used in
a context that expects an expression of type VkAccessFlags2KHR (64 bits,
unsigned)"
CID: 1492745
CID: 1492748
Fixes: b996fa8efa ("anv: implement VK_KHR_synchronization2")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13284>
AHARDWAREBUFFER_USAGE_CAMERA_MASK enum is defined later and gets
included in the stub headers.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Ryan Neph <ryanneph@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13255>
Make u_vector_init a wrapper to u_vector_init_pot. Let both take
(element_count, element_size) as parameters.
Motivated by eed0fc4caf ("vulkan/wsi/wayland: fix an invalid
u_vector_init call")
v2: rename u_vector_init_pot to u_vector_init_pow2
Signed-off-by: Chia-I Wu <olvaffe@gmail.com>
Reviewed-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Engestrom <eric@engestrom.ch>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13201>
The VK_ERROR_FRAGMENTED_POOL and VK_ERROR_OUT_OF_POOL_MEMORY errors are
not as exceptional cases as most. These are expected to be hit by
applications in the normal course of doing their thing. Probably best
not to spam stderr and the debug logs with them.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13045>
Now that all spirv_to_nir() users take care of converting sysvals to
varyings, we can unconditionally declare FragCoord as a sysval.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13017>
This is an attempt at simplifying the spirv_to_nir() backend when it
comes to choosing between system values and input varyings. Let's patch
drivers to do the sysval to input varying conversion on their own so we
can get rid of the frag_coord_is_varying field in spirv_to_nir_options
and unconditionally create create sysvals for FragCoord, FrontFacing and
PointCoord inputs instead of adding new xxx_is_{sysval,varying} flags.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Hyunjun Ko <zzoon@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13017>
v2: Use u_foreach_bit64() (Samuel)
v3: Add missing handling of VkMemoryBarrier2KHR in pNext of
VkSubpassDependency2KHR (Samuel)
v4: Remove unused ANV_PIPELINE_STAGE_PIPELINED_BITS (Ivan)
v5: fix missing anv_measure_submit() (Jason)
constify anv_pipeline_stage_pipelined_bits (Jason)
v6: Split flushes & invalidation emissions on
vkCmdSetEvent2KHR()/vkCmdWaitEvents2KHR() (Jason)
v7: Only apply flushes once on events (Jason)
v8: Drop split flushes for this patch
v9: Add comment about ignore some fields of VkMemoryBarrier2 in
VkSubpassDependency2KHR (Jason)
Drop spurious PIPE_CONTROL change s/,/;/ (Jason)
v10: Fix build issue on Android (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9045>
New access flags & pipeline stages got added for transform feedback
and we missed handling them.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 36ee2fd61c ("anv: Implement the basic form of VK_EXT_transform_feedback")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9045>
When enabling a new feature we made the mistake of initializing some fields
of the device object conditionally, which leads to crashes later. Initializing
those fields would be a trivial fix, but it's probably better to just zero
everything at allocation time and prevent any future screwups. Device objects
are allocated rarely enough for this additional memset to not matter for
performance.
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13221>
We do, however, leave a nice tombstone comment in case anyone comes
looking. Given the age and scarcity of Cherry View hardware and ASTC
apps that run on desktop Linux, it's unlikely we'll ever bother to
implement it.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13206>
these were duplicated all over the place, and it's annoying to have to keep
duplicating them any time a new component includes the vulkan header
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13141>
v2 (Jason Ekstrand):
- Switch the order of arguments to be device, image, other stuff
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
This is similar to a patch from Lionel except works in terms of aspects
rather than bindings. This makes it easy to use from the Android code.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13199>
We're setting the same field a dozen lines below to the exact same
value.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13169>
In preparation for adding support for the graphics mesh pipeline,
identify all the paths that are specific the primitive pipeline.
This shouldn't change any behavior since the code currently only
supports the primitive pipeline.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13047>
So that we can use the active_stages in copy_non_dynamic_state() later.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13047>
This is already handled by vk_device_init(); drivers no longer
need to do it themselves.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12867>
There are exceptions in spec where the framebuffer image format and
format given for render pass attachment may differ. This happens in
particular when subpass has resolve attachment that might resolve
only depth from a combined depth+stencil format. There the formats do
not need to match but be 'compatible' with each other.
As example using VK_FORMAT_D32_SFLOAT format is considered compatible
when actual framebuffer format is VK_FORMAT_D32_SFLOAT_S8_UINT.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13082>
emit_slice_hashing_state gets called once per queue, meaning
device->slice_hash can get allocated multiple times. This can be
reproduced by setting the env-var ANV_QUEUE_OVERRIDE=gc=2.
Reworks:
* Only pack the struct once (s-b Lionel, Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13114>
Reworks:
* Set BLORP_BATCH_USE_COMPUTE in anv_blorp_batch_init (s-b Jason)
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
Reworks:
* Let blorp_clear handle DEBUG_BLOCS
* Old subject was: "Use compute blorp for vkCmdFillBuffer with
INTEL_DEBUG=blocs"
* Old subject was: "anv/blorp: Support params.cs_prog_data being set"
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11564>
This knowledge was repeated in multiple places so move the values to
intel_device_info struct.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13014>
Since only Anv uses the value, I'm only enabling this on anv.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 518693c3ec ("spirv: Handle the SubgroupSize execution mode")
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13034>
The populate_base_prog_key will set VARYING depending if the pipeline
flag is used. Later, when full subgroups flag is set, it will flip to
UNIFORM -- which for compute shaders is effectively the same, so don't
bother setting it again.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12946>
The lowering passes will soon be moved to another function, so there
won't be any choice.
As a side benefit, this allows eliminating the uses_atomic_load_store
**pointer** parameter from brw_nir_lower_storage_image. For some reason
crocus was passing false instead of NULL.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12858>
This consolidates several duplicated pieces of code into devinfo.
max_scratch_ids is an array that provides the max number of threads
for the rendering and compute stages.
This fixes some exceptions missed by crocus for scratch ids on haswell
and cherryview.
It also fills out devinfo->max_scratch_ids properly for stages VS
through CS on Gfx12.5. But, functionally this should not make a
difference as Gfx12.5 already uses COMPUTE for all stages.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12799>
By default, Mesa's X11 Vulkan WSI will wait for buffers to be ready
before submitting them to Xwayland when the swapchain is created
with the IMMEDIATE mode.
This is undesirable when the Wayland compositor already monitors
fences. A Wayland compositor may want to know the delay between
the buffer submition and the end of the GPU work, this is impossible
to measure if the WSI waits for the buffer to be ready before
submission.
Since most compositors don't monitor fences, let's introduce a driconf
option for this for now. We can reconsider once more compositors
have better support for fences.
Signed-off-by: Simon Ser <contact@emersion.fr>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11290>
With LSC support, we can do 64-bit atomics with A32/64 messages.
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Suggested-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12566>
For now, only mark the 4x8BitPacked variants as accelerated.
Applications are unlikely to use the "add with saturate" opcodes from
VK_INTEL_shader_integer_functions2, so, technically, all of the
AccumulatingSaturating variants "[provide] a performance advantage over
user-provided code composed from elementary instructions..." on all
Intel platforms. If we encounter an application that cares, we can do
things differently then. Ditto for the non-packed 8Bit, 4-element
vector variants.
v2: Don't memset props as this also zeros sType and pNext. Noticed by
Georg Lehmann in !12617.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12624>
This commit does several things:
* Unify code common to several drivers by evaluating INTEL_NO_HW within
intel_get_device_info_from_fd (suggested by Jordan).
* For drivers that keep a copy of the intel_device_info struct, a
separate copy of the no_hw field is now unnecessary. Remove them.
* Minimize kernel queries when INTEL_NO_HW is true. This is done for
code simplification, but we may find reason to undo this later on.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
The prior method of checking the result of getenv() for NULL would cause
the feature to be enabled for INTEL_NO_HW=0.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12007>
Implement the workarounds in anv and iris instead.
Before this commit, ISL unconditionally modified workaround registers
while filling out depth stencil state. To account for this, drivers
unconditionally stalled prior to emitting depth stencil packets. This
hurt performance.
By having the drivers perform the workarounds, they can choose when to
modify the relevant registers. The drivers now avoid emitting the
workaround for NULL depth buffers. This reduces stalls and leads to
better performance.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com> (the ISL/Anv bits)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org> (the Iris bits)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11454>
Instead of making LMEM the special case, unify the two paths by setting
up a fake drm_i915_query_memory_regions struct and filling it out based
on OS queries. The important functional change here is that we now pass
system memory through the same GTT size and 3/4 filter that we were
using with the OS queries. This should make behavior consistent on
integrated GPUs regardless of whether or not we have the memory region
query API.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12433>
Even though we can't really do the parsing on behalf of the driver (it's
too complicated), storing it in the vk_image lets us provide a common
implementation of vkGetImageDrmFormatModifierPropertiesEXT(). It'll
also be useful in the next few commits for swapchain images.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12023>
In theory, with linear vs. tiled differences, it could be different
(RGBA vs. RGB etc.) but it won't matter for the two checks we do with
it. Also, we probably want to be checking the real format here anyway.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12023>
This is mostly a bit of future-proofing. We never end up with offsets
that don't fit in 32 bits today because, thanks to driver limitations
caused by relocations, we don't allocate buffers bigger than 2GB today.
However, if we ever did, it's possible to create a surface on modern
platforms that consumes more than 4GB and we would end up with wrapping
in our offset calculations.
Acked-by: Ivan Briano <ivan.briano@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11765>
When creating an image out of a swapchain on Android, the android
layer call will detect a VkBindImageMemorySwapchainInfoKHR in the
pNext chain of the vkBindImageMemory2() call and add a
VkNativeBufferANDROID in the chain. This is what we should use as
backing memory for that image.
v2: Fix a couple of obvious mistakes (Tapani)
v3: Silence build warning (Lionel)
Fix invalid object argument to vk_error() (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: bc3c71b87a ("anv: don't try to access Android swapchains")
Cc: mesa-stable
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/5180
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12244>
It's called anv_image_* so it really should take an anv_image. For the
couple of cases where we really want to pass in a set of aspects, we
leave an anv_aspect_to_plane() helper. anv_image_aspect_to_plane() is
then just a wrapper around it which grabs the aspects from the image.
While we're in the area, sprinkle some const around.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
The new versions should have identical output, just a simpler (and
probably faster) implementation and more/better asserts.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
The Vulkan 1.2.184 spec says:
"When creating a VkImageView, if sampler Y′CBCR conversion is
enabled in the sampler, the aspectMask of a subresourceRange used by
the VkImageView must be VK_IMAGE_ASPECT_COLOR_BIT.
When creating a VkImageView, if sampler Y′CBCR conversion is not
enabled in the sampler and the image format is multi-planar, the
image must have been created with
VK_IMAGE_CREATE_MUTABLE_FORMAT_BIT, and the aspectMask of the
VkImageView’s subresourceRange must be VK_IMAGE_ASPECT_PLANE_0_BIT,
VK_IMAGE_ASPECT_PLANE_1_BIT or VK_IMAGE_ASPECT_PLANE_2_BIT."
Previously, for YCbCr images, we were flipping this around. For single-
plane views where VK_IMAGE_ASPECT_PLANE_N_BIT would be passed in by the
app, we would store VK_IMAGE_ASPECT_COLOR_BIT. For multi-plane views
where the client says VK_IMAGE_ASPECT_COLOR_BIT, we would store a all of
the planes. (There was also an extra bit of remapping that would
compact the planes in the non-existent case of a format with a non-
contiguous set of planes.) The idea behind this was that for things
like rendering or single-plane sampling, storage, or compute, we want it
to look as much like a single-plane image as possible but we wanted the
multi-plane case to be the awkward one.
This commit changes it around so that iview->aspects is always exactly
the subset of image->vk.aspects represented by the view. This is
identical to how aspects work for depth/stencil so it gains us some
consistency.
This commit also changes anv_image_view::aspect_mask to aspects to force
a full audit of the field. As can be seen, there are only a few uses of
this field and they're all mostly fine:
- A bunch of them are used to check for depth/stencil. That hasn't
changed.
- Most of the checks for color already used ANY_COLOR_BIT, only one
needed fixing.
- There's a check that both src/depth are color for MSAA resolves.
However, we don't support MSAA on YCbCr so there's no point in
checking for ANY_COLOR_BIT.
There is a hidden usage of planes in anv_descriptor_set_write_image_view
that's not as obvious. However, this function simply looks at
anv_image_view::n_planes and blindly fills out the descriptor
accordingly. As long as image views with a single plane continue to
claim n_planes == 1, this will be fine.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
Previously, we initialized vplane in anv_CreateImageView to 0 and
incremented it every iteration of the aspect loop. This only works
because planes are guaranteed to be in aspect-bit-order which wasn't
documented anywhere. Instead, drop this assumption and burn a couple
CPU cycles properly calculating vplane.
While we're here, make iplane const as well.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
When creating a single-plane view of a multi-plane image, we were
relying on vplane_aspect to be VK_IMAGE_ASPECT_COLOR_BIT so that
anv_get_format_plane of the single-plane view format would work.
Instead of relying on this quirk, we can drop vplane_aspect and rely
entirely on vplane to only be 0 in this case. In the case of depth or
stencil images, we still need to grab the format aspect but we can use
the actual aspect and don't need the vplane_aspect trickery.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
Once we get past depth/stencil, what we really want is plane 0 not the
color aspect. A bunch of those formats don't have a single color
aspect.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
Unlike anv_get_format_aspect, this takes a plane number which is
relative to the set of aspects on the format. There are a number of
cases where we already have the plane and so re-fetching it is useless.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
The comment about modifiers is bogus because we check the modifier
before this check and return early. Also, there's no reason why we need
to check the requested aspect when we could check the format itself.
anv_image_aspect_to_plane will ensure that the requested aspect is one
that actually exists.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
Vulkan allows us to, in theory, support ycbcr on single-plane formats if
the client really wants it. Also, these functions should work on a
multi-plane color image as long as the client specifies the right
aspect. This gets rid of our usage of can_ycbcr outside of anv_image.c.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12141>
This commit reverts 58e9371141 as now iris driver can import stencil.
This makes ext_external_objects-vk-stencil-display pass and X-Plane 11
vulkan rendering backend to work with anv + iris.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Eleni Maria Stea <elene.mst@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10609>
This makes import easier on different gfx generations and we don't
have to lock down on a certain aux layout just yet.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10609>
Add support for driconf overrides on a per-device level, for cases
where we don't want to override behavior for all devices supported
by a particular driver.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Emil Velikov <emil.l.velikov@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12135>
There are two problems with the current architecture.
In OpenGL, the id is supposed to be a unique identifier for a particular
log source. This is done so that applications can (theoretically)
filter particular log messages. The debug callback infrastructure in
Mesa assigns a uniqe value when a value of 0 is passed in. This causes
the id to get set once to a unique value for each message.
By passing a stack variable that is initialized to 0 on every call,
every time the same message is logged, it will have a different id.
This isn't great, but it's also not catastrophic.
When threaded shader compiles are used, the id *pointer* is saved and
dereferenced at a possibly much later time on a possibly different
thread. This causes one thread to access the stack from a different
thread... and that stack frame might not be valid any more. :(
I have not observed any crashes related to this particular issue.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12136>
There are two problems with the current architecture.
In OpenGL, the id is supposed to be a unique identifier for a particular
log source. This is done so that applications can (theoretically)
filter particular log messages. The debug callback infrastructure in
Mesa assigns a uniqe value when a value of 0 is passed in. This causes
the id to get set once to a unique value for each message.
By passing a stack variable that is initialized to 0 on every call,
every time the same message is logged, it will have a different id.
This isn't great, but it's also not catastrophic.
When threaded shader compiles are used, the id *pointer* is saved and
dereferenced at a possibly much later time on a possibly different
thread. This causes one thread to access the stack from a different
thread... and that stack frame might not be valid any more. :(
This fixes shader-db crashes of various kinds on Iris with threaded
shader compiles enabled.
Fixes: 42c34e1ac8 ("iris: Enable threaded shader compilation")
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12136>
If the passed VkPipelineRasterizationLineStateCreateInfoEXT wasn't zero
initialized, we copy garbage values that are later on used to set the
state and may end up crashing when they are beyond the limits of the HW.
v2 (Lionel): Simplify if condition
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12121>
If we have 2 command buffers back to back, one with a query pool, one
without, we don't want to retain the second query pool value (NULL).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0a7224f3ff ("anv: group as many command buffers into a single execbuf")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12107>
All vulkan drivers have been copying anv's code to convert
VkSpecializationInfo into nir_spirv_specialization.
Recently there was a Vulkan spec change on allowed values for
VkSpecializationInfo, and all drivers got affected.
This commits creates a new helper, and uses it on all Vulkan Mesa
drivers.
v2: use (uint8_t*) castings, instead of void*, to avoid C2036 with
MSVC (detected by the CI, inspired on what radv was doing)
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12047>
The current code we have for this is a bit of a mess, likely due to
trying too hard to put it in anv_android.c. The external_format bit in
anv_image, for instance, really means "quit creation early" which is
something we want to do for AHardwareBuffer imports regardless of
whether or not they use a native format. It gets set both by declaring
an AHardwareBuffer external handle type and by VkExternalFormatANDROID.
However, VkExternalFormatANDROID is only allowed for AHardwareBuffer
imports. If we ever did get an external format outside the context of
an AHardwareBuffer import, we would end up with a useless partially
created image.
When we detect an AHardwareBuffer import, we punt off to a function in
anv_android.c that does nothing interesting but call anv_create_image
with AUX disabled and external_format = true. The aux disable here is
useless because the actual isl_surf layout is done by resolve_ahw_image
which also sets ISL_SURF_USAGE_DISABLE_AUX_BIT. As far as external
formats go, anv_image_from_external() sets it regardless of whether or
not there is actually an external format.
This commit replaces anv_image::external_format with anv_image::from_ahb
which is the thing we actually want to track for this. We delete
anv_image_from_external and a bunch of the external_format handling
because it's all useless. The end result is massively simpler and,
while it appears to blur the boundary between Android code and the rest
of the driver, it makes the whole flow more obvious.
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12040>
The only reason we had to refcount semaphores was for the ancient
sync_file semaphores which we used for pre-syncobj kernels. Now that we
assume syncobj and that code is gone, we don't need reference counting
anymore either.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9777>
Sync object for i915 support has been in upstream Linux since 4.14 which
is 3.5 years old at this point and, as far as we can tell, it also
exists in all the ChromeOS kernels. Assuming it allows us to drop some
of our more gnarly synchronization fall-back paths.
At the time of merge, ChromeOS was on the following kernels:
- kernel 3.18: SKL
- kernel 4.4: BYT, KBL, APL
- Kernel 4.14: BDW, GLK
All of the pre-4.14 kernels have had syncobj support back-ported.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9777>
Error handling with DRM_IOCTL_I915_QUERY is tricky and we got it wrong
in one of the two calls here. Use the common helper instead. This also
fixes a theoretical bug where calloc() fails. While we're here, inline
anv_track_meminfo because we're not really benefiting from having it
separate anymore.
Fixes: 65e8d72bc1 "anv: Query memory region info"
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
We also add a helper which contains the standard query+alloc+query
pattern used by anv_gem_get_engine_info(). The caller is required to
free the pointer.
These are declared static inline not because we care about the
performance of these helpers but because we're going to use them in the
intel_device_info code and we don't want a link dependency.
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
DRM_IOCTL_I915_QUERY is a multi-query. The most egregious errors are
returned via the usual ioctl error mechanism but there are also
per-query errors that are indicated by item.length < 0. We need to
handle those as well. While we're at it, scrape errno so we can return
a proper integer error.
Fixes: c0d07c838a "anv: Support i915 query (DRM_IOCTL_I915_QUERY)..."
Reviewed-by: Paulo Zanoni <paulo.r.zanoni@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11770>
We can use a better algorithm from ICL and onward by setting a chicken
bit, but prior to that we need to resort to disabling rectangular lines.
Since we don't support strictLines anyway, this shouldn't be a major
issue.
Closes#2833
Fixes dEQP-VK.rasterization.interpolation_multisample_*_bit.*lines_wide
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11672>
This is distinct form max_cs_threads because it also encodes
restrictions about the way we use GPGPU/COMPUTE_WALKER. This gets rid
of the MIN2(64, devinfo->max_cs_threads) we have scattered all over the
driver and puts it in a central place.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11861>
Cherryview is weird in that the actual limits we can expose through GL
are dependent on fusing information which is only obtainable at runtime.
The same PCI ID may have different configurations with different maximum
CS thread counts. We currently handle this in i965 and ANV by doing the
calculation in the driver.
This dates back to when intel_device_info was computed from the PCI ID.
Now that we have get_device_info_from_fd, we can move the CHV stuff
there and get it out of the driver. This fixes CHV thread counts on
crocus as well.
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11861>
We can actually create array surfaces instead of requiring single-slice
in a few cases. This does require us to be very careful about our
checks, though.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11647>
This adds a helper isl_surf_get_uncompressed_surf for creating a surface
which provides an uncompressed view into a compressed surface. The code
is basically a direct port of the uncompressed surface code from the
Vulkan driver which, in turn, was a port from BLORP.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11647>
The unused bo_flags here is a leftover from the past. A similar
setup of bo_flags is now performed within anv_device_alloc_bo
via a call to anv_bo_alloc_flags_to_bo_flags.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11645>
When vkCmdPushDescriptorSetKHR is used, the descriptor set is allocated
internally without belonging to any pool. Such descriptor set will be
visible on the GPU side because it's a part of the dynamic state stream,
but we still have to store its address in the array of descriptor sets.
Complements: 379b9bb7b0 ("anv: Support fetching descriptor addresses from push constants")
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11577>
iris, i965, and anv define the PAGE_SIZE in anv_allocator and bufmgr
files. As on FreeBSD the page size is defined in machine/param.h that is
indirectly included by those files, we'd rather define it only when the
system is not FreeBSD to avoid compile errors.
v2: Changed the path in the comment to make clear that machine/params.h
is a FreeBSD system file.
Signed-off-by: Eleni Maria Stea <elene.mst@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11203>
In all cases both variables has a type of uint32_t, so multiplying
them will also generate uint32_t. The results of those multiplications
are used as uint64_t's, so Coverity thinks there might be integer
overflows here.
I don't think it's possible to hit them (query BOs should be relatively
small), but let's avoid those overflows.
CID: 1472820
CID: 1472821
CID: 1472822
CID: 1472824
CID: 1475934
CID: 1475927
Signed-off-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11574>
This is the first time we see an application running out of mmap().
We essentially allocate too many batches (+65k) and end up not being
able to mmap them, at which point we can't mmap anything anymore and
things go sideways.
This change allocates bigger batch BOs as we grow an existing command
buffer. This drastically reduces the number of BOs we need to allocate
(the benchmark that reported the issue now reaches a max of ~630 BOs,
instead of reaching 65k and failing previously).
v2: Track the total batch size of command buffers (Jason)
Just give 0 for batch_len to i915 (Jason)
v3: Fix indentation (Jason)
v4: Drop uncessary reshuffling of error labels (Jason)
v5: Remove empty lines (Marcin)
v6: Limit BO growing to chunks of 16Mb (Jason)
v7: Add assert on initial size (Jason)
v8: Add define for max size (Jason)
v9: Fixup v7 assert for non softpin platforms (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4956
Cc: mesa-stable
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11482>
VkGraphicsPipelineCreateInfo.pMultisampleState is a pointer to a
VkPipelineMultisampleStateCreateInfo structure, and is ignored if the
pipeline has rasterization disabled.
Fixes a crash in one CTS tests that checks this.
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11601>
Rework:
* Jordan: Handle per_thread_scratch==0 in anv_scratch_pool_get_surf
* Jordan: Update subslices in anv_scratch_pool_alloc
* Jason: Clean up the patch a bit
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11582>
v2 (Jordan Justin):
- add anv_gem_stubs.c impl
v3 (Jason Ekstrand):
- Use the upstream uAPI
- Rework the interface a bit
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>
Create additional memory type with DEVICE_LOCAL_BIT if we have local
memory region aviable.
v2 (Jason Ekstrand):
- Don't leak mem_regions if the second ioctl fails
Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5599>
This gets rid of the awkward interface for isl_surf_get_ccs_surf where
we passed it two aux surfaces and it was supposed to fill out the second
one based on whether or not the first one already had stuff in it.
Instead, we now pass it three well-labled surfaces: surf,
hiz_or_mcs_surf, and ccs_surf which have obvious meanings. This does
mean that iris has to carry a bit of logic and we have to flip
parameters around in all the callers. But the resulting interface is
much cleaner.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11479>
Whether or not a surface supports CCS on Tigerlake and later is
dependent not only on the main surface but also on the MCS or HiZ
surface, if any. We were doing some of these checks in
isl_get_ccs_surf based on the extra_aux parameter but not as many as we
probably should. In particular, we were really only checking HiZ
conditions and nothing for MCS. It also meant that, in spite of the
symmetry in names, the checks in isl_surf_get_ccs_surf were more
complete than in isl_surf_supports_ccs.
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11479>
Initial implementation missed various fields that derive from the
primitive topology. This patch fixes 3DSTATE_RASTER/3DSTATE_SF,
3DSTATE_CLIP and 3DSTATE_WM (gen7.x) emission in the dynamic case.
Fixes: f6fa4a8000 ("anv: add support for dynamic primitive topology change")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4924
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11379>
Validation layers should warn you about this
(VUID-VkBindBufferMemoryInfo-size-01037) but this would be useful for
zink debugging.
Requested by Zmike.
v2: Also check memoryOffset (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11542>
v2: Turn a bunch of pointer checks into checks against NULL (Jason)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
This allows us to convert a 64-bit address to an anv_address which is
useful for working with device addresses.
v2: switch to int64_t to keep state pool relative relocation working
on non-softpin platforms
v3: Update assert to reflect relative offsets (Jason)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
This is required in order to be able to use GenXML pack functions for
structs with addresses when you're not packing into a batch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
These don't necessarily go in any group but are required for dispatch to
work properly. The trampoline is a compute shader that is the initial
start point for the trace. It's in charge of invoking the actual
ray-gen shader. The trivial return shader is used whenever another
shader is missing and it does no work except the minimum required to do
a stack return.
v2: Rebase on upstream changes (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
This doesn't look too different from other compile functions we have in
anv_pipeline.c. The primary difference is that ray-tracing pipelines
have this weird two-stage thing where you have "stages" which are
individual shaders and "groups" which are sort of mini pipelines that
are used to handle hits. For any given ray intersection, only the hit
and intersection shaders from the same group get used together. You
can't have an intersection shader from group A used with an any-hit from
group B. This results in a weird two-step compile.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
Bindless shaders don't have binding tables so they have to get at the
descriptor sets via a different mechanism.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
They don't have binding tables so they have to use A64 descriptor set
access and everything has to be bindless all the time.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
They're common between the two drivers and we want to add a couple more
that get emitted from code in src/intel/compiler.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
This just adds the core data structure which we'll build on going
forward.
v2: Add VK_EXT_pipeline_creation_cache_control handling (Lionel)
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
This makes a bunch of loops use ARRAY_SIZE instead of MESA_SHADER_STAGES,
extends a few arrays, and adds a bunch of array length asserts.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
This just adds a base struct and trivial implementations of all the
create/destroy/bind functions.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8637>
This makes dEQP-VK.api.version_check.entry_points pass and matches how
other drivers are handling this case. We do not support the feature but
still need to provide a dummy entrypoint.
v2: throw error if/when called (Jason)
Fixes: 0d031d1da3 ("anv: toggle on VK_EXT_extended_dynamic_state2")
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11503>
For debug on Android, it's useful to be able to print shaders to the
android log interface, since you don't usually have stdout/stderr.
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9262>
This patch only enables the below VkFormat:
- VK_FORMAT_G8_B8R8_2PLANE_420_UNORM
This patch ensures the proper behavior of the below APIs:
- vkGetPhysicalDeviceFormatProperties2
- vkGetPhysicalDeviceImageFormatProperties2
- vkCreateImage
- vkGetImageSubresourceLayout
- vkGetImageDrmFormatModifierPropertiesEXT
- vkGetImageMemoryRequirements
- vkGetImageMemoryRequirements2
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chad Versace <chad@kiwitree.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11281>
Add initial multi-planar format support on the images with modifiers:
- With aux usage,
- Format plane count must be 1.
- Memory plane count must be 2.
- Without aux usage,
- Each format plane must map to a distinct memory plane.
For the other cases, currently there is no way to properly map memory
planes to format planes and aux planes due to the lack of defined ABI
for external multi-planar images.
This patch doesn't include some potentially supported cases like all
format planes mapping to a single memory plane, additional refactoring
is needed to workaround explicit base offset + ANV_OFFSET_IMPLICIT.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Chad Versace <chad@kiwitree.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11281>
Commit 24342e499b changed how primitive
topology is handled on Gfx8+ but missed updating the Gfx7.x code.
As a result, tests which previously used topologies like PATCHLIST_3
instead started using bogus ones like LINESTRIP_ADJ. This caused a
GPU hangs in a bunch of Vulkan conformance tests involving tessellation.
This fixes those hangs.
Fixes: 24342e499b ("anv: fix dynamic primitive topology for tess")
Reviewed-by: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11434>
This has two steps. First, for each range we look at the memory object
and see if it actually needs flushing before we start throwing CLFLUSH
instructions. Second, we look at the whole list of types on device
initialization and decide whether or not we need CLFLUSH at all. The
first part should speed up atom chips a bit since we're currently
CLFLUSHing everything even when we don't need to. The second isn't
needed on most of today's parts because we base it on !has_llc but it is
needed for discrete parts. It's also over-all cleaner.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Tested-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11364>
HDC Pipeline Flush is the correct method for flushing HDC
pipeline on Gfx12+ HW. Continue using DC Flush for earlier HW.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9834>
Tile Cache flush flushes all Color/Depth values from L3 cache
to memory in Unified Cache mode. This is only required when
CPU access is required.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9834>
On Gfx12+, flushing tile cache ensures color/depth values are
globally visible, but that's expensive. Most operations only
need values to be GT-visible which can be achieved with depth
or rt flush. Remove a bunch of unnecessary Tile Cache flushes.
Fast clears and slow depth clears still require Tile Cache flush.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9834>
This should drop the CPU overhead of processing buffers on SKL+ by
dropping some of the logic contained in anv_reloc_list_add() whenever we
have enough compile-time information to know we have softpin.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11236>
The relocation list currently serves two purposes. One is for
relocations on older non-softpin platforms. The second is to keep track
of driver-managed BOs which are used by the given command buffer. We
going to need a mechanism to add BOs to the command buffer without doing
a relocation into the batch.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11236>
Whenever we have the GFX_VERx10 macro available, we can make use_softpin
a compile-time thing for everything but Broadwell and Cherryview. This
should save us some CPU cycles especially on SKL+.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11236>
Softpin was added to i915 in
commit 506a8e87d8d2746b9e9d2433503fe237c54e4750
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Tue Dec 8 11:55:07 2015 +0000
drm/i915: Add soft-pinning API for execbuffer
which was included in Linux 4.5. It's been over 5 years so it's
probably reasonable to make it a hard requirement.
Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11236>
Mesh and Task shaders can use workgroup memory, so generalize its
handling in anv by moving it from anv_pipeline_compile_cs() to
anv_pipeline_lower_nir().
Update Pipeline Statistics accordingly.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11230>
Move it out the "cs" sub-struct, since the bit will be used for other
shader stages in the future.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11225>
Be consistent with other usages in Vulkan and SPIR-V, and the recently
added workgroup_size field.
Acked-by: Emma Anholt <emma@anholt.net>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Acked-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11190>
This avoids multiple copies as we will need this in multiple places.
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10366>
A successful AHardwareBuffer_allocate itself will increase a refcount on
the newly allocated AHB. For the import case, the implementation must
acquire a reference on the AHB. So if we layer the exportable allocation
on top of AHB allocation and AHB import, we must release an AHB
reference to avoid leak.
Signed-off-by: Yiwei Zhang <zzyiwei@chromium.org>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10940>
The vec4 back-end can't push UBOs just yet but it soon will be able.
When it starts pushing UBOs, it will have a lower limit than scalar due
to a crummy register allocator. Mirror that limit in ANV so we don't
run into asserts due to ANV and the back-end making different choices.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10571>
We've only enabled the extension on Gfx11+ so no need to care about
prior values.
Also fixup values of (min|max)FragmentShadingRateAttachmentTexelSize.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 231651fd89 ("anv: implement VK_KHR_fragment_shading_rate")
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10607>
A bunch of performance counters rely on register snapshots on top of
the OA reports. Those are already conditional to the query mode in the
equations :
availability="true $QueryMode &&"
This change allows to disable counters that are only available with
additional register snapshots. This will be useful if you only want to
OA reports to extract performance counter values.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Acked-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10216>
This fixes some new cts tests that exercise blitting
between compressed and uncompressed formats.
Cc: mesa-stable
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10830>
We've only considered the perf query pool change previously. But we
also need to pay attention to the pass index.
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: 0a7224f3ff ("anv: group as many command buffers into a single execbuf")
Reviewed-by: Felix DeGrood <felix.j.degrood@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10301>
In particular, this gives us B8G8R8A8_UNORM storage support which is
useful for writing WSI images from compute shaders. These formats can
only be accessed in a spec-compliant way by decorating the variable
NonReadable in the SPIR-V (writeonly in GLSL). If the client doesn't so
decorate the variable, it'll get the null surface state where reads
return 0 and writes are ignored.
Tested-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10624>
Move it out of the "cs" sub-struct, since the bit can be used for
other shader stages in the future.
This also removes a subtle issue in spirv_to_nir:
info.cs.shared_memory_explicit_layout was used without checking for
the CS shader stage. It ended up being "harmless" since the effects
also depended on presence of shared variables.
Fixes: 5de6c5973a ("spirv: Implement SPV_KHR_workgroup_memory_explicit_layout")
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10529>
And since right_mask is already provided as part of dispatch_info,
just use that instead of storing it.
Reviewed-by: Marcin Ślusarz <marcin.slusarz@intel.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10504>
Available on Gen11+.
v2: Order shading rate in correct order (Samuel)
v3: Move CPS_STATE emission to genX_state.c
v4: Don't override various output structures (Jason)
v5: Rebase on top master (Lionel)
v6: Fix invalid VkPhysicalDeviceFragmentShadingRatePropertiesKHR
(min|max)FragmentShadingRateAttachmentTexelSize values (Ken)
Drop #endif comment
v7: Limit extension to Gfx11+ (Lionel)
Support conservative raster (Lionel)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7455>
This makes the vertex order of TRISTRIP and TRISTRIP_ADJ primitves
consistent between XFB output and GS input. Technically, the Vulkan
spec allows us to XFB out in whatever order we want but being consistent
with GS inputs is probably nicer to apps.
Fixes: 36ee2fd61c "anv: Implement the basic form of VK_EXT_transform_feedback"
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10460>
Namely we want to be able to emit the following dynamically :
* On Gfx 7/7.5 : 3DSTATE_VM, 3DSTATE_BLEND_STATE_POINTERS
* On Gfx 8+ : 3DSTATE_VM, 3DSTATE_BLEND_STATE_POINTERS,
3DSTATE_PS_BLEND
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10206>