This patch renames all macros with "GEN_" prefix defined in
common code.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
This patch renames functions, structures, enums etc. with "gen_"
prefix defined in common code.
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
Changes in this patch include:
- Rename all files in src/intel/common path
- Update the filenames used in source and build files
Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com>
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9413>
- no 3D and cube textures
- no mipmapping
- no border color
- image_sample is the only supported opcode with a sampler (behaves like _lz)
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>
Signed-off-by: James Zhu <James.Zhu@amd.com>
Reviewed-by: Leo Liu <leo.liu@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9389>
Otherwise, we won't be able to use OPC_GETBUF to get their size.
After this change we also could get rid of the hack for OPC_GETSIZE
which scaled the size for texture buffers.
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9391>
Mesa fixed pipeline texture loading on programmable pipeline hardware emits
a generic fragment shader program which contains gl_TexCoord.xyzw as a vec4
and then expects to configure the varying assignments to the shader in the
pipeline command stream, to select what is wired to the XYZW fragment shader
inputs.
This gl_TexCoord.xyzw is turned into texture load with projection (TGSI TXP
opcode, similar for NIR). Texture load with projection does not exist in the
Vivante GPU as a dedicated opcode and is emulated. The shader program first
divides texture coordinates XYZ by projector W and then applies regular TEX
opcode to load the texture (i.e. TEX(gl_TexCoord.xyzw/gl_TexCoord.wwww)).
For point sprites, XY are the point coordinates from VS, Z=0 and W=1, always.
The Vivante GPU can only configure varying to be either of -- point coord X,
point coord Y, used, unused -- which covers XYZ, but not W. Z is fine because
unused means 0.
W used to be 0 too before this patch and that led to division by 0 in shader.
The only known way to solve this is to set Z=0, W=1 in the shader program
itself if the point sprites are enabled. This means we have to generate a
special shader variant which does extra SET to set the W=1 in case the point
sprites are enabled.
In case of TGSI, emitting the SET.TRUE opcode permits setting W=1 without
allocating additional constants. With NIR, use nir_lower_texcoord_replace()
to lower TEXn to PNTC, which sets Z=0, W=1, and let NIR optimize the shader.
Note that nir_lower_texcoord_replace() must be called before input linking
is set up, as it might add new FS input.
Also note that it should be possible to simply drop PIPE_CAP_POINT_SPRITE
in the long run, ST would then apply the same optimization pass, but that
option is so far misbehaving. And for etnaviv TGSI this is not applicable
yet.
This fixes neverball point sprites (exit cylinder stars) and eglretrace of
gl4es pointsprite test:
https://github.com/ptitSeb/gl4es/blob/master/traces/pointsprite.tgz
Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8618>
the maximum allowable runtime version of vk can be computed by MIN(instance_version, device_version)
despite this, instances and devices can be created using the maximum version available
for each respective type. the restriction is applied only at the point of
enabling/applying features and extensions, meaning that to correctly handle this,
zink must:
1. create an instance using the maximum allowable version
2. select a physical device using the instance
3. compute MIN(instance_version, device_version)
4. only now begin to enable/use features requiring vk 1.1+
ref #4392
Reviewed-by: Adam Jackson <ajax@redhat.com>
Acked-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9479>
With this command implemented messages emitted by
applications via glDebugMessageInsert will be forwarded
to the host.
v2: - remove check for feature in encode function, this
is covered in the state tracker (Rohan)
- reorder parameters in the encode function to the
order of the emit callback
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Reviewed-by: Rohan Garg <rohan.garg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9433>
nir_lower_idiv(..) creates during its lowering isub instructions.
Move nir_lower_idiv(..) before the opt loop to have a chance to
optimize/lower isub away. Also drop the drop the halti dependency to
make it easier to follow.
This fixes the following assert on GC3000:
Unhandled ALU op: isub
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9447>
the full-variable outputs can be skipped, leaving only the varyings which
actually need explicit emission due to packed layouts or whatever
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
if the number of explicit xfb outputs or new varyings added to the existing size
of the slot map would cause an overflow, we have to force a new slot map to
ensure that everything fits
this means iterating all the stages which can produce new varyings and calculating
all the slots required in order to compare against the max size available
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
if an entire variable is being dumped into an xfb buffer, there's no need
to create an explicit xfb variable to copy the value into, and instead
the xfb attributes can just be set normally on the variable
this doesn't work for geometry shaders because outputs are per-vertex
fixes all KHR-GL46.enhanced_layouts xfb tests
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
running nir_lower_io_arrays_to_elements_no_indirects for only some stages
breaks location-setting for the stages which don't run it when
e.g., dmat2x3 variables are sometimes split across locations and
sometimes jammed into a single location (TCS I'm looking at you)
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
if we get some crazy matrix types in here then we need to ensure that
we accurately unwrap them and copy the components
fixes KHR-GL46.enhanced_layouts.xfb_stride
Fixes: 1b130c42b8 ("zink: implement streamout and xfb handling in ntv")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
this was added during review, but it was never correct and just crashes
valid cases like streamout from a mat3x4 type
Fixes: b6f8f3a3ba ("zink: fix streamout for clipdistance")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9271>
This replaces the new_src parameter of nir_ssa_def_rewrite_uses_after()
with an SSA def, and rewrites all the users as needed.
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
This commit replaces the new_src parameter of nir_ssa_def_rewrite_uses()
with an SSA def, removes nir_ssa_def_rewrite_uses_ssa(), and rewrites
all the users as needed.
Reviewed-by: Rhys Perry <pendingchaos02@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa@collabora.com>
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9383>
While we're here, add __gen_get_batch_address declarations to more files
because we're about to start requiring it on all GFX 12.5+.
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9445>
We should only pass in a new indirect_info object if we actually set valid
values in it.
Fixes: abe8ef862f "gallium: make pipe_draw_indirect_info * a draw_vbo parameter"
Signed-off-by: Karol Herbst <kherbst@redhat.com>
Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9425>
Use debug_printf more consistently, normalize formatting a bit, and
trace a few more places you're likely to care about.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9436>
With these changes the shader code is visible in RGP.
Vk pipeline feature is emulated using si_update_shaders: when shaders are
updated we compute a sha1 of their code and use it as a pipeline hash.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9277>
Use atomic operations to avoid competition. In addition,
since sub_ctx_id 0 has been used by default, sub_ctx_id
should start from 1.
Signed-off-by: Xin He <hexin.op@bytedance.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9406>
This implements most of the remaining u_threaded_context support. Most
of the heavy lifting was done in the previous patches which fixed things
up for the new thread safety requirements. Only a few things remain.
u_threaded_context support can be disabled via an environment variable:
GALLIUM_THREAD=0
On Felix's Tigerlake with the GPU at fixed frequency, enabling
u_threaded_context improves performance of several games:
- Civilization VI: +17%
- Shadow of Mordor: +6%
- Bioshock Infinite +6%
- Xonotic: +6%
Various microbenchmarks improve substantially as well:
- GfxBench5 gl_driver2: +58%
- SynMark2 OglBatch6: +54%
- Piglit drawoverhead: +25%
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
pipe->transfer_map can be called from u_threaded_context's thread
rather than the driver thread. We need to use two different slab
allocators, one for each thread. transfer_unmap, on the other hand,
is only ever called from the driver thread.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
u_threaded_context requires various objects to inherit from a new
threaded_foo base class rather than directly from pipe_foo. This
patch does most of the mechanical changes required for that.
It also initializes the new threaded_resource fields.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
When we enable u_threaded_context, the pipe->create_*_state hooks
(precompile variants) are going to be called from one thread, while
iris_update_compiled_shaders (on-the-fly variants) are going to be
called from a driver thread. BLORP shaders also happen from
clear, blit, and so on in the driver thread.
u_upload_mgr isn't thread-safe, so use an uploader for each purpose.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
This enables us to replace the backing storage of resources that have
been used as stream output targets, in case we're invalidating their
entire contents. This can avoid stalls. We simply hadn't supported it
because it was going to be tricky to re-emit 3DSTATE_SO_BUFFER without
screwing up "reset offset to zero" vs. "keep appending". But that
should be working fine with the previous patch's refactor.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
The previous mechanism was a bit fragile. We stored the zero offset
in the pre-baked packet, and used an flag to override 0xFFFFFFFF
(append) offsets until our first emit - then prohibited anyone from
trying to re-emit the packet by flagging IRIS_DIRTY_SO_BUFFERS,
because that would re-emit the version with the zeroing of the offset.
Now, we always store 0xFFFFFFFF in the pre-baked packet, and use a
flag to override it to zero on the first emit. That way, we can
re-emit that packet at any time, and it'll just keep appending.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
In the future, Marek is planning to make u_threaded_context call
create_stream_output_target() from a different thread than the main
driver thread, which means that we can't safely use uploaders there.
To prepare for this eventual future, just defer the allocation of
the offset BO 'til later. It's a very small amount of overhead.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
With u_threaded_context, create_surface and create_sampler_view will
be called from a different thread than the driver thread. They aren't
allowed to access the context, which means that they can't use the
uploaders there to upload our SURFACE_STATE entries.
Thanks to backing-storage replacement and iris_rebind_buffer, we already
reworked things to maintain CPU-side copies of the SURFACE_STATE entries
and added the ability to upload or re-upload them later. So we can skip
the upload at object creation time, and add a simple resource-is-NULL
check at binding table upload time to ensure that they get uploaded by
the time we need them. (They might get uploaded earlier due to rebinds
or clear color updates, but this is the last moment to do so.)
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8964>
It shouldn't affect bound program state, and the current context state
shouldn't be relevant for shader creation precompiles anyway (level load
isn't going to have the eventual set of sampler views bound when you go to
draw with that shader).
Closes: #4306
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>
You always have to populate the key with the right texture swizzles, even
if textures haven't changed since binding a new shader.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>
We can compose the swizzles at sampler view creation time, saving
recompiles on texture format changes.
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9089>
This allows for virgl guests to expose GL_NVX_gpu_memory_info and
GL_ATI_meminfo when the extensions are supported on the host.
Signed-off-by: Rohan Garg <rohan.garg@collabora.com>
Reviewed-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9337>
mi_ is already a unique prefix in Mesa so the gen_ isn't really gaining
us anything except extra characters. It's possible that MI_ may
conflict a tiny bit with GenXML but it doesn't seem to be a problem
today and we can deal with that in the future if it's ever an issue.
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9393>
With threaded-context we won't have a chance to apply the workaround in
the backend driver. But the previous commit moves it to a driconf
configured workaround in mesa/st, so we can drop this now.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9316>
All the ir3 gens had the same thing, time to move it out into a shared
helper.
The keeping the storage in fdN_context is to avoid namespace clashes
between ir3 and ir2.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9394>
Available since Vulkan 1.0, and in fact already wired up, just not
advertised. It looks like we could make this dynamic state but this
works for now.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9371>
Some games declare the wrong format, so we might want to disable this
optimization in that case.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Fixes: e4d75c22 ("nir/opt_shrink_vectors: shrink image stores using the format")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9229>
DROPPED_CNTR isn't reliable and might still report non-zero if the
SQTT buffer isn't full. Checking if the number of written bytes by
the hw is equal to the SQTT buffer size seems reliable.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9367>
both bits will have been flagged at this point in order to indicate
that the aspects will be cleared "at some point" during the loop, but
when actually iterating through the pending clears, only the bits set
in the clear call should be applied
Fixes: 5c629e9ff2 ("zink: defer pipe_context::clear calls when not currently in a renderpass")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9366>
The code improvement is limited and it interferes with using literals
directly in LDS index ops, since here source modifiers are not
supported, but the current assembler code might inject the modifiers.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>
The backend code was actually assuming this, but the lowering still set
the components and write masks like it would be honoured.
Signed-off-by: Gert Wollny <gert.wollny@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9330>
EXT_external_objects require we call glSignalSemaphoreEXT followed by a
glFlush. If the rendering workload is small when Signal and Flush
take place the relevant batch buffers with the actual rendering might
have been submitted already. In that case the following condition is met:
(iris_batch_bytes_used(batch) == 0). This causes:
glFlush() --> iris_fence_flush() -> iris_batch_flush() ->
_iris_batch_flush() to no-op, and so the fence doesn't get submitted to the
kernel. Then when anv tries to submit an execuf2 that must wait on the
shared VkSempahore / drm_syncobj fence, there isn't one and the kernel
rejects the batchbuffer causing an -EINVAL return of the execbuf2 ioctl
and a VK_DEVICE_LOST error. Empty batch buffers do have typically one
fence attached, but the ones carrying the extra fence from a
glSignalSempahore() call do have at least 2.
See also: the discussion in MR!4337.
v2: Changed the batch struct to have a contains_fence_signal variable
that is set to true when i915_EXEC_FENCE_SIGNAL fence is added to the
batch and off when batch is reset (Tapani Pälli)
Authored-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Reported-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Tested-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Signed-off-by: Eleni Maria Stea <elene.mst@gmail.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Reviewed-by: Mario Kleiner <mario.kleiner.de@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8861>
If zink is meant to work against Vulkan 1.0 API then it
should expose the 1.0 API as create time as well as always
ask for all the vulkan 1.0 extensions.
Reviewed-by: Hoe Hao Cheng <haochengho12907@gmail.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9075>
RuneScape ends up hitting this path, and it's easy enough to get
some well-defined behavior instead of a crash.
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-By: Bill Kristiansen <billkris@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9342>
A sequence like:
1) create sampler view CSO with UBWC resource
2) later create another sampler view or image view with the same
resource, but a format that triggers demoting the resource to
uncompressed
3) bind CSO created in step #1
would not work correctly, because the CSO created in step #1 is still
setup for UBWC, despite the fact that the resource had been demoted to
uncompressed.
Fortunately this is easy enough to detect, as the resource's seqno would
have been updated.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9321>
this is only tracking memory used by resources referenced in the batch, but it
can be adjusted a bit if we see that we're flushing too often
fixes spec@!opengl 1.1@streaming-texture-leak hogging all system memory and ooming
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9274>
we want to be able to track this so we can check whether a given batch is
going wild with memory usage for resources that might be pending free once
the batch finishes
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9274>
The number of shader engines isn't always 4.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9307>
When format modifiers are supported, use
resource_create_with_modifiers instead of resource_create. This
allows radeonsi to set the modifier field, and allows VA-API
clients to have a proper modifier instead of
DRM_FORMAT_MOD_INVALID.
Signed-off-by: Simon Ser <contact@emersion.fr>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9308>
We have to reserve at lease 16 program parameters in storage to
avoid its reallocation.
v2: move allocation to `st_deserialise_ir_program` and add helper for that
( Eric Anholt <eric@anholt.net> )
v3 amend comments a bit
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4352
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Signed-off-by: Andrii Simiklit <andrii.simiklit@globallogic.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9282>
Test a full transformation path (load_uniform -> load_ubo -> load_uniform)
and validate the load_uniform offset.
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9305>
The restoring of the acutal uniform offset was wrong.
Fixes: 1837135f7c ("etnaviv: nir: add ubo lowering pass")
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Acked-by: Lucas Stach <l.stach@pengutronix.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9305>
This is a little bit contorted because the Z storage for the tile is
either float or int depending on the Z format, so we have to be careful
about types when comparing.
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9287>
In release builds, there should be no change, but in debug builds the
assert will help us catch undefined behavior resulting from using
util_cpu_caps before it is initialized.
With fix for u_half_test for MSVC from Jesse Natalie squashed in.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9266>
this ends up being a tradeoff where we waste a little startup time and
an extra ~4k memory for the overall screen object in exchange for never having
to fetch format properties again, which is a surprisingly expensive call
to be making as much as we have to make it
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9293>
Allow ctx to be NULL in perf_debug_ctx() and make perf_debug() a
shortcut for perf_debug_ctx(NULL, ...) to simplify things slightly
in the next patch.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9264>
This was meant to be an || rather than &&, although it didn't matter for
shaderdb because both conditions would be true. But it did matter if
you were trying to force synchronous compile to avoid having nir/ir3
prints interleaved from multiple threads.
While at it, add a more specific debug flag to force initial variant
compile to be synchronous, because at some point the 'shaderdb' flag
itself will not force this.
Fixes: 75b0c4b5e1 ("freedreno/ir3: Async shader compile")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9264>
We'd like to use one Mesa build environment which builds our CL compiler
stack (which needs Clang/LLVM) and which builds our GL driver. The GL
driver doesn't really need LLVM support, and since we're statically
linking LLVM, removing it from the driver drastically reduces our DLL
size on disk.
Acked-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9259>
Need to import client buffer to display drm device, otherwise
get following xserver error log:
[ 190.982] (WW) modeset(0): Page flip failed: No such file or directory
[ 190.982] (EE) modeset(0): present flip failed
With this fix, full screen x11 client can display its window
buffer directly without a copy. Tested on Allwinner H3, 1080p
full screen glxgears go from 163FPS to 173FPS.
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Reviewed-by: Erico Nunes <nunes.erico@gmail.com>
Signed-off-by: Zan Dobersek <zdobersek@igalia.com>
Signed-off-by: Qiang Yu <yuq825@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9038>
we're going to want to make sure all other resources have been handled
at this point so that we can make some better decisions in this block
based on descriptor usage
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9273>
Fix defect reported by Coverity Scan.
Structurally dead code (UNREACHABLE)
unreachable: This code cannot be reached: return progress;
Fixes: d550c5780f ("zink: use nir_shader_instructions_pass for draw params pass")
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9182>
Fix defects reported by Coverity Scan.
uninit_member: Non-static class member prog is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member insn is not initialized in this constructor nor in any functions that it calls.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9183>
these flushes queue the fence to be submitted on the next finish call, so
we have to store the context to the fence to be able to handle threaded calls
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9243>
no need to submit command buffers to the queue which have no commands, so we
can just proceed to give back an inactive fence object that always returns success
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9243>
Extract batch destruction in its own function, then make sure to
release objects.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9243>
previously we would be creating a new fence per batch for every submit,
but we can just reset and reuse them if we're careful with it
based on original patch by Antonio Caggiano
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9243>
This is pretty much derived from what works and passes
both GL via zink and VK-GL-CTS test suites.
The zink multisample regressions tests rendered black, so
"passed" before, now they render more junk.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9260>
The GL frontend can do it for us now, so just use their code instead of
our own shader variants. In the past we had to do hide the GL shader
variants in the driver to get precompiles from st, but no longer as of
!8601.
I tested with drawoverhead -test 6 (shader program change, n=30) and -test
1 (no statechanges, n=43) and saw no change in driver overhead.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8997>
Every caller of associate_uniform_storage was doing this to safety-check
that the uniform storage didn't get reallocated, except for
st_deserialise_ir_program(). This ended up leaving an opening for
use-after-free without hitting the assert in the hot-cache case (and I
found it on freedreno). Having added it, it also reveals use-after-frees
in the drawpixels shader variant cases on llvmpipe and zink.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8997>
Since the job is manual, I missed it in the move and it got dropped from
the artifacts.
Fixes: 60d413b894 ("ci: Move the piglit expectations lists to the per-driver CI dirs.")
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9226>
It's just a ton of fuss for driver developers fixing piglit tests. This
makes the trace expectation files pretty silly (empty expectation, but
you'll get a diff to a non-empty result when something fails)
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Adam Jackson <ajax@redhat.com>
Acked-by: Andres Gomez <agomez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9226>
On Gen8, updating the clear color will end up allocating new
SURFACE_STATE entries. These might end up living in a different BO
than the original copies, which means that we have to pin _after_
updating the clear color, not before.
Found by inspection.
Reviewed-by: Zoltán Böszörményi <zboszor@gmail.com>
Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9257>
this requires setting up a spec constant on the pipeline state which can
then propagate to the shader and be used like a regular constant
all ARB_compute_variable_group_size should pass now
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9242>
the hardware supports it, the driver supports it, but the driver reports
a lower value due to subtracting some usage that we shouldn't exceed anyway
Acked-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9198>
non-intel platforms need border colors pre-swizzled
this is an internal khronos spec bug that will (someday) be resolved in
a more detectable manner
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9136>
PIPE_SHADER_CAP_MAX_CONST_BUFFER_SIZE translate into
GL_MAX_*_UNIFORM_COMPONENTS, all of which are allowed to be as
low as 1024 by the GL 4.6 spec.
PIPE_CAP_MAX_SHADER_BUFFER_SIZE translate into
GL_MAX_SHADER_STORAGE_BLOCK_SIZE, which has different minimum values in
different versions of the GL spec. In the GL 4.6 spec for instance, it
is required to be 2^27, the same as what Vulkan requires.
But what these limits are in GL is irrelevant at this level of
abstraction. The OpenGL state-tracker cares, but the Gallium driver
shouldn't have to. So let's just delete those parts of the comments.
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9216>
This allows to split a piglit job in several parallel jobs, to speed up
the execution.
Due piglit restrictions, this only works for single profiles. Otherwise
an error will be shown in the runner.
Also, a new gitlab job variable `PIGLIT_TESTS` is introduced that
contains the excluded/included tests with `-x` or `-n`. The rest of the
piglit options go to `PIGLIT_OPTIONS` (like `--timeout n`).
v2 (Andres):
- Replay profile is supported in parallel jobs.
- Bail out inmediately if parallel jobs is tried with multiple
profiles.
- Use testlist only when doing parallel jobs.
- Do not drop pass tests when filtering executed tests.
- Get rid of PIGLIT_FRACTION.
v4:
- uncommit unrelated change (Andres).
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9022>
Straightforward by using the pixel hashing table computation helper
previously introduced, assuming we know the fraction of work that
needs to be submitted to each pixel pipe. Note that AFAIA the
hardware maps indices in the table to pixel pipes from largest to
smallest, so it shouldn't be necessary to permute indices based on the
physical IDs of the pixel pipes as we are doing on Gen11.
Improves performance of most non-trivial graphics workloads I've tried
on an 80 EU TGL. E.g. the following testcases improve performance
significantly with sample size 27 and statistical significance 1%:
gputest/pixmark_piano: 62.89% ±0.10%
gputest/pixmark_volplosion: 61.51% ±0.06%
unigine/valley: 26.72% ±0.25%
gfxbench/gl_5_high: 24.70% ±0.19%
unigine/heaven: 23.54% ±0.17%
steam/csgo: 22.75% ±4.36%
gfxbench/gl_manhattan31: 22.43% ±0.29%
gfxbench/gl_4: 20.92% ±0.35%
warsow/benchsow: 19.15% ±2.53%
gfxbench/gl_trex_off: 18.84% ±0.27%
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
Pixel hashing tables are a pain to type in, review and maintain IMHO.
In order to obtain satisfactory load balancing on all Gen12 parts
currently in production this series would need to add 5 different
additional tables. Instead this introduces a simple algorithm able to
calculate a table on the fly based on a handful of parameters.
Note that the Gen11 tables generated with this algorithm are not
identical to the hardcoded ones, however the only difference should be
a phase shift that isn't expected to have any effect on performance,
since it shouldn't change the fraction of work submitted to each pixel
pipe.
The CPU overhead from this change is negligible since the tables only
need to be programmed once at context init time.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8749>
Enable pipe capability of exporting stencil from shader when Vulkan
extension is available.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9244>
Enable SPV_EXT_stencil_export and SpvCapabilityStencilExportEXT and
mark output with FragStencilRefEXT when fragment shader writes to
reference stencil value.
Signed-off-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9244>
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Antonio Caggiano <antonio.caggiano@collabora.com>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9244>
this attempts to dynamically establish an upper bound for per-batch descriptor
use, flushing all batches and resetting the pools on alloc failure in
an attempt to be more robust about it
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9117>
previously if any of the pending clears required an explicit clear then
we'd clear them explicitly, but with this patch we're shifting the first
pending clear into the renderpass begin if possible and then applying the
remaining clears on top of that in order to reduce gpu operations
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9206>
we have src regions for all the blit/copy/map calls, so we can use those to
verify whether we actually need to apply the clears now or if we can keep
sitting on them a while longer
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9206>
if we know we're going to be reading from a region then we can examine the
pending clears to see if there's any overlap, which helps us decide whether
we need to apply them immediately
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9206>
this just requires adding an explicit flush for the deferred clears in
case conditional rendering is disabled before a draw happens
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9206>
instead, we can attach the clear to the next renderpass start and even add it to
the renderpass cache for reuse
also add handling for flushing clears on map or fb switching to avoid brekaing behavior
this should save us a lot of time with potentially beginning/ending renderpasses as well
as allowing drivers to do better batching of clears by passing in all the buffers at
once
this doesn't handle deferring conditional renders yet in a futile attempt to try and keep
the size of the patch down
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9206>
Dead-cells (and perhaps others) does MapBufferRange(UNSYNC|DISCARD_RANGE)
to update single buffered VBOs every frame in the opening menu screen,
and because we were considering UNSYNC with higher priority than
DISCARD_RANGE, this would result in the game racing VBO updates between
binning and tile passes, causing visibility stream inconsistency between
the two passes, resulting in "tile flicker".
The letter of the gl spec implies this is undefined behavior (at least
to my reading of it). But we already hand DISCARD_RANGE in the !UNSYNC
case, so just go down this path instead. It means we could potentially
end up invalidating (and back-blit) in cases where the app really knows
what it is doing, but oh well.
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4337
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9223>
We have the software decode hooked up now, and it's not like this
decompression is more expensive than a lot of other software decompression
we have. One less feature to be missing from the old swrast classic driver.
Reviewed-by: Adam jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9194>
dEQP in CI was hitting these, and debug_printf is not enabled on non-debug
(such as debugoptimized or release) builds. Besides, mesa_loge() gets you
logging on Android, should someone ever do zink for that.
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8891>
when discarding the whole resource on an unused resource, we can deinit the buffer
range here
in the future, ideally we should be doing something like creating a new vk buffer/image
entirely here and demoting the existing one to a queue that destroys/caches it when
the batch finishes in order to avoid fencing
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9146>
we don't actually have to stall here, we just have to make sure the cs batch
is submitted before the subsequent buffer copy command goes into a gfx
batch in order to preserve the ordering
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9146>
if we need to sync a resource for read-only mapping, we only need to wait on
the fence that was last flagged as having writes to the resource, not batches
that may have reads, as reads don't affect memory coherency
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9146>
this now takes over from previous call sites in zink_transfer_unmap
we add a flush here if we had pending usage to ensure that the resource
gets properly synced
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9146>
this simplifies the barrier helper callsites as we no longer need
to check whether a barrier is needed beforehand in order to avoid potentially
ending a renderpass too early
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9205>
This means less custom test-source-dep stuff for these drivers, though it
means that touching the CI expects files will cause a bit more retesting:
- broadcom drivers retest as a group (but Igalia requested that
organization of CI files)
- radv+radeonsi retest as a group
- lvp+llvmpipe retest as a group
Acked-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9161>
The code was using the core versions, but for vulkan 1.0 at
least the extension versions are what is required to be asked for.
This fixes zink loading on lavapipe, and blows up the CI.
v2: Use LOCAL device extension variant (zmike)
Reviewed-by: Eric Anholt <eric@anholt.net>
Reviewed-By: Mike Blumenkrantz <michael.blumenkrantz@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9185>
We don't support it yet.
Fixes: 61d3ae6e0b ("panfrost: Initial stub for Panfrost driver")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9164>
dEQP-GLES31.functional.texture.multisample.samples_16.sample_position is
failing on Bifrost, and we already weren't advertising on Midgard. Until
someone gets spare cycles to debug all the problems, keep it hidden
behind a debug flag to avoid introducing bugs.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9164>
The blob advertises 64 for this, so let's use the same value. Small
alignments are observed to raise an IMPRECISE_FAULT at least on Bifrost.
As a bonus this forces cache line alignment which will help perf. Fixes
dEQP-GLES31.functional.texture.texture_buffer.render.as_vertex_texture.offset_1_alignments
Fixes: 5f7bafa316 ("panfrost: Enable ARB_texture_buffer_object")
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9164>
Auditing nr_cbufs. Note we have to augment the blending logic a bit to
use the same 'no blend' case for missing RTs as we do for depth-only
passes.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9164>
Already checked in the callee, no need to check in the caller.
Signed-off-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9164>
these caps correspond to descriptor binding limits provided by vulkan drivers
fixes KHR-GL46.shader_storage_buffer_object.basic-max
Reviewed-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9174>
this is a huge perf win since it means we don't have to create a new pipeline
every time this state changes
also we can now move the viewport state back to zink_context since that's the
real value we're using and the pipeline state value is just for the hash
ref mesa/mesa#3359
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9153>
this is just much, much easier to handle, and it also lets us fix some
lingering bugs with query handling that led to inconsistent results
Acked-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9152>
we can reduce some flushing here by only doing a flush if we're about to
copy a compute batch resource that has gfx batch access pending
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9152>
in this case, we can queue a result copy onto a staging buffer and then queue
a copy from staging onto our real buffer to avoid stalling
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9152>
* VK_BUFFER_USAGE_STORAGE_BUFFER_BIT should be enabled always because we might need it
* VK_FORMAT_FEATURE* flags need to be used for detection
I hate these enums so much.
Fixes: 2bfa998960 ("zink: add more usage bits for buffer types")
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9175>
If this is not defined, mesa may not deallocate sampler views,
which can result in memory leaks.
Just define it to the same as max texture samplers, like other
mesa drivers do.
Cc: mesa-stable
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9172>
We call update_last_vue_map after updating the shaders, which compares
the new and old VUE maps. Except...updating the shaders may have
dropped the last reference to the variant that ice->shaders.last_vue_map
belonged to, leading to a classic use-after-free.
Fix this by taking a reference to the variant for the last VUE stage,
so it stays around until we're done with it.
Fixes: 1afed51445 ("iris: Store a list of shader variants in the shader itself")
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/4311
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9143>
Before, we only used 2k of shared memory.
It was found that 5 lower bits of SP_CS_UNKNOWN_A9B1 do control
the available size of shared memory for compute shaders, with
AVAILABLE_SIZE = (SP_CS_UNKNOWN_A9B1_SHARED_SIZE + 1) * 1k
up to 32k. And SP_CS_UNKNOWN_A9B1_SHARED_SIZE being zero enables
all 32k of shared memory.
Fixes tests:
dEQP-VK.rasterization.line_continuity.line-strip
dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.buffer.guard_nonlocal.workgroup.comp
dEQP-VK.memory_model.message_passing.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_nonlocal.workgroup.guard_local.buffer.comp
dEQP-VK.memory_model.write_after_read.core11.u32.coherent.fence_fence.atomicwrite.workgroup.payload_local.image.guard_nonlocal.workgroup.comp
Signed-off-by: Danylo Piliaiev <dpiliaiev@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9157>
The top-level gitlab-ci.yml is big and unwieldy when one wants to work on
CI for a single driver. Move the drivers to separate include files for
ease of finding all your driver's tests, and also to pave the way for work
on a single driver's CI to not retest all other drivers.
Reviewed-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9139>
struct thread_trace_data holds struct rgp_code_object, struct
rgp_loader_events, struct rgp_pso_correlation data. This data is required
in function ac_sqtt_dump_data(). This patch makes the code changes
required to pass struct thread_trace_data to function ac_sqtt_dump_data().
Signed-off-by: Yogesh Mohan Marimuthu <yogesh.mohanmarimuthu@amd.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8609>
Fix defects reported by Coverity Scan.
uninit_member: Non-static class member serial is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member sched is not initialized in this constructor nor in any functions that it calls.
uninit_member: Non-static class member bb is not initialized in this constructor nor in any functions that it calls.
Signed-off-by: Vinson Lee <vlee@freedesktop.org>
Reviewed-by: Karol Herbst <kherbst@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9037>
Reduce noise in a6xx.xml by removing LO/HI versions of address registers.
Also fix type="address" registers in register packing (use bit size instead
of checking for "waddress" to use qword)
Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8423>
since we aren't (currently) parallelizing and now have barriers, we don't need to cycle
the batch here, which lets us avoid submitting too-small command buffers
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9148>
if first==last layer, this should be a 2D slice of the cube
else if this isn't all the layers, this should be an array of slices
fixes a bunch of spec@arb_shader_image_size@builtin cases
Reviewed-by: Dave Airlie <airlied@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9080>