If two contexts wanted to access the same buffer at the same time, it would
end up on two validation lists simultaneously, which might cause a
PIPE_ERROR_RETRY when trying to validate it from one context while the other
context already had it validated but not yet fenced.
In that situation we could spin until the error goes away, or apply various
more or less expensive locking schemes to save cpu.
Here we use a scheme that briefly locks after fencing but avoids locking on
validation in the non-contended case.
v2:
Make sure we broadcast not only on releasing buffers after fencing, but also
after releasing buffers in the pb_validate_validate error path.
v3:
Don't broadcast on PIPE_ERROR_RETRY because that would increase the chance
of starvation.
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
3D wasn't officially supported before virtual HW version 8 so we can
remove this old code.
Reviewed-by: Charmaine Lee <charmainel@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
Currently, surface propagation for colliding render target resource is
done at framebuffer emit time for vgpu10. This patch
adds the surface propagation for non-vgpu10 path to emit_fb_vgpu9()
and removes the redundant surface copy at set time.
Tested with MTT glretrace, piglit, NobelClinicianViewer, Turbine, Cinebench.
Reviewed-by: Neha Bhende <bhenden@vmware.com>
The zslice index to svga_texture_copy_handle_resource() is not adjusted
and should be a signed integer.
This patch fixes piglit tests for non-vgpu10 including
spec@arb_framebuffer_object@fbo-generatemipmap-3d
spec@glsl-1.20@execution@tex-miplevel-selection gl2:texture* 3d
Tested with MTT piglit and glretrace
This implementation is based on querying the time just before swap/present
and doing a Sleep() if needed. There is no sync to vblank or actual
coordination with the GPU. This isn't perfect, but basically works.
We've had some request for this functionality, and it sounds like there
are some Windows GL apps that refuse to start if the driver doesn't
advertise this extension.
Note: NVIDIA's Windows OpenGL driver advertises the WGL_EXT_swap_control
string both with wglGetExtensionsStringEXT() and with
glGetString(GL_EXTENSIONS). We're only advertising it with the former at
this time.
Tested with asst. Mesa demos, Google Earth, Lightsmark, etc.
VMware bug 1591534.
Reviewed-by: José Fonseca <jfonseca@vmware.com>
When a backing surface is reused, it is possible that
the original surface has been changed. So before the backing surface
is bound again, we need to sync up the surface.
This patch creates a new helper function svga_texture_copy_handle_resource()
to sync up the backing surface resource.
This patch, together with the backing surface dirty bit fix, fixes
the rendering corruption in NobelClinicianViewer when rotating the model.
Also tested with MTT glretrace, piglit, Cinebench, Turbine.
Reviewed-by: Brian Paul <brianp@vmware.com>
The reset flag specifies if the dirty bit needs to be reset
after the surface is propagated to the texture. This is used
to make sure that the dirty bit is not reset and stay unset
before the surface is unbound.
Reviewed-by: Brian Paul <brianp@vmware.com>
The new has_backed_views flag specifies if any of the render target
views or depth stencil view is a backing surface view.
The flag is used in svga_propagate_rendertargets() so it can return early
if there is no surface to propagate.
Reviewed-by: Brian Paul <brianp@vmware.com>
A texture can be destroyed from a different context from which it is
created, but destroying the render target view from a different context
will cause svga device errors. Similar to shader resource view,
this patch skips destroying render target view or depth stencil view
from a non-parent context.
Fixes driver errors running NobelClinician Viewer application.
Tested with NobelClinician Viewer, MTT piglit, glretrace.
Reviewed-by: Brian Paul <brianp@vmware.com>
With this patch, rasterization will be disabled if the
rasterizer_discard flag is set or the fragment shader
is undefined due to missing position output from the
vertex/geometry shader.
Tested with piglit test glsl-1.50-geometry-primitive-id-restart.
Also tested with full MTT glretrace and piglit.
v2: As suggested by Roland, to properly disable rasterization, besides
setting FS to NULL, we will also need to disable depth and stencil test.
v3: As suggested by Brian, set SVGA_NEW_DEPTH_STENCIL_ALPHA dirty bit
in svga_bind_rasterizer_state() if the rasterizer_discard flag is
changed.
Reviewed-by: Brian Paul <brianp@vmware.com>
Emulating wide points in geometry shader when doing transform feedback
is problematic. This patch disables the emulation.
Tested with piglit test ext_transform_feedback-points.
Also tested with MTT glretrace, mesa demos pointblast and spriteblast.
Reviewed-by: Brian Paul <brianp@vmware.com>
Commit b2c97bc789 which made us start
using a busy-wait for individual query results also messed up cache
flushing on !LLC platforms. For one thing, I forgot the mfence after
the clflush so memory access wasn't properly getting fenced. More
importantly, however, was that we were clflushing the whole query range
and then waiting for individual queries and then trying to read the
results without clflushing again. Getting the clflushing both correct
and efficient is very subtle and painful. Instead, let's side-step the
problem by just snooping.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Otherwise linking way fail.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100600
Fixes: f195d40eca ("anv/device: Add a helper for querying whether a BO is busy")
Signed-off-by: Emil Velikov <emil.velikov@collabora.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Tested-by: Vinson Lee <vlee@freedesktop.org>
To work with addr2line.sh we also need the relative offset within the
DSO. And addr2line.sh gets confused by the leading stackframe number.
Signed-off-by: Rob Clark <robdclark@gmail.com>
We don't need to call _mesa_reference_renderbuffer() for the first
assignment as refCount starts at 1. For swrast we work around the
fact we will indirectly call _mesa_reference_renderbuffer() by
resetting refCount to 0.
Fixes: 32141e53d1 (mesa: tidy up renderbuffer RefCount initialisation)
Reviewed-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
On Broadwell we still need to do a resolve between the subpass
that writes and the subpass that reads when there is a
self-dependency because HW could not see fast-clears and works
on the render cache as if there was regular non-fast-clear surface.
Fixes 16 tests on BDW:
dEQP-VK.renderpass.formats.*.input.clear.store.self_dep*
Signed-off-by: Samuel Iglesias Gonsálvez <siglesias@igalia.com>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This appears to be a leftover from an earlier version of this function.
Nothing is emitted into the CS.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
All offsets and strides are precomputed by
radv_CreateDescriptorUpdateTemplateKHR and stored in the template.
v2: Move the new struct declarations from radv_descriptor_set.h
to radv_private.h (Bas)
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Replace the !binding_layout->immutable_samplers assertion in
radv_update_descriptor_sets with a conditional.
The Vulkan specification does not say that it is illegal to update
a sampler descriptor when it is immutable; only that pImageInfo is
ignored.
This change is also needed for push descriptors, because valid
descriptors must be pushed for all bindings accessed by shaders,
including immutable sampler descriptors.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Move the implementation into a separate function that takes a
cmd_buffer and a dstSetOverride parameter.
When cmd_buffer is not NULL, radv_update_descriptor_sets calls
cs_add_buffer directly instead of updating the buffer list.
This will be used to implement VK_KHR_push_descriptor.
Signed-off-by: Fredrik Höglund <fredrik@kde.org>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
_mesa_lookup_renderbuffer() already checks if 'id' is non-zero.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
This doesn't do anything useful so just remove it.
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
This doesn't do anything useful so just remove it.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Commit f938354362 recently increased the
alignment on vertex buffer data from 32 to 64. This caused us to
consume a bit more batch than we were before and we now go over the
estimate by a small amount on certain blits on gen8+. This commit bumps
then gen8 batch estimate by a bit to compensate. Haswell and older
still seems to be well within the limit.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100582
Reviewed-by: Iago Toral Quiroga <itoral@igalia.com>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "13.0 17.0" <mesa-stable@lists.freedesktop.org>
Decoding with aubinator encountered a command of 0xffffffff. With the
previous code, it caused aubinator to jump 255 + 2 dwords to start
decoding again.
Instead we can attempt to detect the known instruction formats. If the
format is not recognized, then we can advance just 1 dword.
v2:
* Update aubinator_error_decode
* Actually convert the length variable returned into a *signed* integer
in aubinator.c, intel_batchbuffer.c and aubinator_error_decode.c.
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Acked-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The call to gen_print_group should provide a pointer to the beginning
of the the structure data, not the start of the batch data.
Cc: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
The system value only has an X component, and radeonsi started
checking that in debug builds.
Reported-by: Michel Dänzer <michel.daenzer@amd.com>
Fixes: 4cf2942777 ("radeonsi: support 64-bit system values")
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Before, we were just looking at whether or not the user wanted us to
wait and waiting on the BO. Some clients, such as the Serious engine,
use a single query pool for hundreds of individual query results where
the writes for those queries may be split across several command
buffers. In this scenario, the individual query we're looking for may
become available long before the BO is idle so waiting on the query pool
BO to be finished is wasteful. This commit makes us instead busy-loop on
each query until it's available.
This significantly reduces pipeline bubbles and improves performance of
The Talos Principle on medium settings (where the GPU isn't overloaded
with drawing) by around 20% on my SkyLake gt4.
Reviewed-by: Chris Wilson <chris@chris-wilson.co.uk>
Tested-by: Eero Tamminen <eero.t.tamminen@intel.com>
Tested-by: Grazvydas Ignotas <notasas@gmail.com>
mako is already a mesa build requirement, extra copy not needed.
Tested building against mesa build baseline (mako-0.8.0).
Reviewed-by: Bruce Cherniak <bruce.cherniak@intel.com>
This avoids validation and looking up the buffer target for a second time.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
This allows internal users to pass buffer objects directly and
allows for KHR_no_error support to be more easily added.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
Possibly more efficient, either way it makes the code easier to
follow.
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
42aaa548 changed the renderbuffer initialisation of RefCount from
1 to 0.
This is inconsitent with how we use RefCount elsewhere. Also every
driver implementation of NewRenderbuffer() calls
_mesa_init_renderbuffer() so its safe to set it there.
Reviewed-by: Brian Paul <brianp@vmware.com>