Previously it would fail, and then we'd fall back to the transfer path
for things like readpix. But it would spam logcat w/ bo_mmap fail
messages. Since gralloc allocated buffers for GPU usage are allocated
without _USE_MAPPABLE, let's just assume we can't map imported bo's.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16477>
This is needed for the VIRTGPU_WAIT ioctl to work.
TODO we could perhaps limit this, since it is not needed for residency,
but only fencing. I.e. we could omit cmdstream, and probably anything
that has FD_BO_NOMAP flag.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
This syncs up with the protocol of what eventually landed in virglrenderer.
1) Move all static params to capset to avoid having to query host
(reduce synchronous round trips at startup)
2) Use res_id instead of host_handle.. costs extra hashtable lookups in
host during submit, but this lets us (with userspace allocated IOVA)
make bo alloc and import completely async.
3) Require userspace allocated IOVA to simplify the protocol and not
have to deal with GEM_NEW/GEM_INFO potentially being synchronous.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
These paths should be corner cases, but still it is a bad idea to block
in the host (because it is single threaded), so instead just turn waits
in the host into polling in the guest.
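The idea of replacing a blocking host wait with guest-side polling can be
sketched roughly as below. This is a minimal illustration, not the driver's
code: fake_host, host_is_idle() and guest_wait() are hypothetical names, and
the fake host simply reports busy for a fixed number of queries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the host: reports busy for a fixed number
 * of queries, then idle.  A real query would be a non-blocking request.
 */
struct fake_host {
   unsigned busy_queries_left;
   unsigned queries;
};

/* Non-blocking check; never sleeps on the host side. */
static bool
host_is_idle(struct fake_host *host)
{
   assert(host);
   host->queries++;
   if (host->busy_queries_left > 0) {
      host->busy_queries_left--;
      return false;
   }
   return true;
}

/* Guest-side wait: retry the non-blocking check up to max_tries times.
 * A real implementation would sleep or back off between attempts.
 */
static bool
guest_wait(struct fake_host *host, unsigned max_tries)
{
   for (unsigned i = 0; i < max_tries; i++) {
      if (host_is_idle(host))
         return true;
   }
   return false;
}
```

The single-threaded host only ever answers a cheap query, and all of the
waiting is done by the guest.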
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
If supported by host virglrenderer and host kernel, use userspace
allocated GPU virtual addresses. This lets us avoid stalling on
waiting for response from host kernel until we need to know the
host handle (which is usually not until submit time).
Handling the async response from host to get host_handle is done
thru the submit_queue, so that in the submit path (hot) we do not
need any additional synchronization to know that the host_handle
is valid.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
ring_idx zero is the CPU ring, others map to the priority level, as each
priority level for a given drm_file on the host kernel side maps to a
single fence timeline.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/16086>
We don't need to restrict our timeout to 5 seconds, because the kernel's
hangcheck will ensure that the wait completes in finite time if the GPU
gets wedged. If the GPU is making progress, we don't want to time out
early and have pipe_transfer_map() return an error, causing glReadPixels()
to throw a confusing GL_OOM even though we're not out of memory.
The INFINITE arg to this function isn't actually infinite, it's limited to
an hour. But an hour of GPU processing to wait on is probably plenty.
This 5s timeout has caused problems with the CTS on freedreno at high
parallelism, and I suspect is the cause of recent issues in the closed
traces replay jobs.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15805>
Add a new backend to enable using native driver in a VM guest, via a new
virtgpu context type which (indirectly) makes host kernel interface
available in guest and handles the details of mapping buffers to guest,
etc.
Note that fence-fd's are currently a bit awkward, in that they get
signaled by the guest kernel driver (drm/virtio) once virglrenderer in
the host has processed the execbuf, not when host kernel has signaled
the submit fence. For passing buffers to the host (virtio-wl) the egl
context in virglrenderer is used to create a fence on the host side.
But use of out-fence-fd's in guest could have slightly unexpected
results. For this reason we limit all submitqueues to default priority
(so they cannot be preempted by host egl context). AFAICT virgl and
venus have a similar problem, which will eventually be solvable once we
have RESOURCE_CREATE_SYNC.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14900>
Call backend specific cleanup fxn earlier. This is needed if the
backend has things like bo's to delete, otherwise the handle_table
will already be destroyed causing problems in bo_del()
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14900>
We are going to want basically the identical thing, other than
flush_submit_list, for virtio backend. Now that we've moved various
other dependencies into the base classes, extract out an abstract base
class for submit/ringbuffer.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14900>
The virtio backend will want this too, and it will make it easier to
share most of the submit/ringbuffer implementation with the virtio
backend.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14900>
With userspace fences, if we know definitely that the buffer is idle
(which implies that it is not shared with other processes, etc), then
skip the ioctl.
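A minimal sketch of that fast path, assuming simplified bookkeeping: the
struct layout and the names known_idle(), kernel_wait() and
sketch_cpu_prep() are hypothetical, and the "ioctl" is a counter rather
than a real syscall.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bookkeeping, not the real fd_bo layout. */
struct sketch_bo {
   bool shared;              /* exported/imported: other users may exist */
   unsigned pending_fences;  /* userspace fences not yet known signaled */
   unsigned kernel_waits;    /* counts calls to the stand-in "ioctl" */
};

/* True only when userspace alone can prove the buffer is idle. */
static bool
known_idle(const struct sketch_bo *bo)
{
   return !bo->shared && bo->pending_fences == 0;
}

/* Stand-in for the kernel wait; a real one would issue an ioctl. */
static int
kernel_wait(struct sketch_bo *bo)
{
   bo->kernel_waits++;
   bo->pending_fences = 0;
   return 0;
}

static int
sketch_cpu_prep(struct sketch_bo *bo)
{
   assert(bo);
   if (known_idle(bo))
      return 0;   /* fast path: no syscall */
   return kernel_wait(bo);
}
```

A shared buffer always takes the slow path here, since userspace fences
say nothing about what other processes are doing with it.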
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14900>
There are some buffers that we mmap just to write to them a single time.
Allow the drm backend to provide an alternate upload path that avoids
these mmap's.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14900>
With the virtio backend we will need to pass an extra flag when
allocating buffers that will be shared cross-device (such as with
virtio-wl for passing between host and guest)
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14900>
Add a hint for buffers that we won't need to mmap. With the virtio
backend, virglrenderer needs to create a dmabuf fd for mapping into
the host, which we want to avoid when possible.
Low hanging fruit is to use this hint for anything tiled/ubwc. There
are probably more bo's that can be flagged as such.
TODO add fd_bo_upload() for memcpy to bo.. this would be useful for
uploads, for example, shaders which we just write once and never touch
again.. for virtio this could be implemented with a TRANSFER_TO_HOST
ioctl.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14900>
Decoupling handle and fd_bo creation simplifies things for "normal" drm
drivers, avoiding duplication for the create vs import paths. But this
is awkward for the virtio backend when it wants to do multiple things in
the same guest<->host round trip.
So instead, split the paths in the interface backend and move the code
sharing for the two different paths into the msm backend itself.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14900>
This allows the exported fds to be mapped for writing. My use case is
for virtio-gpu blob resources where the fds are mapped rw and mappings
are added to the guests using KVM_SET_USER_MEMORY_REGION.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14699>
Dropping the final pipe ref could in turn drop the final ref to one
of a couple other bo's, which in turn could indirectly recurse back
into cleanup_fences() on the same bo, resulting in a double decrement
of bo->nr_fences and underflow to a large positive #. This happens
because free'ing a bo back to the bo cache periodically calls
fd_bo_cache_cleanup() and any bo's that have not been re-used can
be really free'd, which in turn calls cleanup_fences().
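One way to make such a loop tolerate re-entry is to update the count
before dropping the reference. The sketch below illustrates that pattern
only; it is not necessarily how this fix is implemented, and sketch_bo,
sketch_fence and reentrant_release() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FENCES 4

struct sketch_bo;

struct sketch_fence {
   /* Called when the fence's reference is dropped; may re-enter. */
   void (*release)(struct sketch_bo *bo);
};

struct sketch_bo {
   struct sketch_fence fences[MAX_FENCES];
   unsigned nr_fences;
   unsigned released;   /* how many entries were dropped in total */
};

static void
cleanup_fences(struct sketch_bo *bo)
{
   assert(bo->nr_fences <= MAX_FENCES);
   while (bo->nr_fences > 0) {
      /* Detach the entry and update the count first ... */
      struct sketch_fence f = bo->fences[--bo->nr_fences];
      bo->released++;
      /* ... so a re-entrant call sees consistent state. */
      if (f.release)
         f.release(bo);
   }
}

/* Worst case: dropping the reference recurses into the same bo. */
static void
reentrant_release(struct sketch_bo *bo)
{
   cleanup_fences(bo);
}
```

Because each entry is removed before its release callback runs, a nested
call can never decrement the count for the same entry a second time.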
Fixes: 7dabd62464 ("freedreno/drm: Userspace fences")
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13263>
In some cases we need to emit a no-op batch/submit, just to get a fence.
No need to emit all the boilerplate state-restore and flushing in this
case.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13160>
Userspace frequently reads the elapsed fence, but the GPU only writes it
once per submit. So this should be another useful place for cached-
coherent.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11176>
Some more extreme examples, like gl_driver2_off, can be bottlenecked on
writes to cmdstream. OTOH the CP is pretty pipelined in how it slurps
in memory, so the penalty of using coherent buffers should not be so
much.
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11176>
It hasn't really mattered until now, as we keep a separate cache for
cmdstream (which is FD_BO_GPU_READONLY), and the only other flag so
far is FD_BO_SCANOUT (which the bo cache probably messes up, but it
does not matter on most hw, and on hw where it does the scanout buffer
will be imported (and therefore won't end up in the bo cache)).
But when we add cached-coherent (or if we wanted to use GPU_READONLY
more) it starts to matter.
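To illustrate why the flags matter, here is a minimal sketch of a cache
lookup that only reuses a buffer allocated with the same flags. The flag
bits, struct cached_bo and cache_take() are hypothetical and do not
reflect the real bo cache layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits, for illustration only. */
#define SKETCH_BO_GPU_READONLY    (1u << 0)
#define SKETCH_BO_CACHED_COHERENT (1u << 1)

struct cached_bo {
   uint32_t size;
   uint32_t flags;   /* flags the buffer was allocated with */
   bool free;        /* available for reuse */
};

/* Return a free entry that is big enough AND has identical flags. */
static struct cached_bo *
cache_take(struct cached_bo *entries, size_t count,
           uint32_t size, uint32_t flags)
{
   assert(entries || count == 0);
   for (size_t i = 0; i < count; i++) {
      struct cached_bo *bo = &entries[i];
      if (bo->free && bo->size >= size && bo->flags == flags) {
         bo->free = false;
         return bo;
      }
   }
   return NULL;   /* caller falls back to a fresh allocation */
}
```

Without the flags comparison, a request for a cached-coherent buffer
could be handed a recycled buffer with a different caching mode.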
Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11176>
When moving the batch cache to the context, I added hash table lookups
from batch to rsc for "is this resource in use" because we could no longer
store data in the rsc bo under the batch cache's lock.
We can save that cost by tracking a bitfield of resources referenced by
the batch, which gives us very cheap checks in the draw path at a minor
cost in memory. We can just use the GEM BO handle, since it's a nice
small integer already (we can't use the TC buffer ID, because the frontend
changes that, and we're in the driver thread).
This required moving the !pending() assert up in resource shadowing, since
the BO swap meant we were checking pending on the wrong resource.
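The per-batch bitfield indexed by GEM handle can be sketched as a small
growable bitset. This is an illustration under assumptions, not the
driver's data structure: handle_set and its helpers are hypothetical
names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One bit per handle; grows on demand since handles are small ints. */
struct handle_set {
   uint32_t *words;
   uint32_t num_words;
};

/* Mark a handle as referenced.  Returns false on allocation failure. */
static bool
handle_set_add(struct handle_set *s, uint32_t handle)
{
   uint32_t word = handle / 32;

   assert(s);
   if (word >= s->num_words) {
      uint32_t n = word + 1;
      uint32_t *w = realloc(s->words, (size_t)n * sizeof(*w));
      if (!w)
         return false;
      /* Zero only the newly added words. */
      memset(w + s->num_words, 0, (size_t)(n - s->num_words) * sizeof(*w));
      s->words = w;
      s->num_words = n;
   }
   s->words[word] |= 1u << (handle % 32);
   return true;
}

/* Cheap "is this resource referenced?" check: one load and a mask. */
static bool
handle_set_test(const struct handle_set *s, uint32_t handle)
{
   uint32_t word = handle / 32;
   return word < s->num_words && (s->words[word] & (1u << (handle % 32)));
}

static void
handle_set_fini(struct handle_set *s)
{
   free(s->words);
   s->words = NULL;
   s->num_words = 0;
}
```

The memory cost is one bit per possible handle up to the largest one
seen, which stays small as long as handles are densely allocated.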
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11511>
The fd_fence_finish() may be passed a special timeout value PIPE_TIMEOUT_INFINITE.
This gets propagated all the way to get_abs_timeout(), where it gets converted to
a huge timeout value and passed down to the kernel. At least on iMX53, the kernel
may complain about this value being too large and emit a backtrace. The relevant
piece of information there is the following:
schedule_timeout: wrong timeout value bf94984b
Per suggestion by Rob Clark, fix this in get_abs_timeout() by picking the same
rollover implementation present in etnaviv. This fixes one part of the problem
where the tv_nsec becomes larger than NSEC_PER_SEC, which is invalid.
However, the PIPE_TIMEOUT_INFINITE is sufficiently large to make tv_secs larger
than KTIME_SEC_MAX, which makes kernel-side ktime_set() return KTIME_MAX and
that in turn triggers the above "wrong timeout value N" message. Fix this by
setting the timeout to a large enough value in case of PIPE_TIMEOUT_INFINITE.
While the timeout is not truly infinite, the timeout is long enough as anything
longer than a few seconds means the GPU got hung.
The "util/timespec.h" is added so we can use NSEC_PER_SEC instead of ad-hoc
constant 1000000000. The "pipe/p_defines.h" is needed for PIPE_TIMEOUT_INFINITE.
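The two parts of the fix (nanosecond rollover and clamping an "infinite"
timeout) can be sketched as follows. This is a simplified illustration,
not the actual get_abs_timeout(): abs_timeout_from() is a hypothetical
helper that takes the current time as an argument, and the one-hour
clamp is an assumed value chosen for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>

#ifndef NSEC_PER_SEC
#define NSEC_PER_SEC 1000000000ull
#endif

/* Assumed clamp for "infinite" waits; the value is illustrative. */
#define SKETCH_MAX_WAIT_NS (3600ull * NSEC_PER_SEC)

/* Add a relative timeout (ns) to 'now', keeping tv_nsec < NSEC_PER_SEC. */
static struct timespec
abs_timeout_from(struct timespec now, uint64_t ns)
{
   struct timespec out;

   assert(now.tv_nsec >= 0 && (uint64_t)now.tv_nsec < NSEC_PER_SEC);

   /* Clamp so tv_sec cannot grow past what the kernel accepts. */
   if (ns > SKETCH_MAX_WAIT_NS)
      ns = SKETCH_MAX_WAIT_NS;

   uint64_t nsec = (uint64_t)now.tv_nsec + ns % NSEC_PER_SEC;
   out.tv_sec = now.tv_sec + (time_t)(ns / NSEC_PER_SEC);

   /* Carry into seconds if the nanosecond sum rolled over. */
   if (nsec >= NSEC_PER_SEC) {
      nsec -= NSEC_PER_SEC;
      out.tv_sec++;
   }
   out.tv_nsec = (long)nsec;
   return out;
}
```

In a real caller, 'now' would presumably come from
clock_gettime(CLOCK_MONOTONIC, ...), and the result would be passed to
the kernel as the absolute timeout.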
This problem can be reliably triggered on iMX53 using Qt5 with EGLFS support,
using the qtbase examples, as follows:
/usr/share/examples/opengl/qopenglwidget/qopenglwidget -platform eglfs
Fixes: f3cc0d2747 ("freedreno: import libdrm_freedreno + redesign submit")
Signed-off-by: Marek Vasut <marex@denx.de>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12886>