Commit Graph

97598 Commits

Author SHA1 Message Date
Dylan Baker 3e9533d9b8 meson: Add script to use VERSION file for getting version
Meson has up until this point set it's version in the root meson.build
script, while the other build systems read the VERSION file. This is
just "one more thing" to duplicate between meson and every other build
system. This script is a simple "read, strip, print" sort of deal to
allow meson to read the VERSION file.

I chose to implement this in python since python is portable, and to
keep the meson.build script clean. This is also complicated by the fact
that the project() call *must* be the first non-comment,non-blank in the
toplevel meson.build script.

v2: - Move from scripts/ to bin/
    - use python explicitly to run the scripts to support windows

Signed-off-by: Dylan Baker <dylanx.c.baker@intel.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
2017-11-09 11:19:53 -08:00
Boris Brezillon 359a8f6ae5 broadcom/vc4: Mark BOs as purgeable when they enter the BO cache
This patch makes use of the DRM_IOCTL_VC4_GEM_MADVISE ioctl to mark all
BOs placed in the mesa BO cache as purgeable so that the system can
reclaim this memory under memory pressure.

v2:
- Removed BOs from the cache when they've been purged by the kernel
- Check whether the madvise ioctl is supported or not before using it

v3: Don't walk the whole list when we find a busy BO (by anholt, acked by
    Boris)

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-09 10:57:17 -08:00
Boris Brezillon 7d72af4f09 drm-uapi: Update vc4 header from drm-next
Taken from drm-next d65d31388a23 ("Merge tag
'drm-misc-next-fixes-2017-11-07' of
git://anongit.freedesktop.org/drm/drm-misc into drm-next")

v2: Add the NOTSUPP definition from the final drm-next version, not the
    commit (anholt).

Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com>
2017-11-09 10:57:14 -08:00
Eric Anholt ebcb4c2156 meson: Enable VC4's NEON assembly support.
Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Reviewed-by: Eric Engestrom <eric.engestrom@imgtec.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-09 09:40:30 -08:00
Eric Anholt 9c9fd8ff37 meson: Always link libgallium_dri.so against dep_thread.
Somehow on my cross build the -pthread is getting lost.  All the other
deps seem to work out fine.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-09 09:40:27 -08:00
Eric Anholt 4c94607c21 meson: Drop stale comment about making valgrind conditional.
It was fixed in 5c2ff5773a.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-09 09:40:25 -08:00
Eric Anholt d975e6940e meson: Leave dep_llvm empty if !with_llvm
The gallium auxiliary build would link against llvm, for the gallivm code
that it didn't build.  This broke the build on my armhf cross, where
libLLVM-3.9.so is not multiarch and thus points to x86-64 libs.

Reviewed-by: Dylan Baker <dylan@pnwbakers.com>
Tested-by: Timothy Arceri <tarceri@itsqueeze.com>
2017-11-09 09:40:03 -08:00
Adam Jackson 015cc6bb7c Revert "glx: Implement GLX_EXT_no_config_context (v2)"
Pushed ahead of things actually working.

This reverts commit 5293b96b16.
2017-11-09 11:41:14 -05:00
Marek Olšák 9ceb057ebf radeonsi: pack r600_surface better
160 -> 136 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák 169525684f radeonsi: pack r600_texture better
1752 -> 1736 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák f8a4b606a2 radeonsi: clean up r600_surface
216 -> 160 bytes

Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák 6916ee7e17 radeonsi: remove r600_texture::non_disp_tiling
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Marek Olšák a06fe75eac radeonsi: remove DBG_NO_DISCARD_RANGE
Reviewed-by: Nicolai Hähnle <nicolai.haehnle@amd.com>
2017-11-09 17:32:14 +01:00
Adam Jackson 5293b96b16 glx: Implement GLX_EXT_no_config_context (v2)
This more or less ports EGL_KHR_no_config_context to GLX.

v2: Enable the extension only for those backends that support it.

Khronos: https://github.com/KhronosGroup/OpenGL-Registry/pull/102
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-09 09:35:30 -05:00
Adam Jackson 3f66d54a2a glx: Prepare the DRI backends for GLX_EXT_no_config_context
This should be safe as these backends already support the EGL version of
this extension. DRI1 is not affected because it does not support
GLX_ARB_create_context anyway. DRI-Windows is not prepared to implement
this as there's no equivalent WGL extension, and wglCreateContextAttribs
seems to really want the HDC's pixel format to be set.

Signed-off-by: Adam Jackson <ajax@redhat.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
2017-11-09 09:35:22 -05:00
Adam Jackson 74b701d84c glx: Relax validate_renderType_against_config for EXT_no_config_context
Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Adam Jackson <ajax@redhat.com>
2017-11-09 09:35:07 -05:00
Nicolai Hähnle ffc2060616 anv: fix build failure
Fixes: e3a8013de8 ("util/u_queue: add util_queue_fence_wait_timeout")
2017-11-09 14:49:19 +01:00
Nicolai Hähnle cc78d77043 mesa: flush and wait after creating a fallback texture
Fixes non-deterministic failures in
dEQP-EGL.functional.sharing.gles2.multithread.simple_egl_sync.images.texture_source.teximage2d_render
and others in dEQP-EGL.functional.sharing.gles2.multithread.*

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:20:58 +01:00
Nicolai Hähnle 46444613cf mesa: increase MaxServerWaitTimeout
The current value was introduced in commit a27180d0d8, which claims
that it represents ~1.11 years. However, it is interpreted in nanoseconds,
so it actually only represents ~9.8 hours. That seems a bit short.

Use the largest value consistent with both int32 and int64. It
corresponds to ~292 years in nanoseconds.

Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:20:58 +01:00
Nicolai Hähnle fbda7958ff st/mesa: remove redundant flushes from st_flush
st_flush should flush state tracker-internal state and the pipe, but
not mesa/main state. Of the four callers:

- glFlush/glFinish already call FLUSH_{VERTICES,STATE}.
- st_vdpau doesn't need to call them.
- st_manager will now call them explicitly.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:20:58 +01:00
Nicolai Hähnle 884a0b2a9e st/dri: use stapi flush instead of pipe flush when creating fences
There may be pending operations (e.g. vertices) that need to be flushed
by the state tracker.

Found by inspection.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:20:58 +01:00
Nicolai Hähnle b921da3b74 radeonsi: use a threaded context even for debug contexts
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:04 +01:00
Nicolai Hähnle 1a6d9e087a radeonsi: record and dump time of flush
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:04 +01:00
Nicolai Hähnle b07569ad8b ddebug: optionally handle transfer commands like draws
Transfer commands can have associated GPU operations.

Enabled by passing GALLIUM_DDEBUG=transfers.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle 18fd2a859d ddebug: dump context and before/after times of draws
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle ba2f2b6f2a ddebug: generalize print_named_xxx via a PRINT_NAMED macro
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle c9fefa062b ddebug: rewrite to always use a threaded approach
This patch has multiple goals:

1. Off-load the writing of records in 'always' mode to another thread
   for performance.

2. Allow using ddebug with threaded contexts. This really forces us to
   move some of the "after_draw" handling into another thread.

3. Simplify the different modes of ddebug, both in the code and in
   the user interface, i.e. GALLIUM_DDEBUG. In particular, there's
   no 'pipelined' anymore, since we're always pipelined; and 'noflush'
   is replaced by 'flush', since we no longer flush by default.

4. Fix the fences in pipelining mode. They previously relied on writes
   via pipe_context::clear_buffer. However, on radeonsi, those could
   (quite reasonably) end up in the SDMA buffer. So we use the newly
   added PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE fences instead.

5. Improve pipelined mode overall, using the finer grained information
   provided by the new fences.

Overall, the result is that pipelined mode should be more useful, and
using ddebug in default mode is much less invasive, in the sense that
it changes the overall driver behavior less (which is kind of crucial
for a driver debugging tool).

An example of the new hang debug output:

  Gallium debugger active.
  Hang detection timeout is 1000ms.
  GPU hang detected, collecting information...

  Draw #   driver  prev BOP  TOP  BOP  dump file
  -------------------------------------------------------------
  2          YES      YES    YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000000
  3          YES      NO     YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000001
  4          YES      NO     YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000002
  5          YES      NO     YES  NO   /home/nha/ddebug_dumps/shader_runner_19919_00000003

  Done.

We can see that there were almost certainly 4 draws in flight when
the hang happened: the top-of-pipe fence was signaled for all 4 draws,
the bottom-of-pipe fence for none of them. In virtually all cases,
we'd expect the first draw in the list to be at fault, but due to the
GPU parallelism, it's possible (though highly unlikely) that one of
the later draws causes a component to get stuck in a way that prevents
the earlier draws from making progress as well.

(In the above example, there were actually only 3 draws truly in flight:
the last draw is a blit that waits for the earlier draws; however, its
top-of-pipe fence is emitted before the cache flush and wait, and so
the fact that the draw hasn't truly started yet can only be seen from a
closer inspection of GPU state.)

Acked-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle e8bb8758dd ddebug: use an atomic increment when numbering files
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle d6710fe874 dd/util: extract dd_get_debug_filename_and_mkdir
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:03 +01:00
Nicolai Hähnle 8491fcafab gallium/u_dump: add and use util_dump_transfer_usage
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:02 +01:00
Nicolai Hähnle 9b8033a4a7 gallium/u_dump: add util_dump_ns
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:02 +01:00
Nicolai Hähnle 6f4a03b08a gallium/u_dump: export util_dump_ptr
Change format to %p while we're at it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:01:02 +01:00
Nicolai Hähnle 125a915052 radeonsi: implement PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE
v2: use uncached system memory for the fence, and use the CPU to
    clear it so we never read garbage when checking the fence

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:55 +01:00
Nicolai Hähnle e4627ac8fb radeonsi: document some subtle details of fence_finish & fence_server_sync
v2: remove the change to si_fence_server_sync, we'll handle that more
    robustly

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:50 +01:00
Nicolai Hähnle 14b9fa75e4 gallium: add pipe_context::callback
For running post-draw operations inside the driver thread. ddebug will
use it.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:50 +01:00
Nicolai Hähnle 2bdfbb0e53 gallium/u_threaded: implement pipe_context::set_log_context
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:49 +01:00
Nicolai Hähnle 244536d3d6 gallium/u_threaded: avoid syncs for get_query_result
Queries should still get marked as flushed when flushes are executed
asynchronously in the driver thread.

To this end, the management of the unflushed_queries list is moved into
the driver thread.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:49 +01:00
Nicolai Hähnle 609a230375 gallium/u_threaded: implement asynchronous flushes
This requires out-of-band creation of fences, and will be signaled to
the pipe_context::flush implementation by a special TC_FLUSH_ASYNC flag.

v2:
- remove an incorrect assertion
- handle fence_server_sync for unsubmitted fences by
  relying on the improved cs_add_fence_dependency
- only implement asynchronous flushes on amdgpu

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:42 +01:00
Nicolai Hähnle 11b380ed0c gallium/u_threaded: mark queries flushed only for non-deferred flushes
The driver uses (and must use) the flushed flag of queries as a hint that
it does not have to check for synchronization with currently queued up
commands. Deferred flushes do not actually flush queued up commands, so
we must not set the flushed flag for them.

Found by inspection.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:42 +01:00
Nicolai Hähnle 78a4750d91 radeonsi: move fence functions to si_fence.c
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:42 +01:00
Nicolai Hähnle e6dbc804a8 winsys/amdgpu: handle cs_add_fence_dependency for deferred/unsubmitted fences
The idea is to fix the following interleaving of operations
that can arise from deferred fences:

 Thread 1 / Context 1          Thread 2 / Context 2
 --------------------          --------------------
 f = deferred flush
 <------- application-side synchronization ------->
                               fence_server_sync(f)
                               ...
                               flush()
 flush()

We will now stall in fence_server_sync until the flush of context 1
has completed.

This scenario was unlikely to occur previously, because applications
seem to be doing

 Thread 1 / Context 1          Thread 2 / Context 2
 --------------------          --------------------
 f = glFenceSync()
 glFlush()
 <------- application-side synchronization ------->
                               glWaitSync(f)

... and indeed they probably *have* to use this ordering to avoid
deadlocks in the GLX model, where all GL operations conceptually
go through a single connection to the X server. However, it's less
clear whether applications have to do this with other WSI (i.e. EGL).
Besides, even this sequence of GL commands can be translated into
the Gallium-level sequence outlined above when Gallium threading
and asynchronous flushes are used. So it makes sense to be more
robust.

As a side effect, we no longer busy-wait on submission_in_progress.

We won't enable asynchronous flushes on radeon, but add a
cs_add_fence_dependency stub anyway to document the potential
issue.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 14:00:22 +01:00
Nicolai Hähnle 1e5c9cf590 gallium: add PIPE_FLUSH_{TOP,BOTTOM}_OF_PIPE bits
These bits are intended to be used by the ddebug hang detection and are
named in analogy to the Vulkan stage bits (and the corresponding Radeon
pipeline event).

Hang detection needs fences on the granularity of individual commands,
which nothing else really covers. The closest alternative would have
been PIPE_QUERY_GPU_FINISHED, but (a) queries are a per-context object
and we really want a per-screen object, (b) queries don't offer a
wait with timeout, and (c) in any case, PIPE_QUERY_GPU_FINISHED is
meant to imply that GPU caches are flushed, which the new bits
explicitly aren't.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 13:58:16 +01:00
Nicolai Hähnle ea6df1ce37 gallium: add PIPE_FLUSH_ASYNC and PIPE_FLUSH_HINT_FINISH
Also document some subtleties of pipe_context::flush.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 13:58:16 +01:00
Nicolai Hähnle e3a8013de8 util/u_queue: add util_queue_fence_wait_timeout
v2:
- style fixes
- fix missing timeout handling in futex path

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 13:58:10 +01:00
Nicolai Hähnle f1a3648784 threads: update for late C11 changes
C11 threads were changed to use struct timespec instead of xtime, and
thrd_sleep got a second argument.

See http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1554.htm and
http://en.cppreference.com/w/c/thread/{thrd_sleep,cnd_timedwait,mtx_timedlock}

Note that cnd_timedwait is spec'd to be relative to TIME_UTC / CLOCK_REALTIME.

v2: Fix Windows build errors. Tested with a default Appveyor config
    that uses Visual Studio 2013. Judging from Brian's email and
    random internet sources, Visual Studio 2015 does have timespec
    and timespec_get, hence the _MSC_VER-based guard which I have
    not tested.

Cc: Jose Fonseca <jfonseca@vmware.com>
Cc: Brian Paul <brianp@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com> (v1)
2017-11-09 11:57:22 +01:00
Nicolai Hähnle c50743f61c gallium: remove unused and deprecated u_time.h
Cc: Jose Fonseca <jfonseca@vmware.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:57:22 +01:00
Nicolai Hähnle 222a2fb998 util: move os_time.[ch] to src/util
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:57:21 +01:00
Nicolai Hähnle f76a6cb337 radeonsi: always use async compiles when creating shader/compute states
With Gallium threaded contexts, creating shader/compute states is
effectively a screen operation, so we should not use context state.

In particular, this allows us to avoid using the context's LLVM
TargetMachine.

This isn't an issue yet because u_threaded_context filters out non-async
debug callbacks, and we disable threaded contexts for debug contexts.
However, we may want to change that in the future.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:53:20 +01:00
Nicolai Hähnle b650fc09c3 radeonsi: fix potential use-after-free of debug callbacks
Found by inspection.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:53:20 +01:00
Nicolai Hähnle dd7c273e87 radeonsi: move pipe debug callback to si_context
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
2017-11-09 11:53:19 +01:00