Adapted from Chris Wilson's patch. The comment is largely his.
Currently, when iris hangs the GPU, it will continue sending batches
which incrementally update the state, assuming it's preserved across
batches. However, the kernel's GPU reset support reinitializes the
guilty context to the default GPU state (reasonably not wanting to
trust the current state). This ends up resetting critical things
like STATE_BASE_ADDRESS, causing memory accesses in all subsequent
batches to be garbage, and almost certainly resulting in more hangs
until we're banned or we kill the machine.
We now ask the kernel to ban our render context immediately, so we
notice we've gone off the rails as fast as possible. Eventually, we'll
attempt to recover and continue. For now, we just avoid torching the
GPU over and over.
Ofc legacy gl features that are broken don't trigger fails in deqp. I
should remember to test glxgears more often.
Fixes: 7ff6705b8d freedreno/ir3: convert to "new style" frag inputs
Signed-off-by: Rob Clark <robdclark@chromium.org>
We didn't notice this issue much because the 2 structs share a similar
layout, except for the additional fields...
We run into that issue in Anv:
==15236== Invalid write of size 8
==15236== at 0x8CF3939C: anv_state_table_expand_range (anv_allocator.c:211)
==15236== by 0x8CF394D5: anv_state_table_grow (anv_allocator.c:264)
==15236== by 0x8CF3967E: anv_state_table_add (anv_allocator.c:312)
==15236== by 0x8CF3B13C: anv_state_pool_alloc_no_vg (anv_allocator.c:1167)
==15236== by 0x8CF3B2B0: anv_state_pool_alloc (anv_allocator.c:1190)
==15236== by 0x8CF60871: alloc_surface_state (anv_image.c:1122)
==15236== by 0x8CF61FF9: anv_CreateImageView (anv_image.c:1519)
==15236== by 0x8BCBD2ED: vkCreateImageView (trampoline.c:1358)
==15236== Address 0x8994ef10 is 0 bytes after a block of size 128 alloc'd
==15236== at 0x4C2FB0F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==15236== by 0x8D2578E6: u_vector_init (u_vector.c:47)
==15236== by 0x8CF3929A: anv_state_table_init (anv_allocator.c:168)
==15236== by 0x8CF3A99A: anv_state_pool_init (anv_allocator.c:921)
==15236== by 0x8CF56517: anv_CreateDevice (anv_device.c:1909)
==15236== by 0x8BCB4FBA: terminator_CreateDevice (loader.c:6073)
==15236== by 0x8DD2CB3D: ??? (in /home/djdeath/.steam/ubuntu12_64/libVkLayer_steam_fossilize.so)
==15236== by 0x8DF4D241: vkCreateDevice (in /home/djdeath/.steam/ubuntu12_64/steamoverlayvulkanlayer.so)
==15236== by 0x8BCB35C6: loader_create_device_chain (loader.c:5449)
==15236== by 0x8BCBC230: vkCreateDevice (trampoline.c:838)
v2: Rename mmap_cleanups to avoid confusion (Caio)
v3: s/fail_mmap_cleanups/fail_cleanups/ (Caio)
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110648
Cc: <mesa-stable@lists.freedesktop.org>
Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
When first implemented in fefd03e16c, Mesa's behavior was aligned with the
behavior of Nvidia's driver. This caused a failing test in piglit but was ok
since the specification is unclear on this subject.
Nvidia's driver behavior has since changed: with version 410.104, the
problematic test (program_binary_retrievable_hint) now passes.
This commit defers BinaryRetrievableHint update until the next linking so the
test passes on Mesa as well.
Signed-off-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Signed-off-by: Marek Olšák <marek.olsak@amd.com>
I don't know why I thought NIR_PASS always set the progress variable.
Derp.
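A minimal sketch of the pitfall, assuming a NIR_PASS-style macro that only
ever raises the flag to true and never writes false. SKETCH_PASS,
sketch_lower and sketch_optimize are hypothetical names for illustration,
not the real NIR definitions:

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a NIR_PASS-style macro: it sets the flag to
 * true when the pass reports progress and otherwise leaves it untouched. */
#define SKETCH_PASS(progress, pass, arg) \
   do {                                  \
      if (pass(arg))                     \
         (progress) = true;              \
   } while (0)

/* Hypothetical pass: reports progress only while there is work left. */
static bool
sketch_lower(int *work_left)
{
   if (*work_left == 0)
      return false;
   (*work_left)--;
   return true;
}

/* Runs the pass until it stops making progress and returns the number of
 * loop iterations. */
static int
sketch_optimize(int work)
{
   int iterations = 0;
   bool progress;

   do {
      /* The caller must reset the flag; the macro never clears it. */
      progress = false;
      SKETCH_PASS(progress, sketch_lower, &work);
      iterations++;
   } while (progress);

   return iterations;
}
```

Without the explicit reset at the top of the loop, a stale true from an
earlier iteration would keep the loop spinning forever.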
Fixes: d41cdef2a5 ("nir: Use the flrp lowering pass instead of nir_opt_algebraic")
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Timothy Arceri <tarceri@itsqueeze.com>
Reviewed-by: Emil Velikov <emil.velikov@collabora.com>
Coverity CID: 1444996
Coverity CID: 1444995
Coverity CID: 1444994
Coverity CID: 1444993
Coverity CID: 1444991
Coverity CID: 1444989
We need to know the number of rectangles.
This fixes new CTS dEQP-VK.draw.discard_rectangles.dynamic_*.
Fixes: 5db0bf9994 ("radv: Implement VK_EXT_discard_rectangles.")
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Propagate the failure from GEM_EXECBUFFER2, cleanup then report failure
if need be. We retain the current behaviour to abort() at the first sign
of trouble -- for a non-robustness context, arguably this is the right
thing to do as the client cannot recover, and the system state is lost.
How to properly integrate with KHR_robustness and reset-strategy is
left as a future exercise.
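The control flow can be sketched roughly as below. This is an illustration
only: struct sketch_batch, sketch_execbuf and sketch_submit are hypothetical
names, and the real submission ioctl is replaced by a stub that returns 0 or
a negative errno:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical batch state; not the driver's real structure. */
struct sketch_batch {
   int used;        /* commands recorded so far */
   bool cleaned_up; /* set once per-submission cleanup has run */
};

/* Hypothetical stand-in for the submission ioctl: 0 or a negative errno. */
static int
sketch_execbuf(const struct sketch_batch *batch, bool simulate_failure)
{
   (void)batch;
   return simulate_failure ? -EIO : 0;
}

/* Submit, always clean up, then hand the result back to the caller. */
static int
sketch_submit(struct sketch_batch *batch, bool simulate_failure)
{
   int ret = sketch_execbuf(batch, simulate_failure);

   /* Cleanup runs on both the success and the failure path. */
   batch->used = 0;
   batch->cleaned_up = true;

   return ret;
}
```

A caller that keeps the current behaviour would abort() on a negative
return value; a robustness-aware caller could instead mark the context lost.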
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Pull i915_drm.h to include
kernel commit ba4fda620a5f7db521aa9e0262cf49854c1b1d9c
Author: Chris Wilson <chris@chris-wilson.co.uk>
Date: Mon Feb 18 10:58:21 2019 +0000
drm/i915: Optionally disable automatic recovery after a GPU reset
for improved resilience in handling GPU hangs.
Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
[ Michel Dänzer: Take changes affecting the docker image from !299,
plus remove the unzip package again before generating the image ]
And consolidate it all into a single job.
It doesn't take much longer than a single version, thanks to ccache.
Overall, this single job might be faster or at least use fewer CPU
cycles than the two jobs before, while covering thrice as many versions
of LLVM.
v2:
* Move "rm -rf _build" to meson-build.sh.
* Set GALLIUM_DRIVERS the same way both times in the meson-clover job,
for symmetry.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com> # v1
No functional change intended (except for no longer running meson
--version separately, as the version appears early in meson's output
anyway).
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We really shouldn't ever need a suffix, otherwise it indicates a failure
in coordination. :) In which case, it doesn't really matter how the tag
is disambiguated.
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
meson git now has a cmake find method for llvm, but it lacks a couple of
features that we use from the config tool version. Until that reaches
parity we need to use the config-tool version.
CC: 19.0 19.1 <mesa-stable@lists.freedesktop.org>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Once mem->bo is removed from the cache, it is likely to be freed.
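A minimal sketch of the ordering this implies, assuming the fix is to read
whatever is needed from the BO before releasing it. All names here
(sketch_bo, sketch_mem, sketch_cache_release, sketch_free_memory) are
hypothetical and do not mirror the anv code:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

struct sketch_bo { uint64_t size; };
struct sketch_mem { struct sketch_bo *bo; };

/* Hypothetical cache release: drops the last reference and frees the BO. */
static void
sketch_cache_release(struct sketch_bo *bo)
{
   free(bo);
}

/* Releases the memory object and returns the updated heap usage. */
static uint64_t
sketch_free_memory(struct sketch_mem *mem, uint64_t heap_used)
{
   /* Read the size first; after the release the BO may no longer exist. */
   uint64_t size = mem->bo->size;

   sketch_cache_release(mem->bo);
   mem->bo = NULL;

   return heap_used - size;
}
```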
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Fixes: b80930a6fe ("anv: add support for VK_EXT_memory_budget")
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We use a mix of MI & PIPE_CONTROL commands to write our queries' data
(results & availability). Those commands' memory write order is not
guaranteed with regard to their order in the command stream, unless CS
stalls are inserted between them. This is problematic for 2 reasons:
1. We copy results from the device using MI commands even though
the values are generated from PIPE_CONTROL, meaning we could
copy unlanded values into the results and then copy the
availability that is inconsistent with the values.
2. We allow the user to poll on the availability values of the
query pool from the CPU. If the availability lands in memory
before the values then we could return invalid values.
This change does 2 things to address this problem:
- We use either PIPE_CONTROL or MI commands to write both
queries values and availability, so that the ordering of the
memory writes guarantees that if availability is visible,
results are also visible.
- For the occlusion & timestamp queries we apply a CS stall
before copying the results on the device, to ensure copying
with MI commands sees the correct values of previous
PIPE_CONTROL writes of availability (required by the Vulkan
spec).
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reported-by: Iago Toral Quiroga <itoral@igalia.com>
Cc: mesa-stable@lists.freedesktop.org
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
It's generally frowned upon to have more than one H1 per document in
HTML4. So let's put the text directly inside the header. This means we
can drop the flex-based centering, which makes things a bit easier. We
also need to change the padding to rem instead of em, because the em has
now changed.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We're pretty inconsistent in what the headings and titles are, especially
compared to what the articles are listed as in the sidebar. Let's
harmonize this.
There's a notable exception for meson.html, where the sidebar uses a
short-hand form that makes sense in the sidebar, but not in the article
due to the visible context being different.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
It's generally frowned upon to have multiple H1 headings in HTML4. So
let's make sure each article has a primary heading for the article, and
that that heading is the title that is used in the sidebar.
While we're at it, let's update the title in the articles to match the
title from the sidebar as well.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
It's generally frowned upon to have multiple H1 headings in HTML4. So
let's add a primary heading for the article, and source that from the
title used in the sidebar.
While we're at it, let's update the title in the article to match the
title from the sidebar as well.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We generally use title-casing for headings in the sidebar. But not
all headings were consistently cased like that. Let's make sure this
is consistent.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Acked-by: Eric Engestrom <eric.engestrom@intel.com>
There's no need to keep this short, we can just spell out "and" here.
Besides, a slash kind of implies "or", but these articles are about
both of these, not either.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
It's quite visible that there's more docs below, we don't need to spell
it out for the reader.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We're not short on space here, so there's little point in abbreviating
this. This also matches the heading in the article.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
We're not short on space here, so let's just spell out "and" instead of
using the ampersand. This is more consistent with the entry above in the
sidebar.
Signed-off-by: Erik Faye-Lund <erik.faye-lund@collabora.com>
Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
This ports commit 9e7b0988d6 from anv
to i965. Thanks to Lionel for noticing that it was missing!
Fixes: 01058a5522 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
This should happen regardless, but let's be paranoid.
Fixes: 01058a5522 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
The STATE_BASE_ADDRESS "Size" fields can only hold 0xfffff in pages,
and 0xfffff * 4096 = 4294963200, which is 1 page shy of 4GB.
So we can't use the top page.
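The arithmetic above can be checked with a short sketch. The macro and
function names are hypothetical; only the 0xfffff field limit and the
4096-byte page size come from the text:

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE      4096ull    /* page size used in the text */
#define SKETCH_MAX_SIZE_PAGES 0xfffffull /* largest "Size" field value */

/* Largest byte size a "Size" field can express. */
static uint64_t
sketch_max_state_size(void)
{
   return SKETCH_MAX_SIZE_PAGES * SKETCH_PAGE_SIZE;
}

/* Bytes of a 4GB range that the field cannot cover. */
static uint64_t
sketch_uncovered_bytes(void)
{
   return (1ull << 32) - sketch_max_state_size();
}
```

The uncovered remainder is exactly one page, which is why the top page of
the 4GB range has to stay unused.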
Fixes: 01058a5522 i965: Add virtual memory allocator infrastructure to brw_bufmgr.
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>