ci/turnip: Increase the hangcheck timer to 2 seconds.

We get a lot of useful coverage from running graphicsfuzz with spilling
enabled, but it's also pretty slow and can cause intermittent hangcheck
failures.  I thought I'd categorized all of them when merging !14839 (device
loss on reset), but apparently not, and a single flake is now more likely
to take out the whole test run, because it makes the rest of the caselist
flake too.

This is a little unfortunate in that our test environment is no longer
the same as a stock system you would want to run deqp on to submit
conformance, but I think the reduced test maintenance work is worth it
compared to needing to fix things up later.

We have some other tests besides turnip that can trigger hangchecks which
we might also like this increase for (some disabled traces, for example).
However, freedreno GL has a 5-second timeout waiting for idle when
mapping, and a couple of 2-second timeouts in a row can result in spurious
failures in other tests!
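
As a rough illustration of the timeout budget described above, the arithmetic can be sketched in shell. The helper name `periods_that_fit` is hypothetical, and the sketch only divides the two numbers quoted in this message; it does not model how long the kernel actually takes to detect and recover from a hang.

```shell
#!/bin/sh
# Hypothetical helper: how many back-to-back hangcheck periods fit inside
# a wait-for-idle timeout. Both arguments are in milliseconds.
periods_that_fit() {
    # $1 = hangcheck period, $2 = wait-for-idle timeout
    echo $(( $2 / $1 ))
}

idle_timeout_ms=5000   # freedreno GL wait-for-idle timeout cited above

# Default 500 ms period: 10 periods fit in the 5-second budget.
periods_that_fit 500 "$idle_timeout_ms"

# 2000 ms period from this change: only 2 full periods fit, so a third
# consecutive timeout would exceed the 5-second budget.
periods_that_fit 2000 "$idle_timeout_ms"
```

With the 2000 ms value, two timeouts in a row already consume 4000 ms of the 5000 ms wait, which is why the increase is limited to the turnip jobs here.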

Fixes: #6163
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15435>
Emma Anholt 2022-03-16 15:50:00 -07:00 committed by Marge Bot
parent 0cbe4dd4c4
commit f831ba238f
4 changed files with 14 additions and 0 deletions

@@ -49,6 +49,7 @@ for var in \
FDO_UPSTREAM_REPO \
FD_MESA_DEBUG \
FLAKES_CHANNEL \
FREEDRENO_HANGCHECK_MS \
GALLIUM_DRIVER \
GALLIVM_PERF \
GPU_VERSION \

@@ -9,6 +9,7 @@ cd /
mount -t proc none /proc
mount -t sysfs none /sys
mount -t debugfs none /sys/kernel/debug
mount -t devtmpfs none /dev || echo possibly already mounted
mkdir -p /dev/pts
mount -t devpts devpts /dev/pts

@@ -38,6 +38,12 @@ if [ "$HWCI_FREQ_MAX" = "true" ]; then
test -z "$GPU_AUTOSUSPEND" || echo -1 > $GPU_AUTOSUSPEND || true
fi
# Increase freedreno hangcheck timer because it's right at the edge of the
# spilling tests timing out (and some traces, too)
if [ -n "$FREEDRENO_HANGCHECK_MS" ]; then
echo $FREEDRENO_HANGCHECK_MS | tee -a /sys/kernel/debug/dri/128/hangcheck_period_ms
fi
# Start a little daemon to capture the first devcoredump we encounter. (They
# expire after 5 minutes, so we poll for them).
./capture-devcoredump.sh &
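
For illustration only, the hangcheck write in this hunk could be wrapped in a more defensive helper. This is a minimal sketch, not part of the commit: the function name `set_hangcheck_period` is hypothetical, and it assumes the knob accepts a plain decimal millisecond value, as the `echo` above suggests.

```shell
#!/bin/sh
# Hypothetical helper, not part of this commit: validate the period and
# check that the debugfs knob is writable before writing to it.
set_hangcheck_period() {
    # $1 = period in milliseconds, $2 = path to the hangcheck knob
    period_ms=$1
    knob=$2

    # Reject empty or non-numeric values.
    case "$period_ms" in
        ''|*[!0-9]*)
            echo "invalid hangcheck period: '$period_ms'" >&2
            return 1
            ;;
    esac

    # The knob only exists when debugfs is mounted and the driver exposes it.
    if [ ! -w "$knob" ]; then
        echo "hangcheck knob not writable: $knob" >&2
        return 1
    fi

    echo "$period_ms" > "$knob"
}
```

A call matching the hunk above would be `set_hangcheck_period "$FREEDRENO_HANGCHECK_MS" /sys/kernel/debug/dri/128/hangcheck_period_ms`. Unlike the `tee -a` in the commit, this sketch does not echo the value into the job log.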

@@ -22,6 +22,9 @@
variables:
DEQP_VER: vk
VK_DRIVER: freedreno
# Increase the hangcheck timer for our spilling tests which bump up against
# the .5s default.
FREEDRENO_HANGCHECK_MS: 2000
.freedreno-test-traces:
extends:
@@ -150,6 +153,9 @@ a618_vk:
BOOT_METHOD: depthcharge
KERNEL_IMAGE_TYPE: ""
RUNNER_TAG: mesa-ci-x86-64-lava-sc7180-trogdor-lazor-limozeen
# Increase the hangcheck timer for our spilling tests which bump up against
# the .5s default.
FREEDRENO_HANGCHECK_MS: 2000
a618_vk_full:
extends: