ci/lava: Trap init-stage2.sh background processes

Any daemon executed in init-stage2.sh may interfere with LAVA signals,
since any threaded output to console may clutter the signals, which are
based on the log output.
E.g: This job
https://gitlab.freedesktop.org/gallo/mesa/-/jobs/20779120#L2102
has failed because capture-devcoredump.sh was alive and emitting kernel
messages to the console during the LAVA signal handling, mangling the
output and making the LAVA to fail to check the results of the job.

Another problem is that CONFIG_DEBUG_STACK_USAGE Kconfig is enabled.
This causes process exit to dump a `RESULT=[  246.756067] lava-test-case
(156) used greatest stack depth: ... bytes left` kernel message to the
logs corrupting LAVA signal message. Empirically, it happens one in
every 280 jobs. To solve that, compose the lava-test-case custom script
with a short sleep to give time for kernel to dump the message clearly
and a exit command to keep the return code from init-stage2.sh script.

Signed-off-by: Guilherme Gallo <guilherme.gallo@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/15938>
This commit is contained in:
Guilherme Gallo 2022-04-07 16:01:27 -03:00 committed by Marge Bot
parent 09236d9607
commit e9aef19e2b
2 changed files with 30 additions and 2 deletions

View File

@ -1,5 +1,31 @@
#!/bin/sh
# Make sure to kill itself and all the children process from this script on
# exiting, since any console output may interfere with LAVA signals handling,
# which based on the log console.
cleanup() {
set +x
echo "Killing all child processes"
for pid in $BACKGROUND_PIDS
do
kill "$pid"
done
# Sleep just a little to give enough time for subprocesses to be gracefully
# killed. Then apply a SIGKILL if necessary.
sleep 5
for pid in $BACKGROUND_PIDS
do
kill -9 "$pid" 2>/dev/null || true
done
}
trap cleanup INT TERM EXIT
# Space separated values with the PIDS of the processes started in the
# background by this script
BACKGROUND_PIDS=
# Second-stage init, used to set up devices and our job environment before
# running tests.
@ -75,7 +101,8 @@ fi
# Start a little daemon to capture the first devcoredump we encounter. (They
# expire after 5 minutes, so we poll for them).
./capture-devcoredump.sh &
/capture-devcoredump.sh &
BACKGROUND_PIDS="$! $BACKGROUND_PIDS"
# If we want Xorg to be running for the test, then we start it up before the
# HWCI_TEST_SCRIPT because we need to use xinit to start X (otherwise
@ -85,6 +112,7 @@ if [ -n "$HWCI_START_XORG" ]; then
echo "touch /xorg-started; sleep 100000" > /xorg-script
env \
xinit /bin/sh /xorg-script -- /usr/bin/Xorg -noreset -s 0 -dpms -logfile /Xorg.0.log &
BACKGROUND_PIDS="$! $BACKGROUND_PIDS"
# Wait for xorg to be ready for connections.
for i in 1 2 3 4 5; do

View File

@ -47,4 +47,4 @@ artifacts/lava/lava_job_submitter.py \
--visibility-group ${VISIBILITY_GROUP} \
--lava-tags "${LAVA_TAGS}" \
--mesa-job-name "$CI_JOB_NAME" \
>> results/lava.log
>> results/lava.log