13 Commits

Author SHA1 Message Date
Emma Anholt cde8c92ab6 ci/bare-metal: Add timeouts to the shell commands called in fastboot.
It seems that we sometimes stall out executing "fastboot boot", and if
that happens we want to reboot the board and try again.

Fixes: #6682
Acked-by: David Heidelberg <david.heidelberg@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17607>
2022-07-19 21:05:07 +00:00
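
The idea in this commit (bound a shell command that may stall, then retry) can be sketched with the standard library. This is a minimal illustration, not the code from the merge request: `run_with_timeout`, `run_with_retries`, and the `on_stall` hook are hypothetical names, and the timeout values are arbitrary.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_s):
    """Run argv; return its exit code, or None if it exceeded timeout_s seconds."""
    try:
        # subprocess.run kills the child and raises TimeoutExpired on timeout.
        return subprocess.run(argv, timeout=timeout_s).returncode
    except subprocess.TimeoutExpired:
        return None


def run_with_retries(argv, timeout_s, attempts=3, on_stall=lambda: None):
    """Retry argv until it exits 0; call on_stall() after each timeout."""
    for _ in range(attempts):
        rc = run_with_timeout(argv, timeout_s)
        if rc == 0:
            return True
        if rc is None:
            # Hypothetical hook: in a CI setup this is where the board
            # would be rebooted before the next attempt.
            on_stall()
    return False


if __name__ == "__main__":
    # Stand-in for a stalling command; a real job would pass its own argv.
    stalled = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_retries(stalled, timeout_s=1, attempts=2))
```

Returning `None` for a timeout keeps a stall distinguishable from a command that ran and failed, so the caller can decide to reboot only in the stalled case.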
Emma Anholt 5f09b1ebe9 ci/bare-metal: Add test phase timeouts to all boards.
This should help with "marge got stuck for an hour and all I got was this
failed job with no results/" when a system intermittently wedges.

This replaces the BM_POE_TIMEOUT ("did we get something on serial in the
last 3 minutes?") that rpi had, in favor of checking that the whole test
job gets through in 20 minutes.

Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
2022-06-21 21:38:25 +00:00
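
The difference between a per-line serial timeout and a whole-phase timeout can be sketched as follows. This is an assumption-laden illustration rather than the merged code: `wait_for_result` and the `RESULT: pass` / `RESULT: fail` markers are hypothetical, and the lines are assumed to arrive on a `queue.Queue` fed by some reader.

```python
import queue
import time


def wait_for_result(lines, phase_timeout_s, clock=time.monotonic):
    """Wait for a result marker on the queue `lines`.

    Returns "pass", "fail", or "timeout". The deadline covers the whole
    phase, so a board that keeps printing but never finishes still times out.
    """
    deadline = clock() + phase_timeout_s
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return "timeout"
        try:
            line = lines.get(timeout=remaining)
        except queue.Empty:
            return "timeout"
        # Hypothetical result markers, for illustration only.
        if "RESULT: pass" in line:
            return "pass"
        if "RESULT: fail" in line:
            return "fail"


if __name__ == "__main__":
    q = queue.Queue()
    for text in ["booting", "still running tests"]:
        q.put(text)
    print(wait_for_result(q, phase_timeout_s=0.2))
```

A per-line timeout would reset on every line of output; the single deadline here does not, which matches the stated goal of bounding the whole test job.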
Emma Anholt ca453714aa ci/bare-metal: Add per-boot-stage timeouts for fastboot and poe.
This should avoid the 1-hour timeouts if something goes wrong, and just
restart.

Fixes: #6682
Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
2022-06-21 21:38:25 +00:00
Emma Anholt 1e15ec1949 ci/bare-metal: Apply autopep8 to our python scripts.
My editor likes to pep8 as I edit, and I'm tired of carefully not
committing those changes.

Acked-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>
2022-06-21 21:38:25 +00:00
Ilia Mirkin 268fc8e5c1 gitlab-ci: detect a3xx gpu hang recovery failure
But don't bail immediately, instead print out some more lines after the
hang, hopefully catching info about the cause of the hang.

Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14033>
2021-12-03 23:26:27 +00:00
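
The "detect, but keep printing a few more lines" behavior described here can be sketched as a small log scanner. The function name `scan_for_hang`, the pattern, and the number of trailing lines are all hypothetical; the actual kernel message matched by the commit is not given in this log.

```python
import re


def scan_for_hang(lines, hang_re, trailing=2):
    """Scan log lines for a hang marker.

    Returns (hang_detected, lines_seen). After the first match, up to
    `trailing` further lines are collected before stopping, so context
    printed after the hang is not lost.
    """
    seen = []
    remaining = None  # None until the hang marker has been seen
    for line in lines:
        seen.append(line)
        if remaining is not None:
            remaining -= 1
        elif hang_re.search(line):
            remaining = trailing
        if remaining == 0:
            return True, seen
    return remaining is not None, seen


if __name__ == "__main__":
    # Hypothetical marker text, for illustration only.
    pattern = re.compile(r"gpu hang")
    log = ["test 1 ok", "gpu hang detected", "dump A", "dump B", "test 2 ok"]
    print(scan_for_hang(log, pattern))
```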
Emma Anholt 8f5a0bd9b4 ci/bare-metal: Close serial and join serial threads before exit.
This should fix the intermittent (~1/week) cheza failure where python
complains that a thread tried to do stdio while the main thread has
exited.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13462>
2021-11-10 20:36:57 +00:00
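
A minimal sketch of the shutdown ordering this commit describes: stop the reader, close its source, and join the thread before the main thread exits. `LineReader` is a hypothetical class, and `io.StringIO` stands in for a serial device; the real scripts use their own serial buffer class.

```python
import io
import queue
import threading


class LineReader:
    """Read lines from `source` on a background thread."""

    def __init__(self, source):
        self.source = source  # any object with readline() and close()
        self.lines = queue.Queue()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        while not self._stop.is_set():
            try:
                line = self.source.readline()
            except (ValueError, OSError):
                break  # source was closed underneath us
            if not line:
                break  # end of input
            self.lines.put(line)

    def close(self, join_timeout_s=5):
        """Stop reading, close the source, and wait for the thread.

        Returns True if the thread has finished, so the caller knows no
        background I/O can happen after this point.
        """
        self._stop.set()
        self.source.close()
        self._thread.join(timeout=join_timeout_s)
        return not self._thread.is_alive()


if __name__ == "__main__":
    reader = LineReader(io.StringIO("boot ok\n"))
    print(reader.lines.get(timeout=5), end="")
    print(reader.close())
```

Joining before exit is what prevents the reader thread from touching stdio after the interpreter has started shutting down.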
Emma Anholt 306a039472 ci/baremetal: Retry if our network device spontaneously fails.
Seen in https://gitlab.freedesktop.org/mesa/mesa/-/jobs/13824132.  It's
unlikely that graphics would kill the network, so just assume it's not our
fault and keep going.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12939>
2021-09-20 19:55:55 +00:00
Daniel Stone 5f32d2a438 ci: Consistent pass/fail result output
One less point of differentiation.

Signed-off-by: Daniel Stone <daniels@collabora.com>
Acked-by: Martin Peres <martin.peres@mupuf.org>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11337>
2021-06-15 14:02:44 +02:00
Emma Anholt 6cfd1298e1 ci/fastboot: Consistently restart the run on intermittent conditions.
Not currently on my list of intermittent issues, but let's be
resilient hopefully.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11308>
2021-06-11 20:24:55 +00:00
Emma Anholt fe70badfc3 ci/fastboot: Add a serial timeout to catch fastboot prompt failure.
The a530s will occasionally fail to make it to the fastboot prompt,
with no other deltas between a working log and a log stalled waiting
for that line to show up.

So, add a serial timeout (like the rpi boards do for similar reasons),
and on timeout restart the run.  We actually restart the whole serial
watching process, because the SerialBuffer finishes itself on timeout.
This should also help with the intermittent issue we've had where a
power cycle causes the python serial module to throw an exception.

Tested with the gitlab-disabled db820c that never makes it to the
fastboot prompt (I think it's one where we need a longer micro cable
to connect it!) and saw successful boot looping to retry.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11308>
2021-06-11 20:24:55 +00:00
Eric Anholt 1af7be02d7 ci/bare-metal: Move the db820c lockup detect to the right boot script.
Fixes: 2407952ec9 ("ci/bare-metal: Restart a run on intermittent kernel lockups.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9715>
2021-03-19 22:07:57 +00:00
Juan A. Suarez Romero e45d372968 ci/baremetal: highlight message errors
Highlight errors from the baremetal run in red, so the user is more aware
of what happened.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9335>
2021-03-01 18:22:24 +00:00
Eric Anholt fd2ee49b21 ci/bare-metal: Use python for handling fastboot booting and parsing
Modeled on what I did for cros_servo_run.py, this gives us easy
support for restarting the test run on a530 when we detect a spontaneous
reboot.  I had to touch up serial_buffer.py to handle buffering in from a
file instead of a serial device, to support the upcoming etnaviv CI
(tested by running it against a serial log from db410c and seeing it step
to calling "fastboot").

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>
2020-09-03 23:22:44 +00:00