LAVA doesn't consider failure to download a kernel/initramfs as an
infrastructure error, rather just a user error for supplying a broken
URL. We know our URLs aren't broken (because we're perfect), so assume
that failures in download validation are network issues and retry when
we hit them.
LAVA itself has been fixed to retry internally, so we'll get that when
upgrade in a couple of weeks, but gloss over it for now.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11218>
Each step in a LAVA job returns separate results; a successfully-retired
job can have about 12 entries. Make sure we iterate through all of them
when we're looking for infrastructure errors.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11218>
We added lava_job_submitter.py to improve our robusteness, but
the final result reporting was not handled correctly by the script.
This change fix it by properly calling sys.exit() on failures.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11218>
Covert the job submission process to a python script for more
robustness and control. allowing easier manipulation of job data.
As a result, it adds retry logic to deal with Infrastructure Errors in LAVA.
_call_proxy() is equipped with a robust retry logic, which I have been
using already in the past few weeks in stress testing to run hundreds
of jobs.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11079>