This got lost in the move away from hardcoded environment variables, and
fixes the Iris EGL tests.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Acked-by: Martin Peres <martin.peres@mupuf.org>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11337>
This should probably be set in the trace-job environments, but the
inheritance is a bit of a mess between all the systems at the moment,
and this matches previous hardcoded behaviour at least.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Acked-by: Martin Peres <martin.peres@mupuf.org>
Acked-by: Emma Anholt <emma@anholt.net>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11337>
The test involves timestamping to figure out how long a swap actually
takes, but if anything ends up rescheduling the process you can end up
spuriously failing. I could easily reproduce flakiness by just running a
loop accessing the filesystem in parallel with a loop running the test.
So, it's certainly not usable on a CI system with other piglit tests
running in parallel, and we don't want to run it if it's going to just
produce flake noise.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11333>
This will also give us a central place to handle known CI issues for
piglit.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11333>
The flakiness of this test is due to CI running deqp in parallel, rather
than exposing any underlying driver issue. Just skip it in CI until we
come up with a reasonable way to handle tests to be run in isolation
during a deqp-runner run (likely as part of
https://gitlab.freedesktop.org/anholt/deqp-runner/-/issues/7).
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11333>
They're too slow to run in CI even on non-tiled renderers, they don't
block conformance (unless you crash), and provide unreliable warning
results unless you isolate them from other activity on the system.
This means that the following jobs now skip these tests:
- deqp-iris-*
- deqp-llvmpipe (you know, the one mentioned in the comment!)
- deqp-virgl-gl
- deqp-zink-lvp
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11333>
The mustpass doesn't have any tests matching these, so no need to
skip. These tests only show up if you run without using a mustpass list.
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11333>
When adding crosvm to the x86_test-gl building deqp-runner was also
mistakenly introduced. deqp-runner is already included in the
x86_test-base image.
Additionally, when bumping the deqp-runner version, only the
x86_test-gl tag was updated.
Now, we remove the unnecessary build from x86_test-gl and bump the tag
for the x86_test-base image.
v2:
- Bump x86_test-gl, not x86_test-vk (Tomeu).
v3: add in fixes for duplicated lines in lvp xfails (Anholt)
Fixes: dc9cd18f52 ("ci: Build Crosvm in our container")
Fixes: 53826932db ("ci: Update piglit and deqp/piglit-runner.")
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Corentin Noël <corentin.noel@collabora.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11359>
The screensaver kicks in at 10 minutes and obscures the screen,
independent of dpms. This causes piglit tests to get flaky (swaps start
taking a whole second, and swapbuffersmsc-divisor-zero times out at
exactly the wrong time) and slow if the run takes longer than 10 minutes.
Hopefully with this we'll see some piglit glx flakes go away forever, it
did seem to for this test locally.
Reviewed-by: Adam Jackson <ajax@redhat.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11334>
The a530s will occasionally fail to make it to the fastboot prompt,
with no other deltas between a working log and a log stalled waiting
for that line to show up.
So, add a serial timeout (like the rpi boards do for similar reasons),
and on timeout restart the run. We actually restart the whole serial
watching process, because the SerialBuffer finishes itself on timeout.
This should also help with the intermittent issue we've had where a
power cycle causes the python serial module to throw an exception.
Tested with the gitlab-disabled db820c that never makes it to the
fastboot prompt (I think it's one where we need a longer micro cable
to connect it!) and saw successful boot looping to retry.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11308>
I was today years old when I learned this about classic composable UNIX
tools:
~/mesa/mesa lava-submitter-overlay * % bash
[daniels@strictly mesa]$ set -e
[daniels@strictly mesa]$ false | tee
[daniels@strictly mesa]$ echo $?
0
Use tail rather than tee, so it doesn't hide our exit status.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
Frequency of writes is unlikely to be a performance bottleneck, and
given the number of steps in between execution and the user, less
buffering is gooder.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
Truth is relative in 2021, and Python's duck-typing means truthiness
isn't what you think it is. Use an explicit fatal-error handler to make
sure we crash out hard on failure, rather than hoping sys.exit() behaves
like you think it does, because it doesn't.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
Trying to get arbitrary strings suitably quoted for shell, embedded in a
YAML file, processed by Python templating, is like seven bad ideas all
embedded into one big can of bees.
Reuse the same script we use for bare-metal to generate the environment,
tar that up into a per-job overlay which is added to the
inter-pipeline-reusable rootfs built by the container jobs and the
intra-pipeline-reusable overlay built by the build jobs.
@anholt wrote a chunk of this - replacing the $ENV_VARS GitLab CI
variable with a Python loop across the POSIX job environment - in
!11192, but this still had YAML quoting nightmares, and was more
needless duplication between LAVA and bare-metal.
The diff is large and annoying, but is mostly a sed job to get
ENV_VARS="FOO=bar BAZ=quux" into FOO: bar\nBAZ: quux.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Co-authored-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
Used for both LAVA (uploading results to MinIO because we don't yet have
non-ephemeral NFS storage) and Piglit (for the Tracie dashboard).
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
Our variable names haven't aged very well. Rename them to make them more
clear and straightforward, especially when we bring in a third rootfs
element to download.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
Split our init up into: base system setup (filesystem mounts, network),
pulling the build artifacts, environment common to us and bare-metal,
bespoke environment, and finally running the tests.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
As the JWT is sensitive, we don't want to record or leak it anywhere.
Doing this lets us run --dump-yaml in normal execution so we can
artifact the result, as well as bringing us into line with bare-metal.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11309>
This brings in some major new features in the runner:
- piglit tests now include subtest reporting
- "-t" support for quick include-filtering of tests.
- piglit tests that crash after their result report are considered crashes.
- throws a nice error if you try to annotate the same failure twice
(e.g. lvp's dEQP-VK.glsl.builtin.precision.pow.highp.vec2,Fail)
Since the runner catches piglit test bugs where the same subtest is run
twice, we also uprev piglit to pull in the fixes for those.
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11283>
We no longer name the template by the test suite being run.
Fixes: 93ec399b28 ("ci: Use a single template for LAVA jobs")
Acked-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11293>
LAVA doesn't consider failure to download a kernel/initramfs as an
infrastructure error, rather just a user error for supplying a broken
URL. We know our URLs aren't broken (because we're perfect), so assume
that failures in download validation are network issues and retry when
we hit them.
LAVA itself has been fixed to retry internally, so we'll get that when
upgrade in a couple of weeks, but gloss over it for now.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11218>
Each step in a LAVA job returns separate results; a successfully-retired
job can have about 12 entries. Make sure we iterate through all of them
when we're looking for infrastructure errors.
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11218>
We added lava_job_submitter.py to improve our robusteness, but
the final result reporting was not handled correctly by the script.
This change fix it by properly calling sys.exit() on failures.
Signed-off-by: Gustavo Padovan <gustavo.padovan@collabora.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11218>
And also add the required bits to the x86_64 kernel.
syslogd is needed by Crosvm.
iptables is needed to route packets in and out the VM.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10862>
hang-detection is a vulkan-based lightweight wrapper from
parallel-deqp-runner that periodically submits empty command buffers
and waits for their completions. If the completion never happens, the
GPU is considered hung, the wrapped script is killed, and the job
should get aborted.
This should have no negative impact on the runtime of dEQP/traces/...,
but will allow saving time when the GPU gets hung as we can abort the
job immediately rather than waiting for the timeout.
In the case of B2C, we are using this tool's error message as a way to
trigger the reboot of the test machine and start again.
v2:
- Use hang-detection already with some jobs (Martin).
Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11087>
When we remove the contents of the results directory, we `cd` into it.
The script expects that $PWD is /piglit, and $OLDPWD is the Mesa build
directory, however the cd into the results directory will make $OLDPWD
be $BUILDDIR/results.
This means that Piglit emits into results/results/ which looks weird,
but more importantly also fails OpenCL Piglit execution, because we
can't find our baseline result expectations.
Fix it by using an explicit variable rather than relying on history.
Fixes: 683ddf19dc ("ci: remove results directory content only with piglit runners")
Ref: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10856
Signed-off-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Martin Peres <martin.peres@mupuf.org>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11126>
So we don't need to provision aarch64 servers, which are these days
rarer than x8_64.
In the switch to the new runner tags, switch to one which contains the
device type, so we can dimension the runner jobs taking into account the
number of DUTs available.
Signed-off-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Emma Anholt <emma@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11108>
Following the rest of our channels, move CI reporting over. Seems to
still work fine. This affects freedreno and iris.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11020>
Now, flakes that aren't in the *-flakes.txt get a "NEW" in their report so
I can watch for them.
The bash was unwieldy and made debugging hard, so I switched to python.
Reviewed-by: Alyssa Rosenzweig <alyssa.rosenzweig@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11020>