Commit Graph

92 Commits

Author SHA1 Message Date
Eric Anholt 7f9f345579 ci/lava: Return the run's results/ artifacts from the DUTs.
Finally LAVA users will be able to see deqp XMLs on failures from the
job's artifacts browser.  This replaces a couple of one-off minio uploads
in the piglit runner.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/10297>
2021-04-19 16:46:33 +00:00
Juan A. Suarez Romero 515ffe4af4 ci/broadcom: use SNMP to turn on/off devices
So far we were using a telnet-based script to communicate with the PoE
Switch to turn on/off the network ports the DUTs are connected.

But this script does not seem very reliable because from time to time the
switch fails to execute the steps in the script.

As the PoE Switch we use is a smart one with support for SNMP protocol,
it would be easier to use it to handle it, which allows to turn on/off
the ports without going through the nasty telnet steps

Acked-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Jose Maria Casanova Crespo <jmcasanova@igalia.com>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9954>
2021-04-13 13:16:26 +02:00
Christian Gmeiner 0bd01e86f3 ci/bare-metal: no need to use tee
fastboot_run.py will watch results/serial-output.txt.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9976>
2021-04-03 18:56:13 +00:00
Eric Anholt 1af7be02d7 ci/bare-metal: Move the db820c lockup detect to the right boot script.
Fixes: 2407952ec9 ("ci/bare-metal: Restart a run on intermittent kernel lockups.")
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9715>
2021-03-19 22:07:57 +00:00
Eric Anholt 2407952ec9 ci/bare-metal: Restart a run on intermittent kernel lockups.
Since enabling SMP on db820c and cranking up how many tests we run, we've
been seeing lockups like this a couple of times a week.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9655>
2021-03-17 17:13:22 +00:00
Eric Anholt f3a7a8a4dc ci/freedreno: Switch the piglit testing to the new piglit runner.
Getting piglit to fit onto our test devices was proving difficult, and we
need the ability to handle flakes, so switch to the rust piglit runner
that @pepp wrote as part of the deqp-runner repo which gives us flake
detection, sharding across boards, fractional runs, and almost half the
runtime.

It doesn't handle piglit subtests yet, but if you can't run piglit's
python on your devices because it's too bloated and unstable, this is a
way forward.

Reviewed-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9468>
2021-03-16 22:19:30 +00:00
Juan A. Suarez Romero 3f1c375581 ci/broadcom: allow custom kernels
So far, testing VC4 and V3D/V3DV requires the CI runners having access
to a Raspberry Pi 3/4 kernel, and the correspondent modules and
bootloader files. If a different kernel must be used, it means touching
the runners to provide them.

This commit adds the option to define an URL pointing to a (compressed)
tarball containing such files, without requiring dealing with the
runners. This link is provided through the `BM_BOOTFS` job variable.

The tarball must contain two directories in the root: a `/boot`
directory (containing the kernel, DTBs and bootloader files), and a
`/lib/modules` (or `/usr/lib/modules`) with the kernel modules.

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9527>
2021-03-12 11:03:17 +00:00
Christian Gmeiner 8413de5db0 ci/bare-metal: fix fastboot
Only copy results from NFS if BM_FASTBOOT_NFSROOT is set.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9482>
2021-03-09 23:09:10 +00:00
Eric Anholt bcdfee3bcd ci/freedreno: Switch the fastboot boards to using nfsroot.
This saves time in packing the rootfs, allows for larger rootfses, and
avoids the need for webdav.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9314>
2021-03-03 21:05:39 +00:00
Juan A. Suarez Romero e0b0507635 ci/broadcom: retry always when serial log timeout
So far we were retrying the testing (through device rebooting) if we did
not detect the boot sequence.

But found a couple of times that the serial log can also be "lost"
during the testing process. In all those times a manual retry of the job
was enough to complete the test.

Thus, let's apply the retry once automatically in this case.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9335>
2021-03-01 18:22:24 +00:00
Juan A. Suarez Romero e45d372968 ci/baremetal: highlight message errors
Highlight in red errors from the baremetal run, so user is more aware of
what happened.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9335>
2021-03-01 18:22:24 +00:00
Juan A. Suarez Romero e814e23f59 ci/piglit: allow parallel piglit jobs
This allows to split a piglit job in several parallel jobs, to speed up
the execution.

Due piglit restrictions, this only works for single profiles. Otherwise
an error will be shown in the runner.

Also, a new gitlab job variable `PIGLIT_TESTS` is introduced that
contains the excluded/included tests with `-x` or `-n`. The rest of the
piglit options go to `PIGLIT_OPTIONS` (like `--timeout n`).

v2 (Andres):
 - Replay profile is supported in parallel jobs.
 - Bail out inmediately if parallel jobs is tried with multiple
profiles.
 - Use testlist only when doing parallel jobs.
 - Do not drop pass tests when filtering executed tests.
 - Get rid of PIGLIT_FRACTION.

v4:
 - uncommit unrelated change (Andres).

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Andres Gomez <agomez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9022>
2021-02-24 09:41:33 +01:00
Eric Anholt 9500341847 ci/freedreno: Add a fractional gles31 run with asan enabled.
We have to disable the GLSL unit tests because with asan it runs way too
much code under qemu and times out.  Those unit tests have coverage on
x86, anyway.

I also included a vulkan run, which is disabled by default due to timeouts
that I need to sort out still.  It should be a useful tool for turnip
devs, though.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9070>
2021-02-18 00:49:00 +00:00
Andres Gomez 7393303637 ci: correct artifacts location for piglit's runner messages
We are now using pages.

v2:
  - Define a helper variable for the artifacts base URL (Juan).

Signed-off-by: Andres Gomez <agomez@igalia.com>
Acked-by: Eric Anholt <eric@anholt.net> [v1]
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9092>
2021-02-17 14:37:35 +00:00
Juan A. Suarez Romero d8236e32de ci/v3d: Add V3D and V3DV testing
Add OpenGL and Vulkan testing for V3D and V3DV respectively.

Add also a couple of manual piglit jobs for V3D.

v2:
 - Replace custom mustpass with running fraction of tests (Eric)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8745>
2021-02-15 15:47:31 +00:00
Juan A. Suarez Romero 5d0c96a8c2 ci: add option to overwrite CPU arch
When loading Vulkan ICD file, it uses the CPU machine identifier to
load the correct one, in case multiple versions are installed.

This is fine if the machine where Mesa has been built and the machine
where the test is run are exactly the same. But this is not always the
case. As example, for armhf architecture, the machine where Mesa is
built is identified as `arm7hlf`, but the Raspberry Pi 4 is identified
as `armv7l`, so it will fail to load the ICD file, though both are
totally compatible.

This allow to define the architecture instead.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8745>
2021-02-15 15:47:31 +00:00
Eric Anholt 48dd9b7e34 ci/deqp: Bump runner to 0.5.1 for recent runtime perf improvements.
3 commits in 0.5.0:

- 20-40s savings on many of our CI runs by dropping the clever test size
  scaling code.

- Even bigger savings (especially on deqp-vk runs) by increasing maximuim
  test group size (~1/4 of runtime was spawning deqp on cheza, that cost
  is cut by ~75%)

- No more needing to manually set MESA_DEBUG=silent

2 commits in 0.5.1:

- Fixed automatic thread pool sizing to keep all CPUs busy (thanks for
  catching that Bas!).

- Automatically size down test groups on short test lists and many CPUs,
  so split the list evenly between CPUs (such as on freedreno -options
  jobs).

Acked-by: Daniel Stone <daniel@fooishbar.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8787>
2021-02-07 21:42:39 -08:00
Eric Anholt 23350b5939 ci/freedreno: Do our piglit runs against Xorg.
We were using surfaceless, which misses out on some useful coverage we'd
like to have in the GLX/EGL piglit tests, but more importantly prevented
many traces from running.

Reviewed-by: Daniel Stone <daniel@fooishbar.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8727>
2021-02-01 18:32:01 +00:00
Eric Anholt 2af6b53588 ci/freedreno: Use the http cache for artifacts downloads, too.
The gitlab artifacts handling has been slow in the past as we hit
gitlab.fdo from multiple runners, and it costs fd.o egress bandwidth.  Use
the local http cache against the packet.net minio to cut that downloads
cost.

Closes: #3249
Reviewed-by: Daniel Stone <daniel@fooishbar.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8727>
2021-02-01 18:32:01 +00:00
Eric Anholt ce1bb26b06 ci/freedreno: Detect cheza HFI errors and restart the run.
These are intermittent (~1/day), seem to be around GPU faults (so
hopefully will go away once we clean up piglit's fault errors), and are
probably also related to our vintage firmware.  Until we can get new
hardware in the farm, just restart the flaked job.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8722>
2021-01-26 19:17:27 +00:00
Juan A. Suarez Romero ea88e1c820 ci/vc4: allow custom timeout values for activity
The script that monitors activity in the serial assumes that something
was wrong if it does not detect activity in 60 seconds, rebooting the
device and re-trying the test again.

While this timeout is enough for most cases, in some cases it is not
enough. For instance, when executing piglit testsuite it takes quite a
few time to generate the results after the test is done.

This allow to setup a custom timeout (`BM_POE_TIMEOUT`) in the proper
jobs.

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8702>
2021-01-26 09:45:29 +00:00
Juan A. Suarez Romero 8588fb65d6 vc4/ci: Replace expect script by python script
Replace the expect-based script to turn on/off the Raspberry Pi devices
using a python-based script.

v2:
 - Fix small nitpicks (Juan)
 - Limit line length (Andres)

v3:
 - Bump image tags (Eric, Andres)

v4:
 - Bump image tags (Eric)

Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Acked-by: Andres Gomez <agomez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8362>
2021-01-23 05:35:24 +00:00
Eric Anholt 9ef2c44ce6 ci: Add a530 and a630 piglit runs.
These bring a whole lot of new coverage to these drivers, since dEQP is
bad at desktop GL feature coverage around early GL 3.x.  piglit also gets
at a lot of MSAA, fast clearing, and texture layout issues that dEQP
doesn't do much with.

Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7370>
2021-01-02 02:43:21 +00:00
Eric Anholt cab48ba9f5 ci/bare-metal: Pass through FDO_CI_CONCURRENT on bare-metal runners.
I've set it up in the gitlab-runer config on all the freedreno boards.
This means that for piglit, where the run.sh always choose either this
variable or 4 threads otherwise, we'll have the right number of parallel
tasks.

Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7370>
2021-01-02 02:43:21 +00:00
Christian Gmeiner 6e093a4493 ci/bare-metal: pass thorugh PIGLIT env vars
Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7370>
2021-01-02 02:43:21 +00:00
Christian Gmeiner b020feb4d1 ci/fastboot: exclude either deqp or piglit
Keep the generated initramfs image as small as possible.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7370>
2021-01-02 02:43:21 +00:00
Christian Gmeiner 12647ac193 ci/bare-metal: build full piglit for baremetal ARM targets.
ARM64 had it for traces only, upgrade it to a full build so we can test
a630.  We also add it for armhf, as we'll want it on both rpi and etnaviv.

Bumped the LAVA tag as well, since the script changes a bit and it does
impact the final image (even if we aren't pulling in full piglit there
yet).  Note I also had to drop the "v" on the tarring of their rootfs, as
the verbosity on baremetal was exceeding job log size.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7370>
2021-01-02 02:43:21 +00:00
Andres Gomez 54bdec63ef ci: add piglit job to baremetal and remove tracie ones
v2:
  - Squashed the commit to remove tracie jobs (Eric).

v3:
  - Rename *-piglit-traces jobs with *-traces (Eric).

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6388>
2020-12-22 18:31:01 +00:00
Andres Gomez 654bfb0012 ci: build piglit inside baremetal and LAVA's rootfs
v2:
  - Updated the ci-fairy commit to use.

Signed-off-by: Andres Gomez <agomez@igalia.com>
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6388>
2020-12-22 18:31:01 +00:00
Eric Anholt 463dbbffa8 ci/deqp: Make sure that we pull in all board-specific xfail/skip/flake files.
When introducing/removing these files, it's easy to forget to update the
yml to point to them.  Instead of requiring the separate update, just have
the runner script pick the right one from a single per-gpu variable.

As a result, we now pick up the new deqp-lvp-skips.txt that was added but
not conected.  This also required moving some bypass flakes from the
shared a630 flakes list to a separate list, which is a feature because now
we'd notice the introduction of flakes to the gmem path.

Fixes: ab79e6b8e3 ("ci: skip failing test on lavapipe")
Acked-by: Daniel Stone <daniels@collabora.com>
Reviewed-by: Juan A. Suarez <jasuarez@igalia.com>
Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8147>
2020-12-21 18:44:43 +00:00
Eric Anholt 5fca7cd8b8 ci/freedreno: Detect the cheza power management bus error and restart.
This is an issue on the cheza platform, the theory is due to some old
firmware bug that will be fixed in future platforms.  Given that cheza was
a target that didn't get released and we expect future platforms to be
fixed, just detect the issue and restart.

I've noticed this error in my CI monitoring less than once a week.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7993>
2020-12-08 23:31:17 +00:00
Juan A. Suarez Romero 72b68bd2a6 ci: add testing for VC4 drivers (Raspberry Pi 3)
This tests OpenGL ES 2.0 CTS suite with VC4 drivers, through baremetal
Raspberry Pi 3 devices.

The devices are connected to a switch that supports Power over Ethernet
(PoE), so the devices can be started/stopped through the switch, and
also to a host that runs the GitLab runner through serial-to-USB cables,
to monitor the devices to know when the testing finishes.

The Raspberries uses a network boot, using NFS and TFTP. For the root
filesystem, they use the one created in the armhf container. For the
kernel/modules case, this is handled externally. Currently it is using
the same kernel/modules that come with the Raspberry Pi OS. In future we
could build them in the same armhf container.

At this moment we only test armhf architecture, as this is the default
one suggested by the Raspberry Pi Foundation. In future we could also
add testing for arm64 architecture.

Finally, for the very rare ocassions where the Raspberry Pi 3 device is
booted but no data is received, it retries the testing for a second
time, powering off and on the device in the process.

v2:
 - Remove commit that exists capture devcoredump (Eric)
 - Squash remaining commits in one (Andres)

v3:
 - Add missing boot timeout check (Juan)

v4:
 - Use locks when running the PoE on/off script (Eric)
 - Use a timeout for serial read (Eric)

v5:
 - Rename stage to "raspberrypi" (Eric)
 - Bump up arm64_test tag (Eric)

v6:
 - Make serial buffer timeout optional (Juan)

Reviewed-by: Eric Anholt <eric@anholt.net>
Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7628>
2020-12-04 19:05:19 +01:00
Eric Anholt 6bc35c00e2 ci/deqp: Allow specifying the caselist fraction separate from CI_NODE_INDEX.
To increase our VK coverage on a630, we want to have two jobs in parallel,
but we still can't hit full coverage so we need the fractional setting to
be separate from gitlab CI's flags for setting up parallel jobs.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6971>
2020-11-11 17:22:47 +00:00
Eric Anholt 2998a0b055 ci/freedreno: Group the short a630 dEQP runs into one test job.
This saves the minute and a half boot time on each of these minute-or-less
test jobs.  The whole job was 3.5 minutes in my last run.

Acked-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6971>
2020-11-11 17:22:47 +00:00
Eric Anholt bf29daa1b5 ci/deqp: Switch to a new dEQP runner written in Rust.
I found the C++ runner hard to develop on, and we had stability issues and
outstanding feature needs that made me want something I felt good about
hacking on.  Thus, Rewrite It In Rust of the deqp runner.

The new runner includes:

- Skip lists don't reshuffle the test list.
- Known-flake handling without resorting to skip lists (fixing our main CI
  reliability issue on a3xx right now).
- Per-thread Vulkan shader caches should speed up VK CI runtime.
- Tracking of crashes separate from fails (so we can see progress on that
  front).
- Logging of deqp stderr spam (particularly assertion failures!) in the CI
  log.
- Integrated QPA filtering so we don't have bash perf issues for it.
- Logging of what caselist to go look at for a given error report (in red,
  so it's easier to find in your CI log).
- The code is 1/3 unit tests, and easy to extend for more coverage.
- Non-LAVA CI runs create a failures.csv in artifacts that you can check
  in as your deqp-*-fails.txt file.
- Test runtime is included in results.csv so you can debug how to speed up
  your CI job.
- Pretty summary at the end of the run of slow/flaky/failed tests.

Since this is a new runner with a different RNG, the test groups are
shuffled one more time.  This seems to result in some panfrost T720
stability issues (See its new deqp-panfrost-t720-flakes.txt), and one new
flake in freedreno a630.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7434>
2020-11-06 19:48:39 +00:00
Eric Anholt fe61230b38 ci/bare-metal: Reset colors at the end of a line of serial output.
We don't want the next line of our timestamp and other context to inherit
colors set by the serial command (visible with the new dEQP runner)

Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7434>
2020-11-06 19:48:39 +00:00
Eric Anholt ff6741728d ci/bare-metal: Apply autopep8 to the bare-metal scripts.
Let's follow proper python formatting (easy now that vscode does it for
me)

Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7434>
2020-11-06 19:48:39 +00:00
Christian Gmeiner 915f2919f6 ci/bare-metal: suppress 'No such file or directory'
It fills the serial log with unimportant messages.

Signed-off-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Reviewed-by: Eric Anholt <eric@anholt.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7347>
2020-10-28 18:04:22 +00:00
Eric Anholt e3c7748b2e ci/bare-metal: Move the "POWER_GOOD not seen in time" check to the right time.
The poweron failure happens before we get to the bootloader
("load_archive: loading locale_en.bin") not after we're trying to boot the
kernel and we're waiting for the deqp run to complete.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6970>
2020-10-02 02:41:37 +00:00
Rob Clark aee1c08c06 ci/deqp-runner: Allow overriding width/height/config
This will allow adding multi-sample caselists, and jobs with larger
surface size.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6553>
2020-09-29 19:58:50 +00:00
Eric Anholt 0f61f0142a ci/bare-metal: Allow wget of the kernel/dtb for kernel development.
It's useful for kernel dev to be able throw all of our testing
infrastructure at a risky kernel change, but it's expensive (time and
bandwidth) to roll new containers every time your rev your kernel.  Make
it so you can just point the env vars to your personal build you've
uploaded.

Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
Reviewed-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6592>
2020-09-09 17:25:38 +00:00
Eric Anholt fd2ee49b21 ci/bare-metal: Use python for handling fastboot booting and parsing
Modeling after what I did for cros_servo_run.py, this gives us easy
support for restarting the test run a530 when we detect a spontaneous
reboot.  I had to touch up serial_buffer.py to handle buffering in from a
file instead of a serial device, to support the upcoming etnaviv CI
(tested by running it against a serial log from db410c and seeing it step
to calling "fastboot")

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>
2020-09-03 23:22:44 +00:00
Eric Anholt 0453a46f66 ci/bare-metal: Fix capturing of serial output as job artifacts.
I tried to put them in the wrong directory -- everything needs to go in
results/, which we want clean and ready before we start our job.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>
2020-09-03 23:22:43 +00:00
Eric Anholt 24f5f11719 ci/bare-metal: Log why our run restarts when it does.
It would be confusing to see a job quietly restart itself in the middle.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>
2020-09-03 23:22:43 +00:00
Eric Anholt 785d3cace4 ci/bare-metal: Include a timestamp in our serial reads.
gitlab CI doesn't include timestamps in its logs by default, but it's
really useful for finding delays in our CI so stuff one in on the lines
coming in from serial and being output to the gitlab log.  The artifacts
file is still the raw serial output.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>
2020-09-03 23:22:42 +00:00
Eric Anholt ff42b7e804 ci/bare-metal: Fix detection of "POWER_GOOD not seen in time" fails
We were only reading from the CPU serial, not EC, so we'd never notice
these sources of job timeouts.  I couldn't find a cleaner solution, so I
spawned two threads to do the blocking reads from our serial line fifos
and merge them together in a single queue to read.

Closes: #3470
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>
2020-09-03 23:22:41 +00:00
Eric Anholt b7787ce18d ci/bare-metal: Use re.search() instead re.match() for our line matching.
match() looks for the start of the line to match our regex, while search
just looks for the regex anywhere in the line.  I messed this up when
converting our greps in shell to python, which was part of breaking the
POWER_GOOD flake detection.  Most of our matches worked, but let's
consistently use this one so we don't mess this up in the future.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>
2020-09-03 23:22:40 +00:00
Eric Anholt 2da1178bf3 ci/bare-metal: Try rebooting chezas again if they get stuck during tftp.
Occasionally something goes weird in the network and a group of chezas
will produce streams of these errors during the tftp process, eventually
timing out after 60 minutes in the job.  By the time we notice, the next
jobs seem to go through fine, so watch for them and try rebooting the
cheza to see if that gets our jobs to pass again.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>
2020-08-21 20:10:18 +00:00
Eric Anholt c27075e9e1 ci/bare-metal: Retry booting chezas instead of failing when !POWER_GOOD
If we get this error, we can just try rebooting again and see if it comes
up then.  The POWER_GOOD failures are clustered in time, but it's better
to retry a few times in a row in one job (which has its own 60min timeout)
than to spuriously fail someone's pipeline.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>
2020-08-21 20:10:18 +00:00
Eric Anholt c63648121e ci/bare-metal: Convert the main cros-servo boot code to python
Switching this part to python makes the code clearer and cleans up our
logs as well.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>
2020-08-21 20:10:18 +00:00