mirrors/mesa - Frog Git

Commit Graph

Author	SHA1	Message	Date
Emma Anholt	42a52a8be1	ci/bare-metal: Re-open serial and everything after test phase timeout. If we got a "Reached the end of the CPU serial log without finding a result" because the test phase timed out, then the CPU serial would have been closed as part of the timeout process, so we need to close the rest and re-instantiate the servo run class. fastboot and poe already re-instantiate the class on retry. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17689>	2022-08-04 02:48:26 +00:00
Emma Anholt	5f09b1ebe9	ci/bare-metal: Add test phase timeouts to all boards. This should help with "marge got stuck for an hour and all I got was this failed job with no results/" when a system intermittently wedges. This replaces the BM_POE_TIMEOUT ("did we get something on serial in the last 3 minutes?") that rpi had, in favor of checking that the whole test job gets through in 20 minutes. Acked-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>	2022-06-21 21:38:25 +00:00
Emma Anholt	cd3d9a7a92	ci/bare-metal: Add handling of netboot firmwares for servo boards. My local trogdor has a netboot firmware and I want to be able to use it to test the timeout code I'm working on. Acked-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>	2022-06-21 21:38:25 +00:00
Emma Anholt	3f8114d1e0	ci/bare-metal: Get rid of servo's serial feed threads. If the SerialBuffers can just feed the same line queue, then we don't need the extra threads reading line queues into a new merged line queue. Less python threading code is always better. Plus, now we can pass args to SerialBuffer.lines() for timeout/phase. Acked-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>	2022-06-21 21:38:25 +00:00
Emma Anholt	1e15ec1949	ci/bare-metal: Apply autopep8 to our python scripts. My editor likes to pep8 as I edit, and I'm tired of carefully not committing those changes. Acked-by: Juan A. Suarez <jasuarez@igalia.com> Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/17096>	2022-06-21 21:38:25 +00:00
Emma Anholt	d633eace3f	ci/freedreno: Try to detect a wedged MMU that's happened recently. Possibly since the VK-GL-CTS 1.3.1.0 uprev. It doesn't seem to recover, like it says. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/14945>	2022-02-10 01:13:31 +00:00
Emma Anholt	8f5a0bd9b4	ci/bare-metal: Close serial and join serial threads before exit. This should fix the intermittent (~1/week) cheza failure where python complains that a thread tried to do stdio while the main thread has exited. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13462>	2021-11-10 20:36:57 +00:00
Emma Anholt	b86da01c54	ci/freedreno: Restart the run if cheza spontaneously reboots. Occasionally (once every couple weeks?) a cheza reboots mid run, around a GPU fault. Detect that and do an internal retry instead of failing out the job. Closes: #5388 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/13181>	2021-10-04 22:15:27 +00:00
Emma Anholt	306a039472	ci/baremetal: Retry if our network device spontaneously fails. Seen in https://gitlab.freedesktop.org/mesa/mesa/-/jobs/13824132. It's unlikely that graphics would kill the network, so just assume it's not our fault and keep going. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/12939>	2021-09-20 19:55:55 +00:00
Daniel Stone	5f32d2a438	ci: Consistent pass/fail result output. One less point of differentiation. Signed-off-by: Daniel Stone <daniels@collabora.com> Acked-by: Martin Peres <martin.peres@mupuf.org> Acked-by: Emma Anholt <emma@anholt.net> Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11337>	2021-06-15 14:02:44 +02:00
Eric Anholt	1af7be02d7	ci/bare-metal: Move the db820c lockup detect to the right boot script. Fixes: `2407952ec9` ("ci/bare-metal: Restart a run on intermittent kernel lockups.") Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9715>	2021-03-19 22:07:57 +00:00
Eric Anholt	2407952ec9	ci/bare-metal: Restart a run on intermittent kernel lockups. Since enabling SMP on db820c and cranking up how many tests we run, we've been seeing lockups like this a couple of times a week. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9655>	2021-03-17 17:13:22 +00:00
Juan A. Suarez Romero	e45d372968	ci/baremetal: highlight message errors. Highlight in red errors from the baremetal run, so user is more aware of what happened. Signed-off-by: Juan A. Suarez Romero <jasuarez@igalia.com> Reviewed-by: Eric Anholt <eric@anholt.net> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/9335>	2021-03-01 18:22:24 +00:00
Eric Anholt	ce1bb26b06	ci/freedreno: Detect cheza HFI errors and restart the run. These are intermittent (~1/day), seem to be around GPU faults (so hopefully will go away once we clean up piglit's fault errors), and are probably also related to our vintage firmware. Until we can get new hardware in the farm, just restart the flaked job. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8722>	2021-01-26 19:17:27 +00:00
Eric Anholt	5fca7cd8b8	ci/freedreno: Detect the cheza power management bus error and restart. This is an issue on the cheza platform, the theory is due to some old firmware bug that will be fixed in future platforms. Given that cheza was a target that didn't get released and we expect future platforms to be fixed, just detect the issue and restart. I've noticed this error in my CI monitoring less than once a week. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7993>	2020-12-08 23:31:17 +00:00
Eric Anholt	ff6741728d	ci/bare-metal: Apply autopep8 to the bare-metal scripts. Let's follow proper python formatting (easy now that vscode does it for me) Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com> Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/7434>	2020-11-06 19:48:39 +00:00
Eric Anholt	e3c7748b2e	ci/bare-metal: Move the "POWER_GOOD not seen in time" check to the right time. The poweron failure happens before we get to the bootloader ("load_archive: loading locale_en.bin") not after we're trying to boot the kernel and we're waiting for the deqp run to complete. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6970>	2020-10-02 02:41:37 +00:00
Eric Anholt	0453a46f66	ci/bare-metal: Fix capturing of serial output as job artifacts. I tried to put them in the wrong directory -- everything needs to go in results/, which we want clean and ready before we start our job. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>	2020-09-03 23:22:43 +00:00
Eric Anholt	24f5f11719	ci/bare-metal: Log why our run restarts when it does. It would be confusing to see a job quietly restart itself in the middle. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>	2020-09-03 23:22:43 +00:00
Eric Anholt	ff42b7e804	ci/bare-metal: Fix detection of "POWER_GOOD not seen in time" fails. We were only reading from the CPU serial, not EC, so we'd never notice these sources of job timeouts. I couldn't find a cleaner solution, so I spawned two threads to do the blocking reads from our serial line fifos and merge them together in a single queue to read. Closes: #3470 Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>	2020-09-03 23:22:41 +00:00
Eric Anholt	b7787ce18d	ci/bare-metal: Use re.search() instead of re.match() for our line matching. match() looks for the start of the line to match our regex, while search just looks for the regex anywhere in the line. I messed this up when converting our greps in shell to python, which was part of breaking the POWER_GOOD flake detection. Most of our matches worked, but let's consistently use this one so we don't mess this up in the future. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6529>	2020-09-03 23:22:40 +00:00
Eric Anholt	2da1178bf3	ci/bare-metal: Try rebooting chezas again if they get stuck during tftp. Occasionally something goes weird in the network and a group of chezas will produce streams of these errors during the tftp process, eventually timing out after 60 minutes in the job. By the time we notice, the next jobs seem to go through fine, so watch for them and try rebooting the cheza to see if that gets our jobs to pass again. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>	2020-08-21 20:10:18 +00:00
Eric Anholt	c27075e9e1	ci/bare-metal: Retry booting chezas instead of failing when !POWER_GOOD. If we get this error, we can just try rebooting again and see if it comes up then. The POWER_GOOD failures are clustered in time, but it's better to retry a few times in a row in one job (which has its own 60min timeout) than to spuriously fail someone's pipeline. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>	2020-08-21 20:10:18 +00:00
Eric Anholt	c63648121e	ci/bare-metal: Convert the main cros-servo boot code to python. Switching this part to python makes the code clearer and cleans up our logs as well. Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6398>	2020-08-21 20:10:18 +00:00

24 Commits
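
Commit b7787ce18d turns on the difference between re.match() and re.search(). The sketch below illustrates that difference only; it is not taken from the bare-metal scripts. The log line is a made-up example of a prefixed serial line, and the pattern is simply the error text quoted in the commit messages above.

```python
import re

# Hypothetical serial-log line (an assumption for illustration): the error
# text is preceded by a timestamp-like prefix.
line = "[  12.345] EC: POWER_GOOD not seen in time"
pattern = r"POWER_GOOD not seen in time"

# re.match() only tries to match at the start of the string, so a line with
# any prefix before the error text does not match.
anchored = re.match(pattern, line)

# re.search() scans the whole string and finds the pattern wherever it sits,
# which is the behaviour of a plain grep.
anywhere = re.search(pattern, line)

print("match() found it:", anchored is not None)
print("search() found it:", anywhere is not None)
```

A match() call behaves like search() only when the line begins with the pattern, which is consistent with the commit's remark that most of the existing matches still worked.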