25 Commits

Author SHA1 Message Date
Eric Anholt c189d385ce ci: Bump deqp to current vulkan-cts-1.2.4
I want the new version to show the fix in the fd-largeconsts branch (and
make sure the pass keeps working, and make sure other drivers get around
to fixing the issue).  While I'm here, cherry-pick in the VK test along
with the GLES one, and also the fix for clip_three on ARMs.

Since the VK and GL test lists were changing, I took the opportunity to
reset freedreno xfails lists to just the tests that are being run with the
CTS uprev, and increase its coverage to 1/10th of the CTS across two
boards (since we just freed up a bunch of runtime with the grouped gles
"other" job).
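
As a rough illustration of what "1/10th of the CTS across two boards" could mean, here is a minimal Python sketch. The function name `select_tests` and its parameters are hypothetical and are not the runner's actual interface; the sketch only assumes that a fraction of the test list is picked by stride and then dealt out to the boards.

```python
def select_tests(tests, fraction=10, shard=0, num_shards=2):
    """Pick every `fraction`-th test, then deal that subset across shards.

    Hypothetical helper: `fraction`, `shard` and `num_shards` are
    illustrative names, not options of any real runner.
    """
    subset = tests[::fraction]          # roughly 1/fraction of the list
    return subset[shard::num_shards]    # this board's share of the subset


if __name__ == "__main__":
    all_tests = [f"test.{i}" for i in range(100)]
    board0 = select_tests(all_tests, shard=0)
    board1 = select_tests(all_tests, shard=1)
    print(len(board0), len(board1))
```

Under this kind of selection, any change to the test list shifts which tests land in the subset, which is one plausible reason the xfails lists had to be reset along with the CTS uprev.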

For panfrost, I didn't spend the time characterizing the t720 fragment_ops
flakes like I did for the deqp-runner change.  Given that the random
behavior changes between CTS versions, it doesn't seem to be worth the
time to do so.

Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6971>
2020-11-11 17:22:47 +00:00
Eric Anholt c19b7fc024 ci/freedreno: Move our skips lists over to being known-flakes lists.
This makes sure that we keep executing the tests so that we can get our
alerts in IRC and know whether the tests are still flaking.  It also keeps
us from having adjustments to the skip list causing failures/flakes to
move to different tests (as seen with a530 having to move some xfails
around after changing the skip list).
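
The difference between a skip list and a known-flakes list can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function `classify_results`, the status strings, and the data shapes are hypothetical and do not describe the actual runner.

```python
def classify_results(results, expected_fails, known_flakes):
    """Split unexpected results into job-failing ones and tolerated flakes.

    Hypothetical sketch: `results` maps test name -> "Pass" or "Fail".
    Tests on the known-flakes list are still executed and reported, but an
    unexpected result from them does not fail the job.
    """
    unexpected = []
    flaked = []
    for name, status in results.items():
        expected = "Fail" if name in expected_fails else "Pass"
        if status == expected:
            continue
        if name in known_flakes:
            flaked.append(name)      # reported (e.g. to an alert channel)
        else:
            unexpected.append(name)  # this is what fails the job
    return sorted(unexpected), sorted(flaked)


if __name__ == "__main__":
    results = {"a": "Pass", "b": "Fail", "c": "Fail", "d": "Pass"}
    print(classify_results(results, expected_fails={"d"}, known_flakes={"b"}))
```

Because flaky tests stay in the run, the set of tests executed does not change when the flake list is edited, so failures do not move to different tests.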

Reviewed-by: Christian Gmeiner <christian.gmeiner@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6392>
2020-08-20 23:59:50 +00:00
Rob Clark d35b54c705 freedreno: sync registers from envytools
Pull in a bunch of fixes and updates, mostly using varset correctly,
and fixes for implicit bools.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/6052>
2020-07-23 17:11:16 -07:00
Rob Clark ade7c3338a ci: remove some freedreno a6xx skips
These don't seem to be flakey anymore.  I did still see a flake with
dEQP-GLES31.functional.layout_binding.ssbo.fragment_binding_array so
I put that one back in.

Signed-off-by: Rob Clark <robdclark@chromium.org>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5577>
2020-06-23 10:01:58 +00:00
Eric Anholt 6ee80d8e0c ci: Bump vulkan CTS to 1.2.3.0.
Looks like it fixes some potentially important VK test bugs.  But also, it
fixes the GLES31 SSBO layout tests to not be so excessively large, so we
can run them in a reasonable time now.  Note that a630 fail list is reset,
since the test list has changed and so we end up with a different subset
of tests being run.  Interestingly, in the process the semaphore tests are
now reporting "NotSupported (Exporting and importing semaphore type not
supported at vktSynchronizationSignalOrderTests.cpp:513)" where they
weren't before.

Reviewed-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5554>
2020-06-19 14:50:05 -07:00
Jonathan Marek c95b250a4c turnip: set the API version
Some CTS tests don't run because of this.

Fixes: 91c757b796 ("turnip: use the common code for generating extensions and dispatch tables")

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5522>
2020-06-18 09:54:48 +00:00
Jonathan Marek 1622787ee4 turnip: set VFD_INDEX_OFFSET in 3D clear/blit path
This was missing and causing flakes when used after a test that set it to
a non-zero value.

Signed-off-by: Jonathan Marek <jonathan@marek.ca>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5509>
2020-06-17 08:50:42 +00:00
Eric Anholt dd938356c7 ci: Disable some flaky tests on turnip.
These have appeared more than once in the flake reporting channel, and a
couple of them have spuriously failed marge-bot merges.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5429>
2020-06-12 18:39:58 +00:00
Eric Anholt 92afe94d28 freedreno: Work around UBWC flakiness.
In trying to track down the new failure in #2670, I found that I could get
the flaky test set down to 4 tests, where dropping any one of them
would no longer trigger the failure (a bad 8x4 block in the middle of
dEQP-GLES3.functional.fbo.msaa.4_samples.r16f's render target).  Disabling
gmem or bypass didn't help, and adding lots of CCU flushing didn't help.
What did help was disabling blitting, or this memset to initialize the
UBWC area after we (presumably) pull a BO out of the BO cache.  My guess
is that the 2D blitter can't handle some rare set of state in the flags
buffer and emits some garbage.

I've run 8 gles3 and 7 gles31 runs with this branch now, so hopefully I've
got the right set of flakes marked for removal.

Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2670
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4290>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4290>
2020-03-30 21:48:59 +00:00
Eric Anholt 41412cc4b7 ci: Ban the recent popular freedreno a630 intermittent failure.
This popped up last Thursday.  The only relevant code commit was my pixel
center half integer change, but the more likely thing to me seems to be
having shuffled the test order by introducing more skips the day before.

Link: https://gitlab.freedesktop.org/mesa/mesa/issues/2670
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4287>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4287>
2020-03-23 20:22:53 +00:00
Eric Anholt 116a3ac481 ci: Ban the recent popular freedreno a630 flakes.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4231>
2020-03-18 22:17:53 +00:00
Eric Anholt a91067d3f5 ci: Blacklist another freedreno flaky test.
This is the recurring flake from the last week, including spuriously
failing a pipeline once.

Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3937>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3937>
2020-02-25 01:07:14 +00:00
Eric Anholt 1427f666dc ci: Extend the a630 flake list to reduce spurious failures.
These are the tests I've seen flake twice while logged in to the IRC
channel this year.  Also include fragment_out.random.5 which fully
spuriously failed recently.

Closes: #2516
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3862>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3862>
2020-02-18 22:40:33 +00:00
Eric Anholt 658eb691fc ci: Bump the GLES CTS version to 3.2.6.1.
This brings in the surfaceless fixes so we don't need to check out the
whole repo to cherry pick any more (which was bothering me as I debugged
things late in the painfully slow ARM container build process).

Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3662>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3662>
2020-02-06 15:18:24 -08:00
Eric Anholt b37922dd1e ci: Disable a bunch of tests on freedreno a630.
On a daily basis I've been having to restart people's a630 jobs in the
front couple of pages of /merge_requests due to spurious failures from our
flaky tests, and fielding reports of spurious fails from other developers,
and babysitting my own marge merges that are failing due to our flakes.

Nobody should have to deal with that, especially not non-freedreno
developers, so just scrape the list of flakes reported to #freedreno-ci
for the last month and ban those tests that have failed more than once
until we have a credible fix.

Acked-by: Kristian H. Kristensen <hoegsberg@google.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/merge_requests/3662>
2020-02-06 15:18:15 -08:00
Rob Clark 215866523b gitlab-ci/freedreno/a6xx: remove most of the flakes
xfb + lines/points still flakes too frequently (and the problem isn't
even related to xfb), but we can add the rest back into this mix now.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-11-22 13:48:29 -08:00
Eric Anholt f0eeb98c6c ci: Expand the freedreno blit skip regex to cover more cases.
We've had flaps on at least:
- r16f_to_r16f
- r16i_to_rg16i

Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-11-13 10:58:52 -08:00
Eric Anholt fd777d2cea ci: Disable flappy blit tests on a630.
These have shown up with the new CTS runner, which has changed test
ordering.

Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-11-12 16:43:04 -08:00
Eric Anholt f08c810028 ci: Use cts_runner for our dEQP runs.
This runner is a little project by Bas, written in C++, that spawns
threads that then loop, grabbing chunks of the (randomly shuffled but
consistently so) test list and handing them to a dEQP instance.  As the
remaining list gets shorter, so do the chunks, so hopefully the
threads all complete effectively at once.  It also handles restarting
after crashes automatically.  I've extended the runner a bit to do
what I was doing in the bash scripts before, like the skip list and
expected failures handling.  This project should also be a good
baseline for extending to handle retesting of intermittent failures.
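
The scheduling idea described above can be sketched in Python. This is not the runner's code (which is C++): the function `run_chunked`, its parameters, and the chunk-size formula are illustrative assumptions, and `run_chunk` stands in for launching a dEQP instance on a batch of tests.

```python
import random
import threading


def run_chunked(tests, run_chunk, num_threads=4, seed=0, max_chunk=100):
    """Run tests on worker threads that pull shrinking chunks from one list.

    Hypothetical sketch: `run_chunk` takes a list of test names and returns
    a dict of name -> status.
    """
    # Fixed seed: the order is shuffled, but identically on every run.
    remaining = list(tests)
    random.Random(seed).shuffle(remaining)
    lock = threading.Lock()
    results = {}

    def take_chunk():
        with lock:
            if not remaining:
                return []
            # Chunks get smaller as the remaining list gets shorter, so the
            # threads tend to finish at about the same time.
            size = max(1, min(max_chunk, len(remaining) // (num_threads * 2)))
            chunk = remaining[:size]
            del remaining[:size]
            return chunk

    def worker():
        while True:
            chunk = take_chunk()
            if not chunk:
                return
            outcome = run_chunk(chunk)
            with lock:
                results.update(outcome)

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    names = [f"test.{i}" for i in range(1000)]
    out = run_chunked(names, lambda chunk: {n: "Pass" for n in chunk})
    print(len(out))
```

Crash recovery and the skip/expected-failure handling mentioned above are left out of this sketch.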

By switching to it, we can have the swrast tests just take up one job
slot on the shared runners and keep their allotment of CPUs busy,
instead of taking up job slots with single-threaded dEQP jobs.  It
will also let us (eventually, once I reprovision) switch the freedreno
runners over to threading within the job instead of running concurrent
jobs, so that memory scribbles in one pipeline don't affect unrelated
pipelines, and I can experiment with their parallelism (particularly
on a306 where we are frequently backed up) without trashing other
people's jobs.

What we lose in this process is per-test output in the log (not a big
loss, I think, since we summarize fails at the end and reducing log
length keeps chrome from choking on our logs so badly).  We also drop
the renderer sanity checking, since it's not saving qpa files for us
to go poke through.  Given that all the drivers involved have fail
lists, if we got the wrong renderer somehow, we'd get a job failure
anyway.

v2: Rebase on dropping of the autoscale cluster and the arm64
    build/test split.  Use a script to deduplicate the cts-runner
    build.
v3: Rebase on the amd64 build/test container split.

Acked-by: Daniel Stone <daniels@collabora.com> (v1)
Reviewed-by: Tomeu Vizoso <tomeu.vizoso@collabora.com> (v2)
2019-11-12 12:54:04 -08:00
Eric Anholt 7f52df7fc9 ci: Make the skip list regexes match the full test name.
The bash scripts were using grep in the manner that matches any subset
of the line, but the new CTS runner matches the whole line and I think
that's a pretty good behavior.  Given that some of the skip lists
already were written to match the full test name, just make them
consistently do so.
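
The behavioral difference can be shown with Python's `re` module, using `re.search` as a stand-in for grep's substring matching and `re.fullmatch` for whole-name matching. The helper names and the patterns are illustrative assumptions, not entries from the real skip lists.

```python
import re


def skipped_substring(test_name, patterns):
    """grep-like: a pattern matches if it occurs anywhere in the name."""
    return any(re.search(p, test_name) for p in patterns)


def skipped_fullmatch(test_name, patterns):
    """Whole-line matching: a pattern must match the entire test name."""
    return any(re.fullmatch(p, test_name) for p in patterns)


if __name__ == "__main__":
    name = "dEQP-GLES3.functional.fbo.blit.conversion.r16f_to_r16f"
    partial = [r"fbo\.blit\.conversion"]        # illustrative pattern
    full = [r".*fbo\.blit\.conversion.*"]       # same intent, full-name form

    print(skipped_substring(name, partial))     # substring match succeeds
    print(skipped_fullmatch(name, partial))     # whole-name match does not
    print(skipped_fullmatch(name, full))        # explicit wildcards restore it
```

With whole-name matching, a pattern only skips more than one test if it says so explicitly with wildcards, which makes the lists easier to audit.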

Reviewed-by: Eric Engestrom <eric.engestrom@intel.com>
Acked-by: Daniel Stone <daniels@collabora.com>
Acked-by: Michel Dänzer <mdaenzer@redhat.com>
2019-11-12 12:54:04 -08:00
Kristian H. Kristensen d3945e3b9b freedreno/ci: Add failing tests to skip list
Some queries are still failing and layered rendering needs more work.

Signed-off-by: Kristian H. Kristensen <hoegsberg@google.com>
2019-10-17 13:45:03 -07:00
Eric Anholt 628ed1bbd5 freedreno/ci: Ban texsubimage2d_pbo.r16ui_2d, due to two flakes reported.
Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl>
Reviewed-by: Daniel Stone <daniels@collabora.com>
2019-10-17 20:32:46 +00:00
Rob Clark 53a38e3015 gitlab-ci/a630: skip dEQP-GLES3.functional.fbo.msaa.2_samples.stencil_index8
Seen a couple flakes on this one so far.  Not sure if it is a real
driver problem or not, but skip it to unblock things.

Signed-off-by: Rob Clark <robdclark@chromium.org>
2019-09-14 10:22:55 -07:00
Eric Anholt 89e840ec59 gitlab-ci/a630: Disable flappy layout_binding.ssbo.fragment_binding_array
It started showing up as unreliable post-merge.  There's a valgrind
complaint, but even fixing that doesn't make it stable.
2019-09-12 14:16:21 -07:00
Eric Anholt 6f0dc087b7 freedreno: Introduce gitlab-based CI.
Since freedreno's kernel and GPU reset seem to be totally solid, we
don't need to have the complexity of the LAVA setup that panfrost has.
Instead, we can register some boards as shared gitlab runners and have
the jobs run out of a docker container just like we do for llvmpipe.
Just make sure that the DRI device node is passed through to the
containers in the gitlab config ('devices = ["/dev/dri"]' under
runners.docker).

If a runner fails (networking dies, kernel panic, etc.) it'll take out
one build but the rest can keep going since gitlab-runner is what
pulls jobs.  Since the runner pulls jobs, it also means that they can
live behind firewalls instead of needing some public address to be
accessed by gitlab.fd.o.

For now, enable it just on db410c (A307) and cheza (A630) as those are
the hardware that I have plenty of.  A307 is only testing GLES2 since
running all of GLES3 takes too long for the number of boards I've
brought up.

Acked-by: Rob Clark <robdclark@chromium.org>
Acked-by: Kenneth Graunke <kenneth@whitecape.org>
2019-09-12 10:55:42 -07:00