CI: reduce bandwidth for git pull
Over the last 7 days, git pulls represented a total of 1.7 TB.
Of those 1.7 TB, we can see:
- ~300 GB for the CI farm on hetzner
- ~730 GB for the CI farm on packet.net
- ~680 GB for the rest of the world
We cannot really change the rest of the world*, but we can
certainly reduce the egress costs towards our CI farms.
Right now, the gitlab runners are not doing a good job of
caching the git trees for the various jobs we run, and we
end up with a lot of cache misses. A typical pipeline ends
up with a good 2.8 GB of git pull data (a compressed archive
of the mesa folder accounts for 280 MB).
In this patch, we implemented what was suggested in
https://gitlab.com/gitlab-org/gitlab/-/issues/215591#note_334642576
- we host a brand new MinIO server on packet
- jobs can upload files to 2 locations:
* git-cache/<namespace>/<project>/<branch-name>.tar.gz
* artifacts/<namespace>/<project>/<pipeline-id>/
- the authorization is handled by gitlab with short-lived
tokens, valid only while the job is running
- whenever a job runs, the runners are configured to execute
(eval) $CI_PRE_CLONE_SCRIPT
- this variable is set globally to download the current cache
from the MinIO packet server, unpack it, and replace the
possibly out-of-date cache found on the runner
- then git fetch is run by the runner, and only the delta
between the upstream tree and the local tree gets pulled
(see the sketch after this list)
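As a sketch, the per-job flow described above amounts to
roughly the following (illustrative bash only, not the literal
runner implementation):

    # executed by the runner before its normal git operations
    eval "$CI_PRE_CLONE_SCRIPT"   # seeds the checkout from the MinIO cache
    git fetch                     # now only the delta gets transferred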
We can rebuild the git cache in a scheduled job (once a day
seems sufficient), and then we can eliminate the cache misses
entirely, as sketched below.
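A hedged sketch of what that scheduled job could look like
(the 'mc' upload command and its 'minio' alias are assumptions;
only the git-cache layout comes from this patch):

    # rebuild the cache tarball from a fresh clone and upload it
    TMP_DIR=$(mktemp -d)
    git clone "https://gitlab.freedesktop.org/${FDO_UPSTREAM_REPO}.git" "$TMP_DIR/mesa"
    tar czf "$TMP_DIR/mesa.tar.gz" -C "$TMP_DIR/mesa" .
    # 'mc' is the MinIO client; the alias is assumed to be already
    # authenticated with the short-lived job token
    mc cp "$TMP_DIR/mesa.tar.gz" "minio/git-cache/${FDO_UPSTREAM_REPO}/mesa.tar.gz"
    rm -rf "$TMP_DIR"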
First results showed that instead of pulling 280 MB of data
in my fork, I got a pull of only 250 KB. That should help us.
* arguably, there are other farms in the rest of the world, so
hopefully we can change those too.
Reviewed-by: Michel Dänzer <mdaenzer@redhat.com>
Reviewed-by: Peter Hutterer <peter.hutterer@who-t.net>
Signed-off-by: Benjamin Tissoires <benjamin.tissoires@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/5428>
#!/bin/bash

set +e
set -o xtrace

# if we run this script outside of gitlab-ci for testing, ensure
# we get meaningful variables
CI_PROJECT_DIR=${CI_PROJECT_DIR:-$(mktemp -d)/mesa}

if [[ -e $CI_PROJECT_DIR/.git ]]
then
    echo "Repository already present, skip cache download"
    exit
fi

TMP_DIR=$(mktemp -d)

echo "Downloading archived master..."
/usr/bin/wget -O "$TMP_DIR/mesa.tar.gz" \
    "https://${MINIO_HOST}/git-cache/${FDO_UPSTREAM_REPO}/mesa.tar.gz"

# check wget error code
if [[ $? -ne 0 ]]
then
    echo "Repository cache not available"
    exit
fi

set -e

rm -rf "$CI_PROJECT_DIR"
echo "Extracting tarball into '$CI_PROJECT_DIR'..."
mkdir -p "$CI_PROJECT_DIR"
tar xzf "$TMP_DIR/mesa.tar.gz" -C "$CI_PROJECT_DIR"
rm -rf "$TMP_DIR"
chmod a+w "$CI_PROJECT_DIR"
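
# Hedged usage example for testing outside of gitlab-ci (the script
# filename and the variable values below are illustrative, not part
# of this patch):
#
#   MINIO_HOST=minio-packet.freedesktop.org \
#   FDO_UPSTREAM_REPO=mesa/mesa \
#   ./pre-clone-script.sh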