CASMINST-7168 adjust velero_backups_check.sh to ignore failed backups if a newer successful one exists

During nightly CT runs, Velero backups for vault failed even though
newer successful backups existed. The original failure was transient,
caused by a timeout error (see CASMTRIAGE-7762), but the CT tests
continued to generate auto-triage tickets because a failure was still detected.

Extending this script to check for a newer successful backup, and to
pass in that case, prevents additional, irrelevant auto-triage
tickets from being generated.

Signed-off-by: Jacob Salmela <[email protected]>
jacobsalmela committed Feb 12, 2025
1 parent 3ef5a07 commit 19ab88f
Showing 1 changed file with 16 additions and 0 deletions.
16 changes: 16 additions & 0 deletions goss-testing/scripts/velero_backups_check.sh
@@ -95,6 +95,22 @@ number_failed=$(jq '[.items[] | select(.status.phase == "PartiallyFailed") | sel
# if the number of failed backups is not 0, print the failed backups and exit with a non-zero status
if [[ $number_failed -ne 0 ]];
then
  completed_backups=$(velero backup get -o json | jq -r '[.items[] | select(.status.phase == "Completed") | select(.metadata.name | contains("vault"))] | .[] | .metadata.creationTimestamp')
  failed_backups=$(jq -r '[.items[] | select(.status.phase == "PartiallyFailed") | select(.metadata.name | contains("vault"))] | .[] | .metadata.creationTimestamp' < "$failed_backups_path")
  # check if there is a newer backup that was successful
  for failed_backup in $failed_backups; do
    for completed_backup in $completed_backups; do
      if [[ $failed_backup < $completed_backup ]];
      then
        echo "Backup $failed_backup failed but there is a newer successful backup: $completed_backup"
        newer_successful_backup=true
        continue 2
      fi
    done
  done
  if [[ "$newer_successful_backup" == "true" ]]; then
    echo "PASS"; exit 0;
  fi
  echo "Investigate remaining Failed or PartiallyFailed backups: $(kubectl get backups -A -o json | jq -e '.items[] | select(.status.phase == "PartiallyFailed") | .metadata.name')"
  echo "FAIL"; exit 1;
else
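
Review note: as written, the loop in this hunk sets newer_successful_backup as soon as any one failed vault backup has a newer completed backup, so the check passes even if a later failure has nothing newer behind it. The following is a minimal sketch of a stricter variant, not part of this commit. The function name all_failures_superseded and its two arguments are hypothetical, and it assumes the creationTimestamp values are UTC strings such as 2025-02-11T01:00:00Z, which sort chronologically under plain string comparison (the same assumption the committed `<` test relies on).

```shell
#!/usr/bin/env bash
# Hypothetical helper (illustration only, not from the commit).
# Succeeds only if EVERY failed backup timestamp is older than the newest
# completed backup timestamp.
#   $1 = whitespace-separated timestamps of failed backups
#   $2 = whitespace-separated timestamps of completed backups
all_failures_superseded() {
  local failed_list=$1 completed_list=$2
  local newest_completed="" ts

  # Find the newest completed timestamp using string comparison.
  for ts in $completed_list; do
    if [[ -z $newest_completed || $ts > $newest_completed ]]; then
      newest_completed=$ts
    fi
  done

  # With no completed backup at all, no failure can be superseded.
  [[ -n $newest_completed ]] || return 1

  # Every failed backup must be strictly older than the newest completed one.
  for ts in $failed_list; do
    if [[ ! $ts < $newest_completed ]]; then
      echo "Backup at $ts failed and no newer successful backup exists"
      return 1
    fi
  done
  return 0
}

# Usage with made-up sample timestamps: one old failure, a newer success.
if all_failures_superseded "2025-02-10T01:00:00Z" "2025-02-09T01:00:00Z 2025-02-11T01:00:00Z"; then
  echo "PASS"
else
  echo "FAIL"
fi
```

Comparing each failure against only the newest completed backup avoids the nested loop and makes the pass condition explicit: a single failure that is not superseded is enough to fail the check.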