feat(link-checker): Simplify link-checker workflow configuration #3299

Merged
merged 2 commits into from
Aug 12, 2024
82 changes: 22 additions & 60 deletions .github/workflows/link-checker.yml
@@ -1,66 +1,28 @@
-name: Link Checker
+name: Link checker
 
 on:
   workflow_dispatch:
-  schedule:
-    - cron: "2 0 * * *"
-
-  # START Temporary for testing.
-  pull_request:
-    branches: [main]
   push:
-    branches: ["link-checker-workflow-configuration"]
-  # END Temporary for testing.
-
-defaults:
-  run:
-    # Specify to ensure "pipefail and errexit" are set.
-    # Ref: https://docs.github.com/en/actions/writing-workflows/workflow-syntax-for-github-actions#defaultsrunshell
-    shell: bash
+    branches:
+      - master
+  pull_request:
+    branches:
+      - master
+  schedule:
+    - cron: "00 3 * * 1-5"
 
 jobs:
-  link-checker-documentation:
-    runs-on: ubuntu-latest
-
-    steps:
-      - uses: actions/checkout@v4
-
-      - name: Setup Ruby
-        uses: ruby/setup-ruby@v1
-        with:
-          ruby-version: 3.1
-          bundler-cache: true
-
-      - name: check links
-        env:
-          LANG: "C.UTF-8"
-        run: |
-          bundle exec jekyll build
-          #
-          # Remove the redirect-files before link-check
-          find _site/en _site/documentation -name \*.html | \
-            xargs grep -l "Click here if you are not redirected." | xargs rm
-          #
-          # htmlproofer does not check links inside <code>-elements
-          find _site -name \*.html | xargs sed -i.orig 's/<code[^>]*>//g; s/<\/code>//g; s/<pre[^>]*>//g; s/<\/pre>//g;'
-          find _site -name \*.orig | xargs rm
-          #
-          bundle exec htmlproofer \
-            --assume-extension .html \
-            --no-enforce-https \
-            --no-check-external-hash \
-            --allow-missing-href \
-            --ignore-files '/playground/index.html/' \
-            --ignore-urls '\
-              /localhost:8080/,\
-              /docs.vespa.ai/playground/,\
-              /javadoc.io.*#/,\
-              /readthedocs.io.*#/,\
-              /linux.die.net/,\
-              /arxiv.org/,\
-              /hub.docker.com/r/,\
-              /platform.openai.com/' \
-            --typhoeus '{"connecttimeout": 10, "timeout": 30, "accept_encoding": "zstd,br,gzip,deflate"}' \
-            --hydra '{"max_concurrency": 1}' \
-            --swap-urls '(https\://github.com.*/master/.*)#.*:\1,(https\://github.com.*/main/.*)#.*:\1' \
-            _site
+  test:
+    uses: vespa-engine/gh-actions-workflows/.github/workflows/jekyll-link-checker.yml@v1
+    with:
+      ignore-files: |-
+        /playground/index.html/
+      ignore-urls: |-
+        /localhost:8080/
+        /docs.vespa.ai/playground/
+        /javadoc.io.*#/
+        /readthedocs.io.*#/
+        /linux.die.net/
+        /arxiv.org/
+        /hub.docker.com/r/
+        /platform.openai.com/
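
Note on the input format: the inline `htmlproofer` call took its ignore patterns as one comma-separated `--ignore-urls` string, while the `with:` block lists one pattern per line in a `|-` block scalar. This diff does not show how the reusable workflow consumes that input. The following is only a minimal sketch of one way a newline-separated list could be turned back into a comma-separated value, assuming blank lines should be skipped; `join_patterns` is a hypothetical helper and is not taken from `jekyll-link-checker.yml`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not from the reusable workflow): join a
# newline-separated list of patterns into one comma-separated string,
# dropping blank or whitespace-only lines.
join_patterns() {
  printf '%s\n' "$1" | sed '/^[[:space:]]*$/d' | paste -sd ',' -
}

# A shortened sample of the patterns from the workflow, one per line,
# as a `|-` block scalar would deliver them.
ignore_urls='/localhost:8080/
/docs.vespa.ai/playground/
/arxiv.org/'

join_patterns "$ignore_urls"
```

The joined string has the same shape as the value previously passed to `--ignore-urls`, so a step inside the shared workflow could hand it to `htmlproofer` unchanged if it follows this approach.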