Set up monitoring and alerting for static CDNs #179

Open
6 tasks
jdno opened this issue Nov 18, 2024 · 0 comments

jdno commented Nov 18, 2024

Over the past two years, we implemented a few fundamental changes to our static distributions static.crates.io and static.rust-lang.org:

  1. Fastly has been added as a second Content Delivery Network (CDN) and now serves 95% of the traffic for the two distributions.
  2. Downloads of crates are now counted using CDN logs, which enabled us to bypass the API of crates.io and download crates straight from the CDN.
  3. Most of our monitoring has been migrated to Datadog.

Between these changes, we have gained a few new capabilities:

  • We can use the two CDNs (AWS CloudFront and Fastly) to implement a fail-over in case either of them experiences a service disruption.
  • We can set up monitoring and alerting across both CDNs in Datadog, which has full visibility into both networks.

Before moving the crate downloads to the CDN, the application monitoring of crates.io signaled when there were any issues. Since moving the downloads from the API to the CDN, these checks are less relevant and will most likely miss a service disruption in the CDNs.

Taking all of this into account, we want to set up better monitoring and alerting for the two static distributions. Monitoring and proactive health checks will enable us to implement an automatic fail-over, while alerting will notify the team that a CDN is experiencing issues.
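
To illustrate how proactive health checks could feed an automatic fail-over decision, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `CdnHealth` helper, the `choose_cdn` function, the threshold of three consecutive failures, and the choice of Fastly as the preferred CDN (based only on it serving most of the traffic). It does not describe the actual infrastructure setup.

```python
from dataclasses import dataclass


@dataclass
class CdnHealth:
    """Tracks consecutive failed health checks for one CDN (hypothetical helper)."""

    name: str
    failure_threshold: int = 3  # assumed value, not taken from the issue
    consecutive_failures: int = 0

    def record(self, check_passed: bool) -> None:
        # A passing check resets the counter; a failing check increments it.
        if check_passed:
            self.consecutive_failures = 0
        else:
            self.consecutive_failures += 1

    @property
    def healthy(self) -> bool:
        # The CDN counts as healthy until it reaches the failure threshold.
        return self.consecutive_failures < self.failure_threshold


def choose_cdn(primary: CdnHealth, secondary: CdnHealth) -> str:
    """Prefer the primary CDN; fail over only if it is unhealthy and the secondary is healthy."""
    if primary.healthy or not secondary.healthy:
        return primary.name
    return secondary.name


# Usage: feed in health check results, then ask which CDN should serve traffic.
fastly = CdnHealth("fastly")
cloudfront = CdnHealth("cloudfront")
for _ in range(3):
    fastly.record(False)  # three failed probes in a row
active = choose_cdn(fastly, cloudfront)
```

Requiring several consecutive failures before switching is one way to avoid flapping on a single failed probe. If both CDNs look unhealthy, the sketch stays on the primary rather than switching back and forth.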

Tasks

  • Set up the automatic fail-over for the CDNs
  • Set up alerting on Datadog (that pings @jdno)
    • Create monitors for CDNs
    • Create monitor for traffic distribution ratio
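
The traffic distribution monitor could, for example, compare Fastly's share of requests against the expected split. The Python sketch below assumes that per-CDN request counts for a time window are already available from the monitoring system. The function names and the tolerance are hypothetical; only the 95% figure comes from the description above.

```python
def fastly_share(fastly_requests: int, cloudfront_requests: int) -> float:
    """Return the fraction of requests served by Fastly in a time window."""
    total = fastly_requests + cloudfront_requests
    if total == 0:
        # No traffic at all is a separate problem that should alert on its own.
        raise ValueError("no traffic observed in this window")
    return fastly_requests / total


def ratio_out_of_bounds(
    share: float,
    expected: float = 0.95,   # share of traffic Fastly is expected to serve
    tolerance: float = 0.05,  # assumed allowed deviation, not from the issue
) -> bool:
    """Return True when the observed share deviates too far from the expected split."""
    return abs(share - expected) > tolerance


# Usage: an even split would be far outside the expected 95/5 distribution.
should_alert = ratio_out_of_bounds(fastly_share(500, 500))
```

A monitor like this would catch cases where traffic shifts between the CDNs without either of them failing outright, such as an unintended configuration change.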

jdno self-assigned this Nov 18, 2024
jdno added this to infra-team Nov 18, 2024
github-project-automation bot moved this to Backlog in infra-team Nov 18, 2024
jdno moved this from Backlog to Ready in infra-team Nov 18, 2024