Set up monitoring and alerting for static CDNs #179

Open
@jdno

Description

Over the past two years, we implemented a few fundamental changes to our static distributions, static.crates.io and static.rust-lang.org:

  1. Fastly has been added as a second Content Delivery Network (CDN) and is now serving 95% of the traffic for the two distributions.
  2. Downloads of crates are now counted using CDN logs, which enabled us to bypass the API of crates.io and download crates straight from the CDN.
  3. Most of our monitoring has been migrated to Datadog.

Together, these changes have given us a few new capabilities:

  • We can use the two CDNs (AWS CloudFront and Fastly) to implement a fail-over in case either of them experiences a service disruption.
  • We can set up monitoring and alerting across both CDNs in Datadog, which has full visibility into both networks.

Before moving the crate downloads to the CDN, the application monitoring of crates.io signaled when there were any issues. Since moving the downloads from the API to the CDN, these checks are less relevant and will most likely miss a service disruption in the CDNs.

Taking all of this into account, we want to set up better monitoring and alerting for the two static distributions. Monitoring and proactive health checks will enable us to implement an automatic fail-over, while alerting will notify the team that a CDN is experiencing issues.
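To make the link between health checks and fail-over concrete, here is a minimal sketch of one possible decision rule. It is an illustration only, not the planned implementation: the function names, the CDN keys, and the threshold of three consecutive failed probes are all assumptions, and the probe results are represented as plain booleans rather than real Datadog or CDN data.

```python
from __future__ import annotations


def is_healthy(probes: list[bool], failure_threshold: int = 3) -> bool:
    """Treat a CDN as unhealthy only when its most recent
    `failure_threshold` probes all failed (hypothetical rule)."""
    recent = probes[-failure_threshold:]
    if len(recent) < failure_threshold:
        # Not enough data yet to declare an outage.
        return True
    return any(recent)


def choose_active_cdns(
    probes_by_cdn: dict[str, list[bool]], failure_threshold: int = 3
) -> list[str]:
    """Return the CDNs that should keep receiving traffic."""
    healthy = [
        name
        for name, probes in probes_by_cdn.items()
        if is_healthy(probes, failure_threshold)
    ]
    # If every CDN looks unhealthy, keep all of them in rotation
    # instead of routing traffic nowhere.
    return healthy if healthy else list(probes_by_cdn)


if __name__ == "__main__":
    probes = {
        "fastly": [True, False, False, False],    # three failures in a row
        "cloudfront": [True, True, False, True],  # one isolated failure
    }
    print(choose_active_cdns(probes))
```

Requiring several consecutive failures keeps a single dropped probe from triggering a fail-over; the right threshold would depend on the probe interval and on how quickly traffic can actually be shifted.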

Tasks

  • Set up the automatic fail-over for the CDNs
  • Set up alerting on Datadog (that pings @jdno)
    • Create monitors for CDNs
    • Create monitor for traffic distribution ratio
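As a rough sketch of what the traffic distribution monitor could check, the snippet below computes one CDN's share of total requests and flags it when it drifts from the expected value. The 95% expectation comes from the description above; the 10-point tolerance, the function names, and the input shape (request counts per CDN) are assumptions for illustration and do not reflect an actual Datadog monitor definition.

```python
from __future__ import annotations

EXPECTED_FASTLY_SHARE = 0.95  # from the issue: Fastly serves about 95% of traffic
TOLERANCE = 0.10              # hypothetical alerting band, not from the issue


def traffic_share(requests_by_cdn: dict[str, int], cdn: str) -> float | None:
    """Fraction of all requests served by `cdn`, or None without traffic."""
    total = sum(requests_by_cdn.values())
    if total == 0:
        return None
    return requests_by_cdn.get(cdn, 0) / total


def ratio_alert(
    requests_by_cdn: dict[str, int],
    cdn: str = "fastly",
    expected: float = EXPECTED_FASTLY_SHARE,
    tolerance: float = TOLERANCE,
) -> str | None:
    """Return an alert message, or None when the ratio looks normal."""
    share = traffic_share(requests_by_cdn, cdn)
    if share is None:
        return "no traffic observed"
    if abs(share - expected) > tolerance:
        return f"{cdn} share {share:.0%} outside {expected:.0%} ± {tolerance:.0%}"
    return None


if __name__ == "__main__":
    print(ratio_alert({"fastly": 950, "cloudfront": 50}))
    print(ratio_alert({"fastly": 500, "cloudfront": 500}))
```

A sudden shift in the ratio can indicate that one CDN is failing requests or that a fail-over has moved traffic, which is why it is worth alerting on separately from per-CDN error monitors.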

Metadata

Assignees

Labels

No labels

Type

No type

Projects

Status

Ready

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
