Discovery: CI monitoring services #168

Open
jmbowman opened this issue Nov 29, 2022 · 2 comments

jmbowman commented Nov 29, 2022

While moving edx-platform tests from Jenkins to GitHub Actions, it has become clear that GitHub Actions doesn’t have a great dashboard/monitoring solution. I started cobbling together some ad hoc API queries to collect performance statistics for presentations, but it looks like there are services that already do this and much more. I'd like for us to do a quick survey of solutions in this space and see if any of them are worth the cost. This is something that tCRIL would likely need to run for repositories in the openedx GitHub organization, and 2U would need to run for ones in the edx organization (although further discovery is needed to confirm that's the best approach).

Candidates:

Backstage does have a GitHub Actions integration plugin, but at a glance it doesn’t seem to get you much more than you’d get from looking at a repo’s Actions tab; see the screenshot at https://roadie.io/backstage/plugins/github-actions/ . That’s handy when trying to find all your resources related to a repo from one place, but we’d still need something to do more detailed analysis of CI performance. (And probably link to that from Backstage, also.)

Things that would be nice to get from a service like this:

  • How long does it take to finish running all the checks for a commit in each repo?
  • Which workflows, jobs, and individual steps consume the most time?
  • What does the distribution of durations look like? Are there some runs that take much longer than the average?
  • How often does any check on a commit fail? Which ones fail the most?
  • Which checks are running in each repo? Could help us standardize a bit.
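As a rough illustration of the first few questions, here is a minimal sketch that computes per-commit wall-clock time, a duration summary, and per-workflow failure rates. It assumes records shaped like the `workflow_runs` entries returned by the GitHub REST API (`name`, `head_sha`, `conclusion`, `run_started_at`, `updated_at`). The function names and the sample data are made up for illustration, and `updated_at` is only an approximation of when a run finished.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median, quantiles


def parse_ts(value):
    """Parse a timestamp such as 2022-11-29T12:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def run_seconds(run):
    """Approximate run duration: last update minus start time."""
    return (parse_ts(run["updated_at"]) - parse_ts(run["run_started_at"])).total_seconds()


def commit_wall_clock(runs):
    """Seconds from the first run starting to the last run finishing, per commit."""
    spans = {}
    for run in runs:
        start, end = parse_ts(run["run_started_at"]), parse_ts(run["updated_at"])
        first, last = spans.get(run["head_sha"], (start, end))
        spans[run["head_sha"]] = (min(first, start), max(last, end))
    return {sha: (last - first).total_seconds() for sha, (first, last) in spans.items()}


def failure_rates(runs):
    """Fraction of completed runs per workflow whose conclusion was 'failure'."""
    totals, failures = defaultdict(int), defaultdict(int)
    for run in runs:
        if run["conclusion"] is None:  # run has not finished yet
            continue
        totals[run["name"]] += 1
        if run["conclusion"] == "failure":
            failures[run["name"]] += 1
    return {name: failures[name] / totals[name] for name in totals}


def duration_summary(runs):
    """Median, maximum, and (with 2+ runs) 90th-percentile duration in seconds."""
    durations = sorted(run_seconds(r) for r in runs)
    summary = {"median": median(durations), "max": durations[-1]}
    if len(durations) >= 2:
        summary["p90"] = quantiles(durations, n=10)[-1]
    return summary


# Hypothetical sample data, not taken from any real repository.
SAMPLE_RUNS = [
    {"name": "unit-tests", "head_sha": "abc123", "conclusion": "success",
     "run_started_at": "2022-11-29T12:00:00Z", "updated_at": "2022-11-29T12:20:00Z"},
    {"name": "quality", "head_sha": "abc123", "conclusion": "failure",
     "run_started_at": "2022-11-29T12:01:00Z", "updated_at": "2022-11-29T12:06:00Z"},
    {"name": "unit-tests", "head_sha": "def456", "conclusion": "success",
     "run_started_at": "2022-11-29T13:00:00Z", "updated_at": "2022-11-29T13:45:00Z"},
]

if __name__ == "__main__":
    print(commit_wall_clock(SAMPLE_RUNS))
    print(failure_rates(SAMPLE_RUNS))
    print(duration_summary(SAMPLE_RUNS))
```

Comparing the median against the maximum and the 90th percentile is one simple way to spot runs that take much longer than usual; a hosted service would presumably chart the full distribution over time.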

Important questions:

  • How much does the service cost?
  • If per-seat, can we choose fewer seats than the org has to limit cost?

2U's repo health dashboard has a dump of JSON data for checks on the latest master commit for each repo, but it’s pretty hard to visualize and interpret (and lacks historical data). We could build something ourselves via the GitHub Actions API, but it would probably not be as economical or nice as something we can buy off the shelf.
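To give a sense of what building it ourselves would involve, here is a minimal sketch against the GitHub REST API's "list workflow runs for a repository" and "list jobs for a workflow run" endpoints, using only the Python standard library. The helper names are hypothetical, and pagination, rate limiting, and storage of historical data are all left out, which is a large part of why an off-the-shelf service may be the cheaper route.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime

API_ROOT = "https://api.github.com"


def runs_url(owner, repo, branch="master", per_page=100, page=1):
    """URL listing workflow runs for one branch of a repository."""
    query = urllib.parse.urlencode({"branch": branch, "per_page": per_page, "page": page})
    return f"{API_ROOT}/repos/{owner}/{repo}/actions/runs?{query}"


def jobs_url(owner, repo, run_id):
    """URL listing the jobs (and their steps) for a single workflow run."""
    return f"{API_ROOT}/repos/{owner}/{repo}/actions/runs/{run_id}/jobs"


def get_json(url, token=None):
    """Fetch one page of JSON; the token is optional for public repositories."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def parse_ts(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def slowest_jobs(jobs_payload, top=5):
    """Return (job name, seconds) pairs for finished jobs, slowest first."""
    timed = []
    for job in jobs_payload["jobs"]:
        if job.get("started_at") and job.get("completed_at"):
            elapsed = parse_ts(job["completed_at"]) - parse_ts(job["started_at"])
            timed.append((job["name"], elapsed.total_seconds()))
    return sorted(timed, key=lambda item: item[1], reverse=True)[:top]
```

Usage would look something like `get_json(runs_url("openedx", "edx-platform"))["workflow_runs"]`, followed by one `get_json(jobs_url(...))` call per run, so the number of API requests grows quickly with the number of repositories and commits being tracked.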

(Copied and updated from https://openedx.atlassian.net/browse/ARCHBOM-2043 , which is now inaccessible to the public.)

jmbowman (Author) commented

Thundra got acquired, mainly for stuff other than the Foresight product: https://www.catchpoint.com/press-releases/catchpoint-invests-to-advance-api-cloud-functions-and-microservices-monitoring . Everything related to Foresight is now offline, so that's no longer an option for the foreseeable future. I'm updating the description accordingly; for future reference, this was its entry in the list of candidates:

jmbowman (Author) commented

Datadog has https://www.datadoghq.com/product/ci-cd-monitoring/ , but the pricing starts at $8 per committer per month. Probably a non-starter for Open edX repos.
