[RELENG-7422] 📝 Add documentation (#42)
gaspardmoindrot authored Jun 27, 2023 · 1 parent a95e58d · commit 44ba76b
Showing 6 changed files with 318 additions and 22 deletions.
# Local environment

Setting up your local environment

## Install Poetry

The Runner manager uses [Poetry](https://python-poetry.org/), a Python packaging
and dependency management tool.

To install and use this project, please make sure you have Poetry
installed. Follow the [Poetry](https://python-poetry.org/docs/#installation)
documentation for proper installation instructions.

## Install dependencies

```shell
poetry install
```
# Run it locally

Before starting this guide:

- Follow the [local setup](./local-setup.md) documentation.

## Run

Once everything is properly set up, you can launch the project
with the following command at root level:

```bash
poetry run start
```

The application is now launched and running on port 8000 of the machine.

## Webhook setup

### Ngrok setup

The GitHub Actions Exporter depends on webhooks coming from GitHub to work
properly.

Ngrok can help you set up a public URL to be used with GitHub webhooks.

You can install Ngrok on your Linux machine using the following command:

```bash
curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null && echo "deb https://ngrok-agent.s3.amazonaws.com buster main" | sudo tee /etc/apt/sources.list.d/ngrok.list && sudo apt update && sudo apt install ngrok
```

For more information, you can visit the Ngrok [website](https://ngrok.com/download).

Once installed, you can run the following command to listen on port 8000
of the machine and assign a public URL to it:

```shell
ngrok http 8000
```

### Setting up the webhook

Set up a webhook at the organization level; the settings page should be at a
URL like the following:
`https://github.com/organizations/<your org>/settings/hooks`

- Click on `Add webhook`
- In `Payload URL`, enter your ngrok URL, like the following:
  `https://xxxxx.ngrok.io/webhook`
- Content type: `application/json`
- Click on `Let me select individual events.`
- Select `Workflow jobs` and `Workflow runs`
- Save

## Setting up your testing repo

Create a new repository in the organization for which you have configured the
runner manager.

Then push a workflow to the repository; here is an example:

```yaml
# .github/workflows/test-gh-actions-exporter.yaml
---
name: test-gh-actions-exporter
on:
  push:
  workflow_dispatch:
jobs:
  greet:
    strategy:
      matrix:
        person:
          - foo
          - bar
    runs-on:
      - ubuntu
      - focal
      - large
      - gcloud
    steps:
      - name: sleep
        run: sleep 120
      - name: Send greeting
        run: echo "Hello ${{ matrix.person }}!"
```

Trigger builds and enjoy :beers:
# GitHub Actions Exporter

The GitHub Actions Exporter is a project used to retrieve information
provided by GitHub, notably through webhooks, process it, and store it
via Prometheus.
`docs/metrics-analysis-prometheus/collected-reported-metrics.md`: 145 additions & 0 deletions
# Collected and reported metrics

First of all, it is important to differentiate between the `workflow_run`
and the `workflow_job` webhook events.

The `workflow_run` event is triggered when a workflow run is `requested`,
`in_progress`, or `completed`. However, for this project, we are not
interested in the `cancelled` or `skipped` runs, so we will ignore them.

On the other hand, the `workflow_job` event is triggered when a
workflow job is `queued`, `in_progress`, or `completed`. We will also ignore
the `cancelled` or `skipped` events for `workflow_job` in this project.

## Workflow run

Here are the different metrics collected by the GitHub Actions Exporter
project for workflow runs:

The number of workflow rebuilds: `github_actions_workflow_rebuild_count`.

The duration of a workflow in seconds: `github_actions_workflow_duration_seconds`.

The number of workflows for each state:

- `github_actions_workflow_failure_count`
- `github_actions_workflow_success_count`
- `github_actions_workflow_cancelled_count`
- `github_actions_workflow_inprogress_count`
- `github_actions_workflow_total_count`
## Workflow job

Here are the different metrics collected by the GitHub Actions
Exporter project for workflow jobs:

The duration of a job in seconds: `github_actions_job_duration_seconds`.

The time between when a job is requested and when it starts:
`github_actions_job_start_duration_seconds`.

The number of jobs for each state:

- `github_actions_job_failure_count`
- `github_actions_job_success_count`
- `github_actions_job_cancelled_count`
- `github_actions_job_inprogress_count`
- `github_actions_job_queued_count`
- `github_actions_job_total_count`
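As an illustration, these counters can be combined in a Prometheus query, for
instance to approximate the job success rate over the past hour. This is a
sketch: the `_total` suffix matches the counter naming used in the query
example in the Prometheus guide, but adjust the metric names to whatever your
`/metrics` endpoint actually exposes.

```bash
sum(increase(github_actions_job_success_count_total[1h]))
  /
sum(increase(github_actions_job_total_count_total[1h]))
```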
## Cost metric

This is the last metric we collect, and it is one of the most important
ones. It allows us to determine the cost of our CI runs.

### Formula

Here is the formula to calculate the cost over a period of time:

```bash
cost = duration (in seconds) / 60 * cost (per minute)
```
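As a sketch, the formula can be checked with a quick calculation; the figures
here are hypothetical (a 300-second job on `ubuntu-latest`, billed at $0.008
per minute as listed in the cost table below):

```shell
# cost = duration (seconds) / 60 * cost per minute
# Hypothetical example: a 300-second job on ubuntu-latest at $0.008/min
awk 'BEGIN { duration = 300; cost_per_minute = 0.008; printf "cost: $%.3f\n", duration / 60 * cost_per_minute }'
# prints: cost: $0.040
```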
### How do we find the cost per minute?

#### GitHub

As for GitHub, it is quite simple. They provide us with a fixed value, and
the price never varies. To give an example, for `ubuntu-latest`, we have a cost
of $0.008/min, that's it. Easy!

For larger GitHub-hosted runners, such as the high-performance options, the
pricing structure may differ. The exact details and costs associated with those
specific runner types can be obtained from
[GitHub's documentation](https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions).

#### Self-hosted

When it comes to the cost of self-hosted runners, it's a bit more complicated.

To calculate the costs of self-hosted runners, we can focus on the main
providers, namely AWS and Google Cloud Platform (GCP).

The cost can be found based on the machine type in the Management Console
for AWS (when creating an EC2 instance) and on the
[Google Cloud website](https://cloud.google.com/compute/vm-instance-pricing)
for GCP.
Key points to consider for retrieving cost information:

!!! note "Costs for self-hosted runners are approximate"

    When retrieving the cost of each key point,
    calculating the exact cost per minute might not be possible,
    as it depends on the cloud provider's billing policy
    and each individual CI workload:

    - Internal cloud provider/lab with dedicated hardware.
    - Cloud provider billing policy for virtual machines is per hour or day only.
    - Price of an instance varies during the day, week or month.
    - CI job that uploads a large amount of data.

- RAM and CPU costs: the cost per minute for RAM and CPU expenses can
  be found in the documentation of the respective cloud provider.
- Storage costs: the cost per minute for storage expenses can
  be found in the documentation of the respective cloud provider.
- Bandwidth costs: directly determining the cost per minute of bandwidth is
  not feasible.
Calculating the bandwidth cost per minute is up to the discretion of the
user and will vary depending on the workload. As an example, adding an
extra 30% is what we found by comparing the values in the documentation
of different cloud providers (for CPU, RAM, and storage) with the actual
values on our invoices. Using this information, the overall cost can be
estimated with the following formula (all costs are per minute):

```bash
cost = (cost_per_flavor + cost_per_storage) * percentage_cost_of_bandwidth
```
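For instance, plugging hypothetical figures into the formula above (an
`n2-standard-2` flavor at $0.0025 per minute, $0.0005 per minute of storage,
and the 30% bandwidth uplift; the storage figure is illustrative, not from
any provider's price list):

```shell
# (cost_per_flavor + cost_per_storage) * percentage_cost_of_bandwidth
# Hypothetical: n2-standard-2 at $0.0025/min, $0.0005/min storage, +30% bandwidth
awk 'BEGIN { printf "cost: $%.4f per minute\n", (0.0025 + 0.0005) * 1.30 }'
# prints: cost: $0.0039 per minute
```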
!!! note

    GCP and AWS costs are quite similar for comparable flavors.
### The different tags and their associated cost

| Provider | Runner               | Cost ($ per min) |
| -------- | -------------------- | ---------------- |
| GitHub   | `ubuntu-latest`      | 0.008            |
| GitHub   | `ubuntu-18.04`       | 0.008            |
| GitHub   | `ubuntu-20.04`       | 0.008            |
| GitHub   | `ubuntu-22.04`       | 0.008            |
| GitHub   | `ubuntu-20.04-4core` | 0.016            |
| GitHub   | `ubuntu-22.04-4core` | 0.016            |
| GitHub   | `ubuntu-22.04-8core` | 0.032            |
| AWS      | `t3.small`           | 0.000625         |
| GCP      | `n2-standard-2`      | 0.0025           |
| AWS      | `t3.large`           | 0.0025           |
| GCP      | `n2-standard-4`      | 0.005            |
| GCP      | `n2-standard-8`      | 0.01             |

!!! note

    Please note that the names of large GitHub-hosted runners
    may not be exactly the same as shown above; this is
    the naming convention recommended by GitHub.
# Prometheus

## Introduction

Prometheus is a powerful open-source monitoring and alerting system that allows
users to collect, store, and analyze time-series data. In this guide, we will
explore how to effectively utilize Prometheus to analyze GitHub Actions.

To collect and analyze GitHub Actions metrics, users need to have an existing
Prometheus installation and configure it to pull metrics
from the `/metrics` endpoint of the exporter.
|
||
The idea here is not to recreate the entire Prometheus documentation; we will | ||
simply discuss the key points to get you started easily without getting lost in | ||
the plethora of information available on the Internet. | ||
|
||
To learn more about Prometheus itself, checkout the official | ||
[documentation](https://prometheus.io/docs/introduction/overview/), | ||
as well as [querying Prometheus](https://prometheus.io/docs/prometheus/latest/querying/basics/). | ||
|
||
To proceed, I will take a typical query and break it down, discussing other | ||
potentially useful information to cover. | ||
|
||
Let's examining this example query: | ||
|
||
```bash | ||
topk(5, sum(increase(github_actions_job_cost_count_total{}[5m]])) by (repository) > 0) | ||
``` | ||
This query retrieves data related to GitHub Actions job costs and
provides the top 5 repositories with the highest cumulative cost
within a specified time range.

1. The query starts with the `topk(5, ...)` function, which returns the
   top 5 values based on a specified metric or condition.
2. The `sum(increase(...))` part of the query calculates the cumulative
   sum of the specified metric. In our example, it calculates the
   cumulative sum of the `github_actions_job_cost_count_total` metric,
   representing the total job cost count.
3. The `[5m]` part specifies the time range for the query.
4. The `by (repository)` clause groups the data by the `repository` label.
   This enables the query to calculate the cost sum for each repository
   individually.
5. The expression `> 0` filters the query results to only include
   repositories with a value greater than zero.
!!! info

    Using Grafana enhances the visualization of Prometheus data and
    provides powerful querying capabilities. Within Grafana, apply filters,
    combine queries, and utilize variables for dynamic filtering. It's important
    to understand `__interval` (the time interval between data points) and
    `__range` (the selected time range) when working with Prometheus data in
    Grafana. This integration enables efficient data exploration and analysis
    for better insights and decision-making.
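As an example of `__range` in practice, the fixed `[5m]` window in the query
above can be replaced with Grafana's selected-time-range variable (a sketch,
assuming a Grafana panel backed by this Prometheus data source):

```bash
topk(5, sum(increase(github_actions_job_cost_count_total{}[$__range])) by (repository) > 0)
```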