
[RELENG-7422] 📝 Add documentation #42

Merged 24 commits on Jun 27, 2023
18 changes: 18 additions & 0 deletions docs/getting-started/local-setup.md
@@ -0,0 +1,18 @@
# Local environment

This guide walks you through setting up your local environment.

## Install Poetry

The Runner manager uses [Poetry](https://python-poetry.org/), a Python packaging
and dependency management tool.

To install and use this project, please make sure you have Poetry
installed. Follow the [Poetry](https://python-poetry.org/docs/#installation)
documentation for proper installation instructions.

## Install dependencies

```shell
poetry install
```
84 changes: 84 additions & 0 deletions docs/getting-started/run-it-locally.md
@@ -0,0 +1,84 @@
# Run it locally

Before starting this guide:

* Follow the [local setup](./local-setup.md) documentation.

## Run

Once everything is properly set up, you can launch the project
with the following command from the repository root:

```bash
poetry run start
```

The application is now running on port 8000 of the machine.

## Webhook setup

### Ngrok setup

The GitHub Actions Exporter depends on webhooks coming from GitHub to work properly.

Ngrok can help you set up a public URL to be used with GitHub webhooks.

You can install Ngrok on your Linux machine using the following command:

```bash
curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null && echo "deb https://ngrok-agent.s3.amazonaws.com buster main" | sudo tee /etc/apt/sources.list.d/ngrok.list && sudo apt update && sudo apt install ngrok
```
For more information, you can visit the Ngrok [website](https://ngrok.com/download).

Once installed, you can run the following command to listen on port 8000
of the machine and assign a public URL to it.

```bash
ngrok http 8000
```

### Setting up the webhook

Set up a webhook at the organization level. The settings page is at a URL like the following:
`https://github.com/organizations/<your org>/settings/hooks`

* Click on Add Webhook
* In the payload URL field, enter your ngrok URL, like the following:
`https://xxxxx.ngrok.io/webhook`
* Content type: application/json
* Click on `Let me select individual events.`
* Select: `Workflow jobs` and `Workflow runs`
* Save
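
If you want to sanity check the `/webhook` endpoint locally before wiring up
the real GitHub webhook, you can post a minimal payload yourself. The snippet
below is only a sketch: the payload is a trimmed-down, hypothetical
`workflow_job` event (GitHub sends far more fields), and if the application
validates webhook signatures you would also need to supply the matching
`X-Hub-Signature-256` header.

```python
# Hypothetical local smoke test for the /webhook endpoint.
import requests

payload = {
    "action": "queued",
    "workflow_job": {"name": "greet", "labels": ["ubuntu", "focal"]},
    "repository": {"full_name": "my-org/my-repo"},
}

response = requests.post(
    "http://localhost:8000/webhook",
    json=payload,
    # GitHub identifies the event type through this header.
    headers={"X-GitHub-Event": "workflow_job"},
)
print(response.status_code)
```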

## Setting up your testing repo

Create a new repository in the organization where you have configured the runner manager.

Then push a workflow to the repository. Here is an example:

```yaml
# .github/workflows/test-gh-actions-exporter.yaml
---
name: test-gh-actions-exporter
on:
  push:
  workflow_dispatch:
jobs:
  greet:
    strategy:
      matrix:
        person:
          - foo
          - bar
    runs-on:
      - ubuntu
      - focal
      - large
      - gcloud
    steps:
      - name: sleep
        run: sleep 120
      - name: Send greeting
        run: echo "Hello ${{ matrix.person }}!"
```

Trigger builds and enjoy :beers:
27 changes: 7 additions & 20 deletions docs/index.md
@@ -1,22 +1,9 @@
# GitHub WebHook Exporter
# GitHub Actions Exporter

The idea of this exporter is to be able to expose this service to listen
from WebHooks coming from GitHub.
Then expose those metrics in OpenMetrics format for later usage.
The GitHub Actions Exporter is a project used to retrieve information
provided by GitHub, notably through Webhooks, process it, and store it
via Prometheus. Later on, Grafana is employed to display and visualize
the data in graphs.

## Install

To install and use this project, please make sure you have [poetry](https://python-poetry.org/) installed.

Then run:
```shell
poetry install
```

## Start

To start the API locally you can use the following command:

```shell
poetry run start
```
The main idea of this exporter is to be able to expose this service to
listen to WebHooks coming from GitHub.
117 changes: 117 additions & 0 deletions docs/setup/collected-reported-metrics.md
@@ -0,0 +1,117 @@
# Collected and reported metrics

The idea behind this repository is to gather as much information as
possible from the requests sent by GitHub via the Webhook.

First of all, it is important to differentiate between the `workflow_run`
and the `workflow_job` webhook requests.

The `workflow_run` request is triggered when a workflow run is `requested`,
`in_progress`, or `completed`.

On the other hand, the `workflow_job` request is triggered when a
workflow job is `queued`, `in_progress`, or `completed`.
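
For illustration only, here is a minimal sketch of how a webhook receiver
might tell the two event types apart. It assumes a FastAPI-style application
exposing the `/webhook` route (the project's actual implementation may
differ); GitHub conveys the event name in the `X-GitHub-Event` header of
each delivery.

```python
# Illustrative sketch only -- not the project's actual code.
from fastapi import FastAPI, Header, Request

app = FastAPI()


@app.post("/webhook")
async def webhook(request: Request, x_github_event: str = Header(...)):
    payload = await request.json()
    action = payload.get("action")
    if x_github_event == "workflow_run":
        # action is "requested", "in_progress", or "completed"
        ...
    elif x_github_event == "workflow_job":
        # action is "queued", "in_progress", or "completed"
        ...
    return {"status": "ok"}
```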

## Workflow run

Here are the different metrics collected by the GitHub Actions Exporter
project for workflow runs:

The number of workflow rebuilds: `github_actions_workflow_rebuild_count`.

The duration of a workflow in seconds: `github_actions_workflow_duration_seconds`.

The number of workflows in each state:

- `github_actions_workflow_failure_count`
- `github_actions_workflow_success_count`
- `github_actions_workflow_cancelled_count`
- `github_actions_workflow_inprogress_count`
- `github_actions_workflow_total_count`
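
As a rough illustration, counters and histograms like these could be declared
with the Python [prometheus_client](https://github.com/prometheus/client_python)
library. This is not the project's actual code, and the label names are
assumptions:

```python
# Sketch only: metric types and label names are assumptions.
from prometheus_client import Counter, Histogram

workflow_duration_seconds = Histogram(
    "github_actions_workflow_duration_seconds",
    "Duration of a workflow run in seconds",
    ["repository", "workflow_name"],
)

workflow_failure_count = Counter(
    # Note: prometheus_client appends "_total" to Counter names on export.
    "github_actions_workflow_failure_count",
    "Number of failed workflow runs",
    ["repository", "workflow_name"],
)

# On a completed workflow_run event, one might record:
# workflow_duration_seconds.labels(repo, name).observe(duration)
# workflow_failure_count.labels(repo, name).inc()
```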

## Workflow job

Here are the different metrics collected by the GitHub Actions
Exporter project for workflow jobs.

The duration of a job in seconds: `github_actions_job_duration_seconds`.

Time between when a job is requested and started: `github_actions_job_start_duration_seconds`.

The number of jobs in each state:

- `github_actions_job_failure_count`
- `github_actions_job_success_count`
- `github_actions_job_cancelled_count`
- `github_actions_job_inprogress_count`
- `github_actions_job_queued_count`
- `github_actions_job_total_count`

## Cost metric

This is the last metric we collect, and it is one of the most important
ones. It allows us to determine the cost of our CI runs.

### The formula to calculate the cost over a period of time

To calculate this metric, we use the following formula:

```
cost = duration (in seconds) / 60 * cost (per minute)
```

### How do we find the cost per minute?

#### GitHub

As for GitHub, it is quite simple. They provide us with a fixed value, and
the price never varies. To give an example, for `ubuntu-latest`, we have a cost
of $0.008/min, that's it. Easy!
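
Putting the formula and that rate together, here is a quick back-of-the-envelope
example (the 30-minute duration is just an illustration):

```python
# Example: a job that ran 30 minutes on ubuntu-latest ($0.008/min).
duration_seconds = 30 * 60
cost_per_minute = 0.008

cost = duration_seconds / 60 * cost_per_minute
print(f"${cost:.3f}")  # -> $0.240
```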

#### Self-Hosted

When it comes to the cost of self-hosted runners, it's a bit more complicated.

Self-hosted runners include Google Cloud Platform (GCP) and AWS.

The cost can be found based on the machine type in the Management Console
for AWS (when creating an EC2 instance) and on the
[Google Cloud website](https://cloud.google.com/compute/vm-instance-pricing)
for GCP.

Unfortunately, these values are not accurate as they lack several elements
such as bandwidth or storage. As for storage costs, they can be found in
the same places where the machine type cost is available. However, it is
not possible to determine the bandwidth cost directly.

To overcome this, we had to devise a workaround. We didn't necessarily
need an exact cost for CI but rather a value close to reality (+/- 5%)
for data visualization purposes.

We analyzed previous invoices and calculated the additional cost generated
by bandwidth, which amounted to approximately 30% for each month.
Consequently, we were able to approximate the cost using the following formula:

```
cost = (cost_per_flavor + cost_per_storage) * 130 / 100
```

_Good news: GCP and AWS costs are roughly the same for the same flavors._
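
As a sketch of that approximation in code (the flavor rate comes from the
table below, while the storage rate here is a made-up placeholder):

```python
# Approximate self-hosted cost per minute, including the ~30% bandwidth markup.
cost_per_flavor = 0.0025   # `large` runner, $/min (see table below)
cost_per_storage = 0.0004  # hypothetical storage rate, $/min

cost_per_minute = (cost_per_flavor + cost_per_storage) * 130 / 100
print(f"${cost_per_minute:.5f}/min")  # -> $0.00377/min
```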

### The different tags and their associated cost

| Provider | Runner | Cost ($ per min) |
| ---------------------------------- | --------------------------- | ------------------ |
| GitHub | `ubuntu-latest` | 0.008 |
| GitHub | `ubuntu-18.04` | 0.008 |
| GitHub | `ubuntu-20.04` | 0.008 |
| GitHub | `ubuntu-22.04` | 0.008 |
| GitHub | `ubuntu-20.04-4core` | 0.016 |
| GitHub | `ubuntu-22.04-4core` | 0.016 |
| GitHub | `ubuntu-22.04-8core` | 0.032 |
| AWS & GCP | `small` | 0.000625 |
| AWS & GCP | `medium` | 0.00125 |
| AWS & GCP | `large` | 0.0025 |
| AWS & GCP | `xlarge` | 0.005 |
| AWS & GCP | `3xlarge` | 0.01 |
| GCP | `large-nested` | 0.0025 |
90 changes: 90 additions & 0 deletions docs/setup/utilizing-grafana.md
@@ -0,0 +1,90 @@
# Utilizing Grafana

Grafana is a powerful open-source platform that allows users to visualize
and monitor various metrics and data sources. In this guide, we will
explore how to effectively use Grafana to monitor and analyze GitHub Actions.

Grafana can be accessed through
[the following URL](https://mon.scality.net/grafana/d/WidbLgPnk/).
Once logged in, you will find two distinct boards: "GitHub Actions Costs"
and "GitHub Actions Monitoring." These boards provide valuable insights
into cost monitoring and general information about GitHub Actions.

## As a user

When accessing the Grafana URL, users can easily sort and filter the displayed
data using variables located at the top of the page. This functionality
allows you to retrieve more precise and targeted information based on specific criteria.

Available Sorting Options:

- `repository`: GitHub repository
- `workflow_name`: Name of the workflow
- `repository_visibility`: Public or private
- `job_name`: Name of the job
- `runner_type`: GitHub or self-hosted
- `cloud`: GCloud / AWS or GitHub

Note that additional variables may be added in the future to enhance the
monitoring capabilities further.

It is possible to test and display specific information easily by
accessing the ["Explore" tab](https://mon.scality.net/grafana/explore)
in Grafana. You will need to enter a query there, as explained in
the "As a developer" section below.

## As a developer

Grafana provides a flexible platform for customization and extensibility.
Developers can create their own dashboards, incorporating specific metrics and
visualizations tailored to their unique requirements.

I will not provide a comprehensive guide on Grafana; instead, I will
focus on key points to help you get started. I encourage you to refer
to [Grafana's documentation](https://grafana.com/docs/grafana/latest/dashboards/)
for more detailed information.

There are various types of graphs available in Grafana, including line charts,
bar charts, pie charts, and more. I encourage you to explore and experiment
with different graph types by referring to the documentation, exploring
existing graphs, or simply trying them out yourself.

### Query

First of all, to create a graph, start by clicking on the "Add panel" button
at the top of the page. Another option is to duplicate another graph if
you want to make a similar one.

Let's begin by examining an example query:

```
topk(5, sum(increase(github_actions_job_cost_count_total{repository=~"$repository", runner_type=~"$runner_type", repository_visibility=~"$repository_visibility", cloud=~"$cloud"}[$__range])) by (repository) > 0)
```

This query retrieves data related to GitHub Actions job costs and
provides the top 5 repositories with the highest cumulative cost
within a specified time range.

1. The query starts with the `topk(5, ...)` function, which returns the
top 5 values based on a specified metric or condition.
2. The `sum(increase(...))` part of the query calculates the cumulative
sum of the specified metric. In our example, it calculates the
cumulative sum of the `github_actions_job_cost_count_total` metric,
representing the total job cost count.
3. Within the curly braces {}, we apply filters to narrow down the
data based on specific criteria. The `$variable` refers to the filter
variables that you can specify at the top of the page.
4. The `[$__range]` part specifies the time range for the query.
It uses the `$__range` variable, which represents the selected time
range in Grafana.
5. The `by (repository)` clause groups the data by the repository field.
This enables the query to calculate the cost sum for each repository individually.
6. The expression `> 0` filters the query results to only include
repositories with a value greater than zero.

It's also possible to combine different queries in Grafana, allowing you
to build other graphs, for example by dividing one query by another.

Lastly, it is also important to understand `__interval`. `__interval`
represents the time interval between data points, whereas `__range`
represents the selected time range for the query.
30 changes: 30 additions & 0 deletions docs/tags.md
@@ -0,0 +1,30 @@
# Tags

In addition to Grafana, we also have tags set on AWS and GCP to help
visualize data. These tags allow us to navigate the respective consoles
and filter data, providing more accurate figures
compared to Grafana, which has an approximate margin of error of +/- 5%.

## AWS

For AWS, we have the following tags (specified in the configuration file
found in the `devinfra` repository):

- Name of the runner
- `lifecycle_autostop`: `no`
- `lifecycle_autostart`: `no`
- `map-migrated`: `mig42992`
- `owner`: `ci`
- `tool`: `runner-manager`

## GCloud

For GCP, we have the following tags (dynamic tags retrieved by the
`gh_actions_exporter` repository via webhooks):

- `machine_type`
- `image`
- `status`
- `repository`
- `workflow`
- `job`
10 changes: 10 additions & 0 deletions mkdocs.yml
@@ -40,6 +40,16 @@ theme:
nav:
- Home: index.md

- Getting Started:
- Local Setup: getting-started/local-setup.md
- Run it Locally: getting-started/run-it-locally.md

- Setting up graphs:
- Collected and reported metrics: setup/collected-reported-metrics.md
- Utilizing Grafana: setup/utilizing-grafana.md

- Tags on runners' VMs: tags.md

markdown_extensions:
- pymdownx.highlight:
anchor_linenums: true