[RELENG-7422] 📝 Add documentation (#42)
gaspardmoindrot authored Jun 27, 2023 · 1 parent a95e58d · commit 44ba76b
Showing 6 changed files with 318 additions and 22 deletions.
# Local environment

Setting up your local environment

## Install Poetry

The Runner manager uses [Poetry](https://python-poetry.org/), a Python packaging
and dependency management tool.

To install and use this project, please make sure you have Poetry
installed. Follow the [Poetry](https://python-poetry.org/docs/#installation)
documentation for proper installation instructions.

## Install dependencies

```shell
poetry install
```
# Run it locally

Before starting this guide:

- Follow the [local setup](./local-setup.md) documentation.

## Run

Once everything is properly set up, you can launch the project
with the following command at root level:

```bash
poetry run start
```

The application is now launched and running on port 8000 of the machine.

## Webhook setup

### Ngrok setup

The GitHub Actions Exporter depends on webhooks coming from GitHub to work
properly.

Ngrok can help you set up a public URL to be used with GitHub webhooks.

You can install Ngrok on your Linux machine using the following command:

```bash
curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc | sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null && echo "deb https://ngrok-agent.s3.amazonaws.com buster main" | sudo tee /etc/apt/sources.list.d/ngrok.list && sudo apt update && sudo apt install ngrok
```

For more information, you can visit the Ngrok [website](https://ngrok.com/download).

Once installed, you can run the following command to listen on port 8000
of the machine and assign a public URL to it:

```shell
ngrok http 8000
```

### Setting up the webhook

Set up a webhook at the organization level; the settings page should be at a
URL like the following:
`https://github.com/organizations/<your org>/settings/hooks`

- Click on `Add webhook`
- In `Payload URL`, enter your ngrok URL, like the following:
  `https://xxxxx.ngrok.io/webhook`
- Content type: `application/json`
- Click on `Let me select individual events.`
- Select `Workflow jobs` and `Workflow runs`
- Save

## Setting up your testing repo

Create a new repository in the organization for which you have configured the
runner manager.

Then push a workflow to the repository; here is an example:

```yaml
# .github/workflows/test-gh-actions-exporter.yaml
---
name: test-gh-actions-exporter
on:
  push:
  workflow_dispatch:
jobs:
  greet:
    strategy:
      matrix:
        person:
          - foo
          - bar
    runs-on:
      - ubuntu
      - focal
      - large
      - gcloud
    steps:
      - name: sleep
        run: sleep 120
      - name: Send greeting
        run: echo "Hello ${{ matrix.person }}!"
```

Trigger builds and enjoy :beers:
# GitHub Actions Exporter

The GitHub Actions Exporter is a project used to retrieve information
provided by GitHub, notably through webhooks, process it, and store it
via Prometheus.
`docs/metrics-analysis-prometheus/collected-reported-metrics.md`: 145 additions & 0 deletions
# Collected and reported metrics

First of all, it is important to differentiate between the `workflow_run`
and the `workflow_job` webhook events.

The `workflow_run` event is triggered when a workflow run is `requested`,
`in_progress`, or `completed`. However, for this project, we are not
interested in the `cancelled` or `skipped` runs, so we will ignore them.

On the other hand, the `workflow_job` event is triggered when a
workflow job is `queued`, `in_progress`, or `completed`. We will also ignore
the `cancelled` or `skipped` events for `workflow_job` in this project.

## Workflow run

Here are the different metrics collected by the GitHub Actions Exporter
project for workflow runs:

The number of workflow rebuilds: `github_actions_workflow_rebuild_count`.

The duration of a workflow in seconds: `github_actions_workflow_duration_seconds`.

The number of workflows for each state:

- `github_actions_workflow_failure_count`
- `github_actions_workflow_success_count`
- `github_actions_workflow_cancelled_count`
- `github_actions_workflow_inprogress_count`
- `github_actions_workflow_total_count`
## Workflow job

Here are the different metrics collected by the GitHub Actions
Exporter project for workflow jobs:

The duration of a job in seconds: `github_actions_job_duration_seconds`.

The time between when a job is requested and when it starts:
`github_actions_job_start_duration_seconds`.

The number of jobs for each state:

- `github_actions_job_failure_count`
- `github_actions_job_success_count`
- `github_actions_job_cancelled_count`
- `github_actions_job_inprogress_count`
- `github_actions_job_queued_count`
- `github_actions_job_total_count`
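As an illustration, these counters can be combined in a Prometheus query, for
instance to approximate the job success rate over the past hour. This is a
sketch: the `_total` suffix matches the counter naming used in the query
example in the Prometheus guide, but adjust the metric names to whatever your
`/metrics` endpoint actually exposes.

```bash
sum(increase(github_actions_job_success_count_total[1h]))
  /
sum(increase(github_actions_job_total_count_total[1h]))
```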
## Cost metric

This is the last metric we collect, and it is one of the most important
ones. It allows us to determine the cost of our CI runs.

### Formula

Here is the formula to calculate the cost over a period of time:

```bash
cost = duration (in seconds) / 60 * cost (per minute)
```
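As a sketch, the formula can be checked with a quick calculation; the figures
here are hypothetical (a 300-second job on `ubuntu-latest`, billed at $0.008
per minute as listed in the cost table below):

```shell
# cost = duration (seconds) / 60 * cost per minute
# Hypothetical example: a 300-second job on ubuntu-latest at $0.008/min
awk 'BEGIN { duration = 300; cost_per_minute = 0.008; printf "cost: $%.3f\n", duration / 60 * cost_per_minute }'
# prints: cost: $0.040
```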
### How do we find the cost per minute?

#### GitHub

As for GitHub, it is quite simple. They provide us with a fixed value, and
the price never varies. To give an example, for `ubuntu-latest`, we have a cost
of $0.008/min, that's it. Easy!

For larger GitHub-hosted runners, such as the high-performance options, the
pricing structure may differ. The exact details and costs associated with those
specific runner types can be obtained from
[GitHub's documentation](https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions).

#### Self-hosted

When it comes to the cost of self-hosted runners, it's a bit more complicated.

To calculate the costs of self-hosted runners, we can focus on the main
providers, namely AWS and Google Cloud Platform (GCP).

The cost can be found based on the machine type in the Management Console
for AWS (when creating an EC2 instance) and on the
[Google Cloud website](https://cloud.google.com/compute/vm-instance-pricing)
for GCP.
Key points to consider for retrieving cost information:

!!! note "Costs for self-hosted runners are approximate"

    When retrieving the cost of each key point,
    calculating the exact cost per minute might not be possible,
    as it depends on the cloud provider's billing policy
    and each individual CI workload:

    - Internal cloud provider/lab with dedicated hardware.
    - Cloud provider billing policy for virtual machines is per hour or day only.
    - Price of an instance varies during the day, week or month.
    - CI job that uploads a large amount of data.

- RAM and CPU costs: the cost per minute for RAM and CPU expenses can
  be found in the documentation of the respective cloud provider.
- Storage costs: the cost per minute for storage expenses can
  be found in the documentation of the respective cloud provider.
- Bandwidth costs: directly determining the cost per minute of bandwidth is
  not feasible.
Calculating the bandwidth cost per minute is up to the discretion of the
user and will vary depending on the workload. As an example, adding an
extra 30% is what we found by comparing the values in the documentation
of different cloud providers (for CPU, RAM, and storage) with the actual
values on our invoices. Using this information, the overall cost can be
estimated with the following formula (all costs are per minute):

```bash
cost = (cost_per_flavor + cost_per_storage) * percentage_cost_of_bandwidth
```
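For instance, plugging hypothetical figures into the formula above (an
`n2-standard-2` flavor at $0.0025 per minute, $0.0005 per minute of storage,
and the 30% bandwidth uplift; the storage figure is illustrative, not from
any provider's price list):

```shell
# (cost_per_flavor + cost_per_storage) * percentage_cost_of_bandwidth
# Hypothetical: n2-standard-2 at $0.0025/min, $0.0005/min storage, +30% bandwidth
awk 'BEGIN { printf "cost: $%.4f per minute\n", (0.0025 + 0.0005) * 1.30 }'
# prints: cost: $0.0039 per minute
```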
!!! note

    GCP and AWS costs are quite similar for comparable flavors.
### The different tags and their associated cost

| Provider | Runner               | Cost ($ per min) |
| -------- | -------------------- | ---------------- |
| GitHub   | `ubuntu-latest`      | 0.008            |
| GitHub   | `ubuntu-18.04`       | 0.008            |
| GitHub   | `ubuntu-20.04`       | 0.008            |
| GitHub   | `ubuntu-22.04`       | 0.008            |
| GitHub   | `ubuntu-20.04-4core` | 0.016            |
| GitHub   | `ubuntu-22.04-4core` | 0.016            |
| GitHub   | `ubuntu-22.04-8core` | 0.032            |
| AWS      | `t3.small`           | 0.000625         |
| GCP      | `n2-standard-2`      | 0.0025           |
| AWS      | `t3.large`           | 0.0025           |
| GCP      | `n2-standard-4`      | 0.005            |
| GCP      | `n2-standard-8`      | 0.01             |

!!! note

    Please note that the names of large GitHub-hosted runners
    may not be exactly the same as shown above; this is
    the naming convention recommended by GitHub.
# Prometheus

## Introduction

Prometheus is a powerful open-source monitoring and alerting system that allows
users to collect, store, and analyze time-series data. In this guide, we will
explore how to effectively utilize Prometheus to analyze GitHub Actions.

To collect and analyze GitHub Actions metrics, users need to have an existing
Prometheus installation and configure it to pull metrics
from the `/metrics` endpoint of the exporter.
|
||
The idea here is not to recreate the entire Prometheus documentation; we will | ||
simply discuss the key points to get you started easily without getting lost in | ||
the plethora of information available on the Internet. | ||
|
||
To learn more about Prometheus itself, checkout the official | ||
[documentation](https://prometheus.io/docs/introduction/overview/), | ||
as well as [querying Prometheus](https://prometheus.io/docs/prometheus/latest/querying/basics/). | ||
|
||
To proceed, I will take a typical query and break it down, discussing other | ||
potentially useful information to cover. | ||
|
||
Let's examining this example query: | ||
|
||
```bash | ||
topk(5, sum(increase(github_actions_job_cost_count_total{}[5m]])) by (repository) > 0) | ||
``` | ||
This query retrieves data related to GitHub Actions job costs and
provides the top 5 repositories with the highest cumulative cost
within a specified time range.

1. The query starts with the `topk(5, ...)` function, which returns the
   top 5 values based on a specified metric or condition.
2. The `sum(increase(...))` part of the query calculates the cumulative
   sum of the specified metric. In our example, it calculates the
   cumulative sum of the `github_actions_job_cost_count_total` metric,
   representing the total job cost count.
3. The `[5m]` part specifies the time range for the query.
4. The `by (repository)` clause groups the data by the `repository` label.
   This enables the query to calculate the cost sum for each repository
   individually.
5. The expression `> 0` filters the query results to only include
   repositories with a value greater than zero.
!!! info

    Using Grafana enhances the visualization of Prometheus data and
    provides powerful querying capabilities. Within Grafana, apply filters,
    combine queries, and utilize variables for dynamic filtering. It's important
    to understand `__interval` (the time interval between data points) and
    `__range` (the selected time range) when working with Prometheus data in
    Grafana. This integration enables efficient data exploration and analysis
    for better insights and decision-making.
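As an example of `__range` in practice, the fixed `[5m]` window in the query
above can be replaced with Grafana's selected-time-range variable (a sketch,
assuming a Grafana panel backed by this Prometheus data source):

```bash
topk(5, sum(increase(github_actions_job_cost_count_total{}[$__range])) by (repository) > 0)
```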