[RELENG-7422] 📝 Add documentation #42

Merged
merged 24 commits
Jun 27, 2023
3 changes: 0 additions & 3 deletions docs/index.md
@@ -3,6 +3,3 @@
The GitHub Actions Exporter is a project used to retrieve information
provided by GitHub, notably through Webhooks, process it, and store it
via Prometheus.

The main idea of this exporter is to be able to expose this service to
listen from WebHooks coming from GitHub.
62 changes: 40 additions & 22 deletions docs/metrics-analysis-prometheus/collected-reported-metrics.md
@@ -1,16 +1,15 @@
# Collected and reported metrics

The idea behind this repository is to gather as much information as
possible from the requests sent by GitHub via the Webhook.

First, it is important to differentiate the `workflow_run`
and the `workflow_job` API requests.
and the `workflow_job` webhook events.

The `workflow_run` request is triggered when a workflow run is `requested`,
`in_progress`, or `completed`.
`in_progress`, `completed`, or `failure`. However, for this project, we are not
interested in the `cancelled` or `skipped` events, so we will ignore them.

On the other hand, the `workflow_job` request is triggered when a
workflow job is `queued`, `in_progress`, or `completed`.
workflow job is `queued`, `in_progress`, or `completed`. We will also ignore
the `cancelled` or `skipped` events for `workflow_job` in this project.

## Workflow run

@@ -52,9 +51,9 @@ Count the number of jobs for each state:
This is the last metric we collect, and it is one of the most important
ones. It allows us to determine the cost of our CI runs.

### The formula to calculate the cost over a period of time
### Formula

To calculate this metric, we use the following formula:
Here is the formula to calculate the cost over a period of time:

```bash
cost = duration (in seconds) / 60 * cost (per minute)
@@ -68,6 +67,11 @@ As for GitHub, it is quite simple. They provide us with a fixed value, and
the price never varies. To give an example, for `ubuntu-latest`, we have a cost
of $0.008/min, that's it. Easy!
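
As a quick worked example (a sketch, assuming a 10-minute, i.e. 600-second, job on
`ubuntu-latest` at the rate above):

```bash
# Hypothetical example: a 600-second job on ubuntu-latest at $0.008 per minute
# cost = duration (in seconds) / 60 * cost (per minute)
echo "600 / 60 * 0.008" | bc -l   # -> about 0.08, i.e. roughly $0.08 for the job
```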

For larger GitHub hosted runners, such as the high-performance options, the
pricing structure may differ. The exact details and costs associated with those
specific runner types can be obtained from
[GitHub's documentation](https://docs.github.com/en/billing/managing-billing-for-github-actions/about-billing-for-github-actions).

#### Self-Hosted

When it comes to the cost of self-hosted runners, it's a bit more complicated.
@@ -80,24 +84,32 @@ for AWS (when creating an EC2 instance) and on the
[Google Cloud website](https://cloud.google.com/compute/vm-instance-pricing)
for GCP.

Unfortunately, these values are not accurate as they lack several elements
such as bandwidth or storage. As for storage costs, they can be found in
the same places where the machine type cost is available. However, it is
not possible to determine the bandwidth cost directly.

To overcome this, we had to devise a workaround. We didn't necessarily
need an exact cost for CI but rather a value close to reality (+/- 5%)
for data visualization purposes.

We analyzed previous invoices and calculated the additional cost generated
by bandwidth, which amounted to approximately 30% for each month.
Consequently, we were able to approximate the cost using the following formula:
We aim to obtain a result that is close to reality, within a range of
approximately +/- 5%, for data visualization purposes.
Key points to consider for retrieving cost information:

- RAM and CPU costs: the cost per minute for RAM and CPU can be found in
  the documentation of the respective cloud provider.
- Storage costs: the cost per minute for storage can likewise be found in
  the documentation of the respective cloud provider.
- Bandwidth cost: directly determining the cost per minute of bandwidth is
  not feasible.

Calculating the bandwidth cost per minute is up to the discretion of the
user and will vary depending on the workload. As an example, we arrived at
an extra 30% by comparing the values in the documentation of different
cloud providers (for CPU, RAM, and storage) with the actual values on our
invoices. Using this information, we were able to estimate the overall cost
with the following formula (all costs are per minute):

```bash
cost = (cost_per_flavor + cost_per_storage) * 130 / 100
cost = (cost_per_flavor + cost_per_storage) * percentage_cost_of_bandwidth
```
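
For illustration, here is a sketch of that calculation using the `t3.large` flavor
cost from the table below and an assumed storage cost of $0.0005/min (the storage
figure is purely illustrative, adjust it to your provider's pricing):

```bash
# Illustrative sketch (all costs per minute):
#   cost_per_flavor  = 0.0025  -> t3.large, taken from the tag/cost table below
#   cost_per_storage = 0.0005  -> assumed value, adjust to your provider's pricing
#   bandwidth factor = 1.3     -> the ~30% surcharge discussed above
echo "(0.0025 + 0.0005) * 1.3" | bc -l   # -> about 0.0039 dollars per minute
```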

_Good news, GCP and AWS costs are quite the same for the same flavors._
!!! note

    GCP and AWS costs are quite the same for the same flavors.

### The different tags and their associated cost

@@ -115,3 +127,9 @@ _Good news, GCP and AWS costs are quite the same for the same flavors._
| AWS | `t3.large` | 0.0025 |
| GCP | `n2-standard-4` | 0.005 |
| GCP | `n2-standard-8` | 0.01 |

!!! note

    Please note that the names of large GitHub hosted runners
    may not be exactly the same as shown above, but this is
    the naming convention recommended by GitHub.
40 changes: 20 additions & 20 deletions docs/metrics-analysis-prometheus/prometheus.md
@@ -2,28 +2,30 @@

## Introduction

Prometheus is integrated with our `gh_actions_exporter` repository,
enabling the export of GitHub Actions metrics.

Prometheus is a powerful open-source monitoring and alerting system that allows
users to collect, store, and analyze time-series data. In this guide, we will
explore how to effectively utilize Prometheus to analyze GitHub Actions.

In order to collect and analyze GitHub Actions metrics, users are expected
to have an existing Prometheus installation and configure it to pull metrics.
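
As a minimal sketch of that configuration, the snippet below appends a scrape job
to a `prometheus.yml`; the job name, target host, and port are assumptions, so point
them at wherever your `gh_actions_exporter` instance actually listens (and merge the
entry into your existing `scrape_configs` block if you already have one):

```bash
# Minimal sketch: add a scrape job for the exporter to prometheus.yml.
# The target (localhost:8000) and job name are assumptions; adjust to your deployment.
cat >> prometheus.yml <<'EOF'
scrape_configs:
  - job_name: gh_actions_exporter
    static_configs:
      - targets: ["localhost:8000"]
EOF
```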

## Understanding Prometheus Queries

The idea here is not to recreate the entire Prometheus documentation; we will
simply discuss the key points to get you started easily without getting lost in
the plethora of information available on the Internet. I will redirect you to
the [documentation](https://prometheus.io/docs/introduction/overview/)
if you want to develop deeper.
the plethora of information available on the Internet.

To learn more about Prometheus itself, check out the official
[documentation](https://prometheus.io/docs/introduction/overview/),
as well as [querying Prometheus](https://prometheus.io/docs/prometheus/latest/querying/basics/).

To proceed, I will take a typical query, break it down, and cover other
details that are likely to be useful along the way.

Let's examine this example query:

```bash
topk(5, sum(increase(github_actions_job_cost_count_total{repository=~"$repository", runner_type=~"$runner_type", repository_visibility=~"$repository_visibility", cloud=~"$cloud"}[$__range])) by (repository) > 0)
topk(5, sum(increase(github_actions_job_cost_count_total[5m])) by (repository) > 0)
```

This query retrieves data related to GitHub Actions job costs and
@@ -36,20 +38,18 @@ within a specified time range.
sum of the specified metric. In our example, it calculates the
cumulative sum of the `github_actions_job_cost_count_total` metric,
representing the total job cost count.
3. Within the curly braces {}, we apply filters to narrow down the
data based on specific criteria. The `$variable` refers to the filter
variables that you can specify at the top of the page.
4. The `[$__range]` part specifies the time range for the query.
It uses the `$__range` variable, which represents the selected time
range in Grafana.
5. The `by (repository)` clause groups the data by the repository field.
3. The `[5m]` part specifies the time range for the query.
4. The `by (repository)` clause groups the data by the repository label.
This enables the query to calculate the cost sum for each repository individually.
6. The expression `> 0` filters the query results to only include
5. The expression `> 0` filters the query results to only include
repositories with a value greater than zero.

It's also possible to combine different queries in Grafana. For example, one
query dividing by another.
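
As a hedged sketch of one query dividing another, the expression below estimates the
share of job-cost counts coming from self-hosted runners; the `runner_type` label is
taken from the filters in the earlier query, but the exact label value is an assumption:

```bash
# Sketch: fraction of job-cost counts from self-hosted runners over the last 5 minutes.
# The label value "self-hosted" is an assumption; check the values your exporter emits.
sum(increase(github_actions_job_cost_count_total{runner_type="self-hosted"}[5m]))
  / sum(increase(github_actions_job_cost_count_total[5m]))
```
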
!!! info

Lastly, it is also important to understand `__interval`. `__interval`
represents the time interval between data points, whereas `__range`
represents the selected time range for the query.
    Using Grafana enhances the visualization of Prometheus data and
    provides powerful querying capabilities. Within Grafana, apply filters,
    combine queries, and utilize variables for dynamic filtering. It's important
    to understand `__interval` (time interval between data points) and `__range`
    (selected time range) when working with Prometheus data in Grafana. This
    integration enables efficient data exploration and analysis for better
    insights and decision-making.