docs: Update documentation for CEEMS LB
Signed-off-by: Mahendra Paipuri <[email protected]>
mahendrapaipuri committed Dec 30, 2024
1 parent 130c54d commit 87ee3af
Showing 10 changed files with 179 additions and 49 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -40,7 +40,7 @@ managers (SLURM, Openstack, k8s)
- Provides targets using [HTTP Discovery Component](https://grafana.com/docs/alloy/latest/reference/components/discovery/discovery.http/)
to [Grafana Alloy](https://grafana.com/docs/alloy/latest) to continuously profile compute units
- Realtime access to metrics *via* Grafana dashboards
- Access control to Prometheus datasource in Grafana
- Access control to Prometheus and Pyroscope datasources in Grafana
- Stores aggregated metrics in a separate DB that can be retained for a long time
- CEEMS apps are [capability aware](https://tbhaxor.com/understanding-linux-capabilities/)

4 changes: 2 additions & 2 deletions build/config/ceems_lb/ceems_lb.yml
@@ -12,15 +12,15 @@
#
---
ceems_lb:
# Load balancing strategy. Three possibilities
# Load balancing strategy. Two possibilities
#
# - round-robin
# - least-connection
#
# Round robin and least connection are classic strategies and are
# self-explanatory.
#
strategy: resource-based
strategy: round-robin

# List of backends for each cluster
#
3 changes: 3 additions & 0 deletions build/package/ceems_exporter/ceems_exporter.service
@@ -18,6 +18,9 @@ StartLimitInterval=0

ProtectHome=read-only

# CEEMS Exporter is capability aware, which means it drops all unnecessary capabilities based on
# runtime configuration. Thus, these capabilities will not be set on the actual process if
# the collectors that need them are not enabled.
AmbientCapabilities=CAP_SYS_PTRACE CAP_DAC_READ_SEARCH CAP_SETUID CAP_SETGID CAP_DAC_OVERRIDE CAP_BPF CAP_PERFMON CAP_SYS_RESOURCE
CapabilityBoundingSet=CAP_SYS_PTRACE CAP_DAC_READ_SEARCH CAP_SETUID CAP_SETGID CAP_DAC_OVERRIDE CAP_BPF CAP_PERFMON CAP_SYS_RESOURCE
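Since the exporter drops capabilities it does not need at runtime, operators who enable only a few collectors can also narrow the set granted by systemd up front. The drop-in below is a hypothetical sketch — which capabilities each collector actually needs is an assumption here; verify against the exporter documentation before trimming:

```ini
# /etc/systemd/system/ceems_exporter.service.d/capabilities.conf
# Hypothetical override for a deployment that only needs perf events and
# reading restricted files; adjust to the collectors you actually enable.
[Service]
# An empty assignment resets the lists inherited from the main unit file
AmbientCapabilities=
CapabilityBoundingSet=
AmbientCapabilities=CAP_PERFMON CAP_DAC_READ_SEARCH
CapabilityBoundingSet=CAP_PERFMON CAP_DAC_READ_SEARCH
```

After adding the drop-in, run `systemctl daemon-reload` and restart the service for the narrowed set to take effect.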

2 changes: 1 addition & 1 deletion pkg/lb/base/base.go
@@ -11,7 +11,7 @@ const CEEMSLoadBalancerAppName = "ceems_lb"
// CEEMSLoadBalancerApp is kingpin CLI app.
var CEEMSLoadBalancerApp = *kingpin.New(
CEEMSLoadBalancerAppName,
"Prometheus load balancer to query from different instances.",
"CEEMS load balancer for TSDB and Pyroscope servers with access control support.",
)

// Backend defines backend server.
13 changes: 8 additions & 5 deletions website/docs/00-introduction.md
@@ -31,28 +31,31 @@ managers (SLURM, Openstack, k8s)
- Provides targets using [HTTP Discovery Component](https://grafana.com/docs/alloy/latest/reference/components/discovery/discovery.http/)
to [Grafana Alloy](https://grafana.com/docs/alloy/latest) to continuously profile compute units
- Realtime access to metrics *via* Grafana dashboards
- Access control to Prometheus datasource in Grafana
- Access control to Prometheus and Pyroscope datasources in Grafana
- Stores aggregated metrics in a separate DB that can be retained for a long time
- CEEMS apps are [capability aware](https://tbhaxor.com/understanding-linux-capabilities/)

## Components

CEEMS provides a set of components that enable operators to monitor the consumption of
CEEMS provides a set of components that enable operators and end users to monitor the consumption of
resources of the compute units of different resource managers like SLURM, Openstack and
Kubernetes.

- CEEMS Prometheus exporter is capable of exporting compute unit metrics including energy
consumption, performance, IO and network metrics from different resource managers in a
unified manner.
unified manner. In addition, CEEMS exporter is capable of providing targets to
[Grafana Alloy](https://grafana.com/docs/alloy/latest/reference/components/discovery/discovery.http/)
for continuously profiling compute units using
[eBPF](https://grafana.com/docs/alloy/latest/reference/components/pyroscope/pyroscope.ebpf/).

- CEEMS API server can store the aggregate metrics and metadata of each compute unit
originating from different resource managers.

- CEEMS load balancer provides basic access control on TSDB so that compute unit metrics
- CEEMS load balancer provides basic access control on TSDB and Pyroscope so that compute unit metrics
from different projects/tenants/namespaces are isolated.

"Compute Unit" in the current context has a wider scope. It can be a batch job in HPC,
a VM in cloud, a pod in k8s, _etc_. The main objective of the stack is to quantify
a VM in cloud, a pod in k8s, *etc*. The main objective of the stack is to quantify
the energy consumed and estimate emissions by each "compute unit". The repository itself
does not provide any frontend apps to show dashboards and it is meant to be used along
with Grafana and Prometheus to show statistics to users.
2 changes: 1 addition & 1 deletion website/docs/02-objectives.md
@@ -1,6 +1,6 @@
# Objectives

The objectives of the current stack are two-fold:
The current stack has several objectives:

- For end users to be able to monitor their compute units in real time. Besides the
conventional metrics like CPU usage, memory usage, _etc_, the stack also exposes
44 changes: 26 additions & 18 deletions website/docs/components/ceems-lb.md
@@ -6,13 +6,13 @@ sidebar_position: 3

## Background

The motivation behind creating CEEMS load balancer component is that Prometheus TSDB
do not enforce any sort of access control over its metrics querying. This means once
a user has been given the permissions to query a Prometheus TSDB, they can query _any_
metrics stored in the TSDB.
The motivation behind creating the CEEMS load balancer component is that neither Prometheus TSDB
nor Grafana Pyroscope enforces any sort of access control over metrics/profiles querying.
This means once a user has been given the permissions to query a Prometheus TSDB/Grafana
Pyroscope server, they can query _any_ metrics/profiles stored in the server.

Generally, it is not necessary to expose TSDB to end users directly and it is done
using Grafana as Prometheus datasource. Dashboards that are exposed to the end users
Generally, it is not necessary to expose the TSDB/Pyroscope server to end users directly; instead it is exposed
through Grafana as a Prometheus/Pyroscope datasource. Dashboards that are exposed to the end users
need to have query access on the underlying
datasource that the dashboard uses. Although a regular user with
[`Viewer`](https://grafana.com/docs/grafana/latest/administration/roles-and-permissions/access-control/#basic-roles)
@@ -23,28 +23,28 @@ This effectively means, the user can make _any_ query to the underlying datasour
Prometheus, using the browser cookie that is set by Grafana auth. The consequence is that
the user can query the metrics of _any_ user or _any_ compute unit. A straightforward
solution to this problem is to create a Prometheus instance for each project/namespace.
However, this is not a scalable solution when they are thousands of projects/namespaces
exist.
However, this is not a scalable solution when there are thousands of projects/namespaces
in a given deployment.

This can pose a few issues in multi-tenant systems like HPC and cloud computing platforms.
Ideally, we do not want one user to be able to access the compute unit metrics of
other users. The CEEMS load balancer component has been created to address this issue.

CEEMS Load Balancer addresses this issue by acting as a gatekeeper that introspects the
query before deciding whether to proxy the request to TSDB or not. It means when a user
makes a TSDB query for a given compute unit, CEEMS load balancer will check if the user
query before deciding whether to proxy the request to TSDB/Pyroscope or not. This means that when a user
makes a TSDB/Pyroscope query for a given compute unit, CEEMS load balancer will check if the user
owns that compute unit by verifying with CEEMS API server.
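The gatekeeping flow can be sketched roughly as follows. This is an illustrative sketch only — the `uuid` label, the in-memory ownership map (standing in for a verification round trip to the CEEMS API server), and all names are assumptions, not the actual CEEMS implementation:

```python
import re

# Hypothetical ownership map standing in for a verification round trip to the
# CEEMS API server; user names and unit IDs are invented for illustration.
OWNED_UNITS = {"alice": {"job-1001", "job-1002"}, "bob": {"job-2001"}}

def extract_unit_ids(promql: str) -> set:
    """Pull compute unit IDs out of a PromQL matcher such as uuid=~"job-1001|job-1002"."""
    match = re.search(r'uuid=~?"([^"]+)"', promql)
    return set(match.group(1).split("|")) if match else set()

def authorize(user: str, promql: str) -> bool:
    """Proxy the query only if every referenced unit belongs to the requesting user."""
    units = extract_unit_ids(promql)
    return bool(units) and units <= OWNED_UNITS.get(user, set())

print(authorize("alice", 'ceems_cpu_usage{uuid=~"job-1001|job-1002"}'))  # True
print(authorize("bob", 'ceems_cpu_usage{uuid=~"job-1001"}'))             # False
```

A query that references no unit at all, or any unit the user does not own, is rejected rather than proxied — the same fail-closed behaviour the load balancer aims for.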

## Objectives

The main objectives of the CEEMS load balancer are two-fold:

- To provide access control on the TSDB so that compute units of each project/namespace
- To provide access control on the TSDB/Pyroscope so that compute units of each project/namespace
are only accessible to the members of that project/namespace
- To provide basic load balancing for replicated TSDB instances.
- To provide basic load balancing for replicated TSDB/Pyroscope instances.

Thus, CEEMS load balancer can be configured as Prometheus data source in Grafana and
the load balancer will take care of routing traffic to backend TSDB instances and at
Thus, CEEMS load balancer can be configured as Prometheus and Pyroscope data sources in Grafana and
the load balancer will take care of routing traffic to backend TSDB/Pyroscope instances and at
the same time enforcing access control.

## Load balancing
@@ -53,6 +53,14 @@ CEEMS load balancer supports classic load balancing strategies like round-robin
connection methods. Besides these two, it supports a resource-based strategy
based on retention time. Let's take a look at this strategy in detail.

:::warning[WARNING]

The resource-based load balancing strategy is only supported for TSDB. It is not
supported for Pyroscope; when used, Pyroscope load balancing falls back to the
least-connection strategy.

:::

Taking Prometheus TSDB as an example, Prometheus advises using the local file system to store
the data. This ensures performance and data integrity. However, storing data on local
disk is not fault tolerant unless the data is replicated elsewhere. There are cloud native
@@ -77,12 +85,12 @@ then routing the request to either "hot" or "cold" instances of TSDB.
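Under the stated assumptions (invented backend names and retention windows), the hot/cold routing decision can be sketched as:

```python
from datetime import datetime, timedelta, timezone

# Invented backends: a "hot" TSDB with short retention on fast local disk and
# a "cold" TSDB with long retention backed by remote/object storage.
BACKENDS = [
    {"url": "http://tsdb-hot:9090", "retention": timedelta(days=30)},
    {"url": "http://tsdb-cold:9090", "retention": timedelta(days=365)},
]

def pick_backend(query_start: datetime) -> str:
    """Route to the smallest retention window that still covers the query start."""
    age = datetime.now(timezone.utc) - query_start
    covering = [b for b in BACKENDS if age <= b["retention"]]
    if covering:
        return min(covering, key=lambda b: b["retention"])["url"]
    # Nothing covers the range fully: fall back to the longest retention
    return max(BACKENDS, key=lambda b: b["retention"])["url"]

print(pick_backend(datetime.now(timezone.utc) - timedelta(days=7)))   # http://tsdb-hot:9090
print(pick_backend(datetime.now(timezone.utc) - timedelta(days=90)))  # http://tsdb-cold:9090
```

Recent queries land on the fast "hot" instance while queries reaching back beyond its retention are routed to the "cold" one, which is the intent of the resource-based strategy.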
## Multi cluster support

A single deployment of CEEMS load balancer is capable of load balancing traffic between
different replicated TSDB instances of multiple clusters. Imagine there are two different
different replicated TSDB/Pyroscope instances of multiple clusters. Imagine there are two different
clusters, one for SLURM and one for Openstack, in a DC. The SLURM cluster has two dedicated
TSDB instances where data is replicated between them and the same for Openstack cluster.
Thus, in total, there are four TSDB instances, two for SLURM cluster and two for
TSDB/Pyroscope instances where data is replicated between them and the same for Openstack cluster.
Thus, in total, there are four TSDB/Pyroscope instances, two for SLURM cluster and two for
Openstack cluster. A single instance of CEEMS load balancer can route the traffic
between these four different TSDB instances by targeting the correct cluster.
between these four different TSDB/Pyroscope instances by targeting the correct cluster.
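The two-cluster layout described above could be expressed with a configuration along these lines — the IDs, hostnames, and ports are invented for illustration and must match the `clusters` configuration of the CEEMS API server:

```yaml
ceems_lb:
  strategy: round-robin
  backends:
    # SLURM cluster with two replicated TSDB/Pyroscope instances
    - id: slurm-0
      tsdb_urls:
        - http://slurm-tsdb-0:9090
        - http://slurm-tsdb-1:9090
      pyroscope_urls:
        - http://slurm-pyro-0:4040
        - http://slurm-pyro-1:4040
    # Openstack cluster with its own pair of instances
    - id: os-0
      tsdb_urls:
        - http://os-tsdb-0:9090
        - http://os-tsdb-1:9090
      pyroscope_urls:
        - http://os-pyro-0:4040
        - http://os-pyro-1:4040
```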

However, in production with heavy traffic, a single instance of CEEMS load balancer
might not be an optimal solution. In that case, it is possible to deploy a dedicated
40 changes: 36 additions & 4 deletions website/docs/components/metrics.md
@@ -4,6 +4,38 @@ sidebar_position: 4

# CEEMS Exporter Metrics

CEEMS exporter ships multiple collectors, some of which are enabled by
default.

## Enabled by default

The following collectors are enabled by default:

- cpu
- meminfo
- rapl

## Disabled by default

The rest of the collectors and sub-collectors are disabled by default. Collectors
disabled by default are:

- ipmi_dcmi
- emissions
- slurm
- libvirt

Sub-collectors disabled by default are:

- ebpf.io-metrics
- ebpf.network-metrics
- perf.hardware-events
- perf.software-events
- perf.hardware-cache-events
- rdma.stats

## Metrics list

The following is the list of metrics exposed by CEEMS exporter along
with the labels for each metric and its description. The first column
shows the collector that each metric belongs to.
@@ -16,10 +48,10 @@ shows the collector that metric belongs to.
| meminfo | ceems_meminfo_MemTotal_bytes | hostname | Total memory in the current host. As reported in `/proc/meminfo` |
| meminfo | ceems_meminfo_MemFree_bytes | hostname | Total free memory in the current host. As reported in `/proc/meminfo` |
| meminfo | ceems_meminfo_MemAvailable_bytes | hostname | Total available memory in the current host. As reported in `/proc/meminfo` |
| ipmi | ceems_ipmi_dcmi_current_watts | hostname | Current power consumption reported by IPMI DCMI |
| ipmi | ceems_ipmi_dcmi_avg_watts | hostname | Average power consumption reported by IPMI DCMI within sampling period |
| ipmi | ceems_ipmi_dcmi_min_watts | hostname | Minimum power consumption reported by IPMI DCMI within sampling period |
| ipmi | ceems_ipmi_dcmi_max_watts | hostname | Maximum power consumption reported by IPMI DCMI within sampling period |
| ipmi_dcmi | ceems_ipmi_dcmi_current_watts | hostname | Current power consumption reported by IPMI DCMI |
| ipmi_dcmi | ceems_ipmi_dcmi_avg_watts | hostname | Average power consumption reported by IPMI DCMI within sampling period |
| ipmi_dcmi | ceems_ipmi_dcmi_min_watts | hostname | Minimum power consumption reported by IPMI DCMI within sampling period |
| ipmi_dcmi | ceems_ipmi_dcmi_max_watts | hostname | Maximum power consumption reported by IPMI DCMI within sampling period |
| rapl | ceems_rapl_package_joules_total | path, index | Current RAPL package energy value. Labels `index` and `path` give info about package details. |
| rapl | ceems_rapl_dram_joules_total | path, index | Current RAPL DRAM energy value. Labels `index` and `path` give info about package details. |
| rapl | ceems_rapl_core_joules_total | path, index | Current RAPL core energy value. Labels `index` and `path` give info about package details. |
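To collect these metrics, Prometheus needs a scrape job pointing at the exporter. A minimal sketch, assuming the exporter listens on port `9010` (adjust to your actual `--web.listen-address`) and hypothetical target names:

```yaml
scrape_configs:
  - job_name: ceems
    static_configs:
      - targets:
          - compute-node-0:9010
          - compute-node-1:9010
```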
84 changes: 77 additions & 7 deletions website/docs/configuration/ceems-lb.md
@@ -4,6 +4,13 @@ sidebar_position: 4

# CEEMS Load Balancer

CEEMS load balancer supports load balancing for TSDB and Pyroscope
servers. When both TSDB and Pyroscope backend servers are configured, CEEMS LB
will launch two different web servers listening on two different ports, one
for TSDB and one for Pyroscope.

## CEEMS Load Balancer Configuration

CEEMS Load Balancer configuration has one main section and two optional
sections. A basic skeleton of the configuration is as follows:

@@ -28,22 +35,25 @@ A valid sample
configuration file can be found in the
[repo](https://github.com/mahendrapaipuri/ceems/blob/main/build/config/ceems_lb/ceems_lb.yml).

## CEEMS Load Balancer Configuration

A sample CEEMS LB config file is shown below:

```yaml
ceems_lb:
strategy: resource-based
backends:
- id: slurm-0
tsdb_urls:
- http://localhost:9090
pyroscope_urls:
- http://localhost:4040
- id: slurm-1
tsdb_urls:
- http://localhost:9090
- id: slurm-2
pyroscope_urls:
- http://localhost:4040
```

- `strategy`: Load balancing strategy. Besides classical `round-robin` and
@@ -55,16 +65,44 @@ that has the data based on the time period in the query.
that the `id` in the backend must be the same `id` used in the
[Clusters Configuration](./ceems-api-server.md#clusters-configuration). This
is how CEEMS LB will know which cluster to target.
- `backends.tsdb_urls`: A list of TSDB servers that scrape metrics from this
- `backends.tsdb_urls`: A list of TSDB servers that scrape metrics from the
cluster identified by `id`.
- `backends.pyroscope_urls`: A list of Pyroscope servers that store profiling data from the
cluster identified by `id`.

:::warning[WARNING]

The `resource-based` strategy is only supported for TSDB; when used along with
Pyroscope, the load balancing strategy for Pyroscope servers defaults
to `least-connection`.

CEEMS LB is meant to be deployed in the same DMZ as the TSDB servers and hence it
does not support TLS for the backends.

:::

### CEEMS Load Balancer CLI configuration

By default, CEEMS LB servers listen on ports `9030` and `9040` when both
TSDB and Pyroscope backend servers are configured. To use
custom ports, the CLI flag `--web.listen-address` must be repeated to set the
ports for the TSDB and Pyroscope backends. For instance, for the sample config shown
above, the CLI arguments to launch LB servers on custom ports are:

```bash
ceems_lb --config.file config.yml --web.listen-address ":8000" --web.listen-address ":9000"
```

This will launch the TSDB load balancer listening on port `8000` and the Pyroscope load
balancer listening on port `9000`.

:::important[IMPORTANT]

When both TSDB and Pyroscope backend servers are configured, the first listen
address is attributed to TSDB and the second one to Pyroscope.

:::

### Matching `backends.id` with `clusters.id`

#### Using custom header
@@ -148,7 +186,7 @@ For instance, for `slurm-0` cluster the provisioned datasource
config for Grafana will look as follows:

```yaml
- name: CEEMS-LB
- name: CEEMS-TSDB-LB
type: prometheus
access: proxy
url: http://localhost:9030
@@ -164,10 +202,25 @@ config for Grafana will look as follows:
secureJsonData:
basicAuthPassword: <ceems_lb_basic_auth_password>
httpHeaderValue1: slurm-0
isDefault: true
```

assuming CEEMS LB is running at port 9030 on the same host as Grafana.
assuming CEEMS LB is running at port 9030 on the same host as Grafana. Similarly,
for Pyroscope the provisioned config must look like:

```yaml
- name: CEEMS-Pyro-LB
type: pyroscope
access: proxy
url: http://localhost:9040
basicAuth: true
basicAuthUser: ceems
jsonData:
httpHeaderName1: X-Ceems-Cluster-Id
secureJsonData:
basicAuthPassword: <ceems_lb_basic_auth_password>
httpHeaderValue1: slurm-0
```

Notice that we set the header and value in `jsonData` and `secureJsonData`,
respectively. This ensures that the datasource will send the header with
every request to CEEMS LB, and the LB will then redirect the query request
@@ -201,6 +254,23 @@ CEEMS LB, the query label will take the precedence.

:::

Similarly, to set this label on profiling data in Pyroscope,
it is necessary to use the `external_labels` config parameter of Grafana
Alloy when exporting profiles to the Pyroscope server. A sample config
for Grafana Alloy that pushes profiling data is as follows:

```river
pyroscope.write "monitoring" {
endpoint {
url = "http://pyroscope:4040"
}
external_labels = {
"ceems_id" = "slurm-0",
}
}
```

## CEEMS API Server Configuration

This is an optional config which, when provided, will enforce access
