This repository has been archived by the owner on Jul 2, 2024. It is now read-only.

Remove references to multi-instance dashboards (#8413)
Remove references to centralized observability
daxmc99 authored Jan 8, 2024
1 parent 6629fff commit f0e09af
Showing 3 changed files with 10 additions and 54 deletions.
@@ -42,7 +42,6 @@ Sourcegraph engineers can disable SMTP by setting the `.spec.managedSMTP.disable

- Alerting: [frontend: email_delivery_failures](https://docs.sourcegraph.com/admin/observability/alerts#frontend-email-delivery-failures)
- Dashboards: [Frontend: Email delivery](https://docs.sourcegraph.com/admin/observability/dashboards#frontend-email-delivery)
- [Multi-instance dashboard](../observability/index.md#multi-instance-dashboard): [Frontend: Total emails successfully delivered every 5 minutes](https://monitoring.sgdev.org/d/multi-instance-overviews/multi-instance-overviews?orgId=1)

### Vendor-side

17 changes: 2 additions & 15 deletions content/departments/cloud/technical-docs/observability/index.md
@@ -1,31 +1,18 @@
# Cloud Observability

Epic link: https://github.com/sourcegraph/customer/issues/1151

## Metrics

Metrics are gathered from all resources using the included Prometheus instance. This instance scrapes and stores the metrics locally as well as forwards them to the Managed Prometheus service provided by GCP.
Metrics are gathered from all resources using the included Prometheus instance. This instance scrapes and stores the metrics locally.

**Only metrics queried in our [monitoring generator](https://docs.sourcegraph.com/dev/background-information/observability/monitoring-generator) are forwarded - this allowlist is automatically generated.** If you'd like a new metric to be queryable in a centralized manner, you _must_ [create a dashboard panel](https://docs.sourcegraph.com/dev/how-to/add_monitoring#alerts-dashboards-and-documentation) for it.

These metrics are viewable through our centralised Grafana instance hosted at: https://monitoring.sgdev.org.

> [!NOTE] Access to these resources must be granted. To request access, follow [Requesting access to Grafana](./operations.md#requesting-access-to-grafana).
### Multi-instance dashboard

We generate a [multi-instance overviews dashboard](https://monitoring.sgdev.org/d/multi-instance-overviews/multi-instance-overviews) that renders the panels that opt in to it.

Panels in this dashboard show the panel's query grouped by `project_id`, each of which represents a Cloud instance. The template variable dropdown at the top allows you to select instances to compare; the selection is persisted to the URL.

To opt a panel in to this multi-instance dashboard, see [how to add monitoring](https://docs.sourcegraph.com/dev/how-to/add_monitoring#centralized-observability).
We no longer support multi-instance dashboards but the cloud-team is working on a replacement.

### Common operations

- To request access to the Grafana dashboard, follow [Requesting access to Grafana](./operations.md#requesting-access-to-grafana).
- To add a new dashboard to all managed instances, follow the [Creating a new individual dashboard](./operations.md#creating-a-new-individual-dashboard) procedure.
- To create a new aggregated dashboard that queries multiple cloud instances, follow the [Creating a new multi-instance dashboard](./operations.md#creating-a-new-multi-instance-dashboard) procedure.
- To manually refresh the dashboards on Grafana, follow the [Manually regenerate Grafana dashboards](./operations.md#manually-regenerate-grafana-dashboards) playbook.

## Tracing

@@ -2,49 +2,19 @@

## Requesting access to Grafana

Users who do not automatically have access to the Grafana instance can request access through [Entitle](https://entitle.io/). On Slack, type `/access_request` and hit enter. Fill out the form with the following values:
![Entitle Request Form](https://storage.googleapis.com/sourcegraph-assets/handbook/engineering/cloud/entitle-iap-request.png)
<!-- Describe how to access grafana for a non-cloud teammate today directly to the instance -->

A Cloud team or Security team member will then need to approve the request. If you require permanent access to Grafana, please post a message in the [#cloud channel](https://sourcegraph.slack.com/archives/C03JR7S7KRP) on Slack and ask a Cloud team member to provision access for you.
To access the Grafana dashboard for a single cloud customer:

## Granting a user permanent access to Grafana
1. Find the customer on https://cloud-ops.sgdev.org/ and go to the specific customer page.
1. Go to "View monitoring dashboards" for the specific instance.
1. When you attempt to access the dashboard with the given command, you may receive an error about access.
1. You should use Entitle to request access to the specific instance using this [form](https://app.entitle.io/request?data=eyJkdXJhdGlvbiI6IjM2MDAiLCJqdXN0aWZpY2F0aW9uIjoiQWNjZXNzIHRvIGNsb3VkIGluc3RhbmNlICQkSU5TRVJUIENMT1VEIElOU1RBTkNFIEhFUkUkJCQgZm9yIEdyYWZhbmEgZGFzaGJvYXJkIiwiYnVuZGxlSWRzIjpbImNlNTZlMGU2LTE1ZDYtNGYzYS05M2RmLWRkMjQxOGQzNzhlYyJdfQ%3D%3D)

User management is provisioned within GCP. To grant a new user permanent access to Grafana they will need to either be added to an approved group or have their identity specifically added to the IAP proxy.
## Multi-instance dashboard

To add a user, navigate to the [GCP Console IAP management page](https://console.cloud.google.com/security/iap?project=control-plane-5e9ee072) for Grafana. Click the check box and the provisioning page should appear on the right. From there, click "Add Principal" and add the user.

## Manually regenerate Grafana dashboards

Grafana dashboards are [generated when the `centralized-o11y` invariant](https://sourcegraph.sourcegraph.com/github.com/sourcegraph/controller/-/blob/internal/invariants/centralized_o11y.go) is run against an instance:

1. Cloud team members can run `mi2 instance check -e $ENVIRONMENT -s $SLUG -enforce centralized-o11y` locally. This will automatically generate an ID token, then generate and upload the dashboards to Grafana.
We no longer support multi-instance dashboards but the cloud-team is working on a replacement.

## Creating a new individual dashboard

The dashboards for Cloud customers are generated from the same [dashboard definitions](https://sourcegraph.com/github.com/sourcegraph/sourcegraph/-/tree/monitoring/definitions) that create the bundled dashboards included with all Sourcegraph distributions. To create a new dashboard that will be rolled out to all managed instances, follow the [Developing Observability](https://docs.sourcegraph.com/dev/background-information/observability) guidelines.

## Creating a new multi-instance dashboard

We are now able to see the value of a query applied to multiple instances at once. To create a dashboard that queries multiple customers at once, log into Grafana and use the native creation tools. It's recommended to start with an existing dashboard panel, click the title, and select "Explore". This will allow you to modify the prewritten query. All Cloud instances support the same set of metrics and are tagged with additional metadata to denote the customer.

To view the results for a specific subset of customers, duplicate the query and filter each result for a given customer by changing the `project_id=` label selector.
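
As a rough sketch of the step above, the duplicated queries might look like the following. It reuses the `pg_stat_activity` metric and the `sourcegraph-managed-sg` project ID from the example in this document; the second project ID, `sourcegraph-managed-example`, is purely hypothetical and stands in for whichever customer you want to compare against. Each query would be entered as its own query in the Grafana panel.

```
# Query A: first customer (project ID taken from the example in this document)
sum by (job) (pg_stat_activity{project_id="sourcegraph-managed-sg"})

# Query B: duplicate of query A with only the label selector changed
# (hypothetical project ID, used for illustration only)
sum by (job) (pg_stat_activity{project_id="sourcegraph-managed-example"})
```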

If the `project_id` is unknown for a given customer, follow the [FAQ: How do I figure out the GCP Project ID for a customer?](../../index.md#faq-how-do-i-figure-out-the-gcp-project-id-for-a-customer) instructions.

> **NOTE**: Custom-created dashboards _should_ persist through restarts; however, the Cloud team guarantees no SLAs. If a dashboard is mission-critical, please communicate with the Cloud team on getting it added as a permanent fixture. It's preferred that all dashboards are created in code and distributed as part of Sourcegraph itself.

Metrics that use Prometheus aggregation functions (like `sum by`) will need to be updated to include the `project_id` as a grouping field, e.g.:

```
sum by (job) (pg_stat_activity{project_id="sourcegraph-managed-sg"})
```

would become

```
sum by (job, project_id) (pg_stat_activity)
```

to show the metric for all instances, labeled by their `project_id`.

These dashboards will be pregenerated [in the future](https://github.com/sourcegraph/customer/issues/1610).
