From 68176dd5f4e83f76cb24120995d3bc2337032004 Mon Sep 17 00:00:00 2001 From: Joe Chen Date: Fri, 19 Jan 2024 09:58:18 -0500 Subject: [PATCH] sams: initial page content --- .../core-services/managed-services/index.md | 2 + .../teams/core-services/sams/index.md | 112 ++++++++++++++++++ 2 files changed, 114 insertions(+) create mode 100644 content/departments/engineering/teams/core-services/sams/index.md diff --git a/content/departments/engineering/teams/core-services/managed-services/index.md b/content/departments/engineering/teams/core-services/managed-services/index.md index 8e76c9040a86..a89c66c4a8a5 100644 --- a/content/departments/engineering/teams/core-services/managed-services/index.md +++ b/content/departments/engineering/teams/core-services/managed-services/index.md @@ -13,6 +13,8 @@ Guidance for MSP incidents is available in [Managed Services incident response]( ## [Telemetry Gateway](./telemetry-gateway.md) +## [Sourcegraph Accounts Management System](../sams/index.md) + ## Sourcergaph.com Google OIDC The GCP project that manages our [Google OIDC authentication integration](https://console.cloud.google.com/apis/credentials/oauthclient/394401733494-3ekkk0qr3qvg7b3l1imqcgsh3ej710eq.apps.googleusercontent.com?project=sourcegraph-com-ggl-oidc) ("Sign in with Google") for Sourcergraph.com. We put the integration in a standalone project for a dedicated public-facing [OAuth consent screen](https://console.cloud.google.com/apis/credentials/consent?project=sourcegraph-com-ggl-oidc). 
diff --git a/content/departments/engineering/teams/core-services/sams/index.md b/content/departments/engineering/teams/core-services/sams/index.md new file mode 100644 index 000000000000..79c59e1e524c --- /dev/null +++ b/content/departments/engineering/teams/core-services/sams/index.md @@ -0,0 +1,112 @@ +# Sourcegraph Accounts Management System (SAMS) + +[Sourcegraph Accounts Management System (SAMS)](https://docs.google.com/document/d/16F6uvfM9EknpcuAQQ8kIPOZ9gHo0Lx4lgprw_5sWJEs/edit) is the centralized accounts system for all of the Sourcegraph-operated systems. It provides: + +- A Single Sign-On (SSO) experience for users of those systems, and a user ID that can be referenced across systems. +- Out-of-the-box machine-to-machine authentication and authorization capabilities. + +It is compliant with the [OAuth 2](https://oauth.net/2/) and [OIDC](https://openid.net/) protocols but only exposes a subset of the full capabilities for security reasons. In particular, only the following flows are allowed: + +- [Authorization Code Flow](https://auth0.com/docs/get-started/authentication-and-authorization-flow/authorization-code-flow) +- [Refresh Token Flow](https://cloudentity.com/developers/basics/oauth-grant-types/refresh-token-flow/) +- [Client Credentials Flow](https://auth0.com/docs/get-started/authentication-and-authorization-flow/client-credentials-flow) + +The [OpenID Discovery](https://accounts.sourcegraph.com/.well-known/openid-configuration) endpoint lays out all the protocol details that a Service Provider (a.k.a. Relying Party) needs to know to integrate with SAMS. + +## Security measures + +Here is a non-exhaustive list of security measures that are notable to systems integrating with SAMS: + +1. All access tokens expire after **1 hour**, and refresh tokens are always issued together with access tokens. +1. All refresh tokens expire after **30 days**, and each refresh token can be used **at most once**. A new refresh token is always issued upon refreshing the access token. 
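For integrating systems, the two measures above mean that a client has to replace its stored refresh token every time it refreshes. The following is a minimal Python sketch of that bookkeeping, not SAMS client code: `TokenPair`, `RotatingTokenStore`, and the injected `refresh_fn` are hypothetical names, and the response fields (`access_token`, `refresh_token`, `expires_in`) are assumed to follow the standard OAuth 2 token response.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TokenPair:
    access_token: str
    refresh_token: str
    expires_at: float  # Unix timestamp at which the access token expires


class RotatingTokenStore:
    """Holds the latest token pair and refreshes it shortly before expiry.

    `refresh_fn` is a hypothetical callable that performs the Refresh Token
    Flow against the token endpoint and returns the parsed JSON response.
    """

    def __init__(
        self,
        initial: TokenPair,
        refresh_fn: Callable[[str], Dict],
        leeway_seconds: float = 60.0,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._pair = initial
        self._refresh_fn = refresh_fn
        self._leeway = leeway_seconds
        self._clock = clock

    def access_token(self) -> str:
        # Refresh slightly early so callers never receive a token that is
        # about to expire mid-request.
        if self._clock() >= self._pair.expires_at - self._leeway:
            response = self._refresh_fn(self._pair.refresh_token)
            # Each refresh token can be used at most once, so the old one is
            # discarded and the newly issued pair replaces it as a whole.
            self._pair = TokenPair(
                access_token=response["access_token"],
                refresh_token=response["refresh_token"],
                expires_at=self._clock() + float(response["expires_in"]),
            )
        return self._pair.access_token
```

A real integration would also need to persist the new refresh token before using the new access token, and to serialize concurrent refreshes, since a second refresh attempt with the same refresh token is expected to fail.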
+ +TODO: access token expiry + +## Service images + +Images are published to a private image repository, [`us-central1-docker.pkg.dev/sourcegraph-dev/sams/accounts-server`](https://console.cloud.google.com/artifacts/docker/sourcegraph-dev/us-central1/sams/accounts-server?project=sourcegraph-dev), on every commit in `main` using the `insiders` tag. To pull down the published images locally, you need to [request access via Entitle](https://app.entitle.io/request?data=eyJkdXJhdGlvbiI6IjEwODAwIiwianVzdGlmaWNhdGlvbiI6IlB1bGwgZG93biBkZXYgaW1hZ2VzIiwicm9sZUlkcyI6W3siaWQiOiJhM2ZmNTQ1ZC0zZGVmLTQxY2ItYjJiNy1lMTM2MDM5Y2YwZGYiLCJ0aHJvdWdoIjoiYTNmZjU0NWQtM2RlZi00MWNiLWIyYjctZTEzNjAzOWNmMGRmIiwidHlwZSI6InJvbGUifV19). + +Publishing resources are [provisioned in `sourcegraph/infrastructure`](https://github.com/sourcegraph/infrastructure/tree/main/managed-services/sams-publishing-pipeline). + +## Operations + +> [!NOTE] +> To get access to most resources, you’ll need to [request infrastructure access](https://app.entitle.io/request?targetType=resource&duration=43200&justification=TODO&integrationId=134476cb-0bd6-4c6d-a89f-e1550988bdd7&resourceId=d94da8c3-76eb-451a-9cbb-973ac3bc44b1&roleId=8b60a711-976c-4e56-9f8b-cb2c989faca4&grantMethodId=8b60a711-976c-4e56-9f8b-cb2c989faca4). 
+ +Here is a list of useful quick links: + +- Prod instance (https://accounts.sourcegraph.com) + - [Terraform Cloud workspaces](https://app.terraform.io/app/sourcegraph/workspaces?project=prj-qWcQcoN16iA6rMfe) + - [Cloud Run (metrics overview)](https://console.cloud.google.com/run/detail/us-central1/sams/metrics?project=sams-prod-ywuz) + - [Cloud SQL (system insights)](https://console.cloud.google.com/sql/instances/postgresql-e03b/system-insights?project=sams-prod-ywuz) + - [Memorystore (monitoring)](https://console.cloud.google.com/memorystore/redis/locations/us-central1/instances/redis/details/monitoring?project=sams-prod-ywuz) + - [GCP alerts](https://console.cloud.google.com/monitoring/alerting?project=sams-prod-ywuz) + - [GCP errors](https://console.cloud.google.com/errors;service=;version=?project=sams-prod-ywuz) +- Testing instance (https://accounts.sgdev.org) + - [Terraform Cloud workspaces](https://app.terraform.io/app/sourcegraph/workspaces?project=prj-XWBtUm77JJRXddoZ) + - [Cloud Run (metrics overview)](https://console.cloud.google.com/run/detail/us-central1/sams/metrics?project=sams-dev-bfec) + - [Cloud SQL (system insights)](https://console.cloud.google.com/sql/instances/postgresql-e03b/system-insights?project=sams-dev-bfec) + - [Memorystore (monitoring)](https://console.cloud.google.com/memorystore/redis/locations/us-central1/instances/redis/details/monitoring?project=sams-dev-bfec) + - [GCP alerts](https://console.cloud.google.com/monitoring/alerting?project=sams-dev-bfec) + - [GCP errors](https://console.cloud.google.com/errors;service=;version=?project=sams-dev-bfec) + +For standard infrastructure operations, see [Sourcegraph Accounts infrastructure operations](../../../managed-services/sams.md). 
+ +### Infrastructure access + +The following Entitle requests are needed to get access to SAMS service infrastructure: + +- [GCP Project - MSP Service Editor](https://app.entitle.io/request?targetType=resource&duration=43200&justification=TODO&integrationId=134476cb-0bd6-4c6d-a89f-e1550988bdd7&resourceId=d94da8c3-76eb-451a-9cbb-973ac3bc44b1&roleId=8b60a711-976c-4e56-9f8b-cb2c989faca4&grantMethodId=8b60a711-976c-4e56-9f8b-cb2c989faca4) (see [MSP Entitle](../managed-services/platform.md#entitle)) + +### Deployments + +The SAMS service infrastructure is defined in [`sourcegraph/managed-services/services/sams`](https://github.com/sourcegraph/managed-services/tree/main/services/sams) using the [Managed Services Platform](../managed-services/platform.md). + +#### Modify deployment manifest + +> [!WARNING] +> Because the [Managed Services Platform](../managed-services/platform.md) is still at an early stage, we have yet to roll out a standardized playbook. Please reach out to #team-core-services if you need to modify the deployment manifest. The instructions in this section generally assume that some upfront setup has been completed. + +To modify the deployment manifest: + +1. Update the `service.yaml` file +1. In the repository root, run `sg msp generate sams prod` +1. Stage the changes and open a pull request +1. Terraform Cloud rolls out the changes + +#### Use a different image tag + +To specify a Docker image tag other than the default, update the `service.yaml`: + +```diff + - id: prod + ... + deploy: + type: manual ++ manual: ++ tag: insiders@sha256:3a7e1c0dd4e0d7e0c6d3e4d7b3a1 +``` + +#### Re-deploy the same manifest + +Go to the ["Deploy revision" page](https://console.cloud.google.com/run/deploy/us-central1/sams?project=sams-prod-ywuz) of the Cloud Run service and click **DEPLOY** (bottom of the page) without changing any configuration. A re-deploy also happens whenever a Terraform change is applied to the "cloudrun" stack. 
+ +### Observability + +> [!NOTE] +> To get access to most resources, you’ll need to [request infrastructure access](#infrastructure-access). + +#### Alerting + +Alerts are sent to Sentry and then forwarded to Slack: + +- #alerts-sams-dev for accounts.sgdev.org +- #alerts-sams-prod for accounts.sourcegraph.com + +#### Metrics + +The deployment's [Cloud Run metrics overview page](https://console.cloud.google.com/run/detail/us-central1/sams/metrics?project=sams-prod-ywuz) provides basic, out-of-the-box observability into the service, such as instance count and resource utilization. + +## Development + +The source code and CI are located in the [sourcegraph/sams](https://github.com/sourcegraph/sams) GitHub repository.