We maintain multiple flavors of infrastructure with various degrees of management.
GCP infrastructure is configured via terraform in the infrastructure repository. All configuration for security projects should be stored in the security subdirectory. Please adhere to this terraform style guide when working here.
For instructions on how to deploy this infrastructure, see GCP Deployment Playbooks.
To deploy logging infrastructure, see the Deploying logging infrastructure playbook.
Logging configuration exists in many different places at present, which makes it complex.
- `pubsub.tf` in the cloud and dogfood directories of the infrastructure repository.
  - This should remain mostly static, but the filter may change as filtering rules are refined, and additional logging sinks may be added for the staging environment.
  - Note that not all cloud pubsub configuration belongs to security.
  - This creates a logging sink for cloud and dogfood which sends logs via pub/sub to the security project.
- `gke-logging.tf` in the cloud and dogfood directories of the infrastructure repository.
  - This should remain static.
  - This deploys the gke-logging module to the k8s cluster.
- The `gke-logging` module in the modules folder.
  - This should remain static.
  - This module pushes GKE node audit logs to stackdriver.
  - Reference configuration
- The logging folder in the security project.
  - This contains the GCP configuration for the logging projects owned by security.
- The helm directory in the security project.
  - This contains the configuration for all helm deployments for security.
  - Currently, this is only pubsubbeats.
- Elastic Cloud's production logging deployment.
  - Elastic Cloud manages our logs, as well as the retention policy on our log data.
  - This is expected to be re-configured whenever new sources of logs are added, and monitored to ensure it doesn't run out of disk space.
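As a rough sketch of what the logging sink configuration above looks like, a project-level sink exporting logs to a Pub/Sub topic might be declared along these lines. All names, project IDs, and the filter here are illustrative, not the actual configuration:

```hcl
# Hypothetical sketch of a logging sink that exports a project's logs
# to a Pub/Sub topic in the security project. Resource names, project
# IDs, and the filter are placeholders.
resource "google_logging_project_sink" "security_sink" {
  name        = "security-log-sink"
  project     = "sourcegraph-dev"
  destination = "pubsub.googleapis.com/projects/security-logging/topics/logs"

  # The filter may change as filtering rules are refined.
  filter = "severity >= WARNING"

  # Create a dedicated service account identity for writing to the topic;
  # that identity must be granted publish permission on the topic.
  unique_writer_identity = true
}
```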
To be implemented in #17281.
This will likely be similar to the logging infrastructure above.
This section covers the tools we use to generate and deploy Kubernetes configuration to GKE.
Instead of using helm directly, we use helmfile. Helmfile allows basic script execution as part of the templating process, which we use to decrypt the secrets for pubsubbeats. It also supports conditional configuration based on the deployment environment, which makes it harder to accidentally desynchronize the staging and production configurations.
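To illustrate the two features mentioned above, a minimal helmfile.yaml might look roughly like this. The release name, chart path, values files, and decryption script are all hypothetical, not our actual configuration:

```yaml
# Illustrative helmfile.yaml sketch; names and paths are placeholders.
environments:
  staging: {}
  production: {}

releases:
  - name: pubsubbeat
    chart: ./charts/pubsubbeat
    values:
      # Per-environment values files keep staging and production
      # configuration in one place, reducing drift between them.
      - values/{{ .Environment.Name }}.yaml
      # Script execution during templating, e.g. to decrypt a secret
      # with a (hypothetical) helper script.
      - credentials: {{ exec "./scripts/decrypt-secret" (list "pubsubbeat") | quote }}
```

Running `helmfile -e staging apply` would then render and deploy the staging variant of the same release definition.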
Kubectl is used to interact with Kubernetes clusters. For basic information, see the existing kubectl tips and tricks document; the linked documents include examples of how kubectl is used for configuration and debugging.
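For quick reference, a few common kubectl invocations for inspecting a deployment look like this. The context, namespace, and resource names are hypothetical examples, and these commands assume access to a running cluster:

```shell
# Illustrative kubectl commands; context, namespace, and names are placeholders.
kubectl config use-context my-gke-cluster       # switch to the target cluster
kubectl -n logging get pods                     # list pods in a namespace
kubectl -n logging logs deploy/pubsubbeat       # view a deployment's logs
kubectl -n logging describe pod pubsubbeat-0    # inspect events for debugging
```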
These are security's current GCP projects, and what they do.
For instructions on how to deploy these projects, see GKE Deployment Playbooks.
Currently ingests all stackdriver logs from the projects sourcegraph-dev (cloud) and sourcegraph-dogfood (dogfood). It will later ingest logs from other sources using additional deployments within the cluster.
To be implemented in #17281.
This is a testbed that allows us to test changes to logging without risking production logs. It pushes logs to the staging logging environment, so that they don't pollute production logs in Elastic.
Currently unused. This will eventually contain a HashiCorp Vault instance for secret management, though that may change depending on the state of Managed Vault; we may transition to using a managed Vault service instead.
Unmaintained and to be deleted - purely used as a testbed for vault. Do not add production secrets to this instance.
We currently use elastic cloud to store centralized security logs. This allows us to avoid the overhead of managing it ourselves, while getting something that's reasonably performant and stable.
Elastic cloud web portal is here. Credentials are stored in 1Password.
Currently contains all stackdriver logs from the GCP projects sourcegraph-dogfood and sourcegraph-dev. Note that stackdriver also contains OS audit logs from GKE nodes on the primary GKE clusters for those projects, due to the aforementioned gke-logging module being deployed in them as part of our logging infrastructure.
Note that the pubsubbeat index lifecycle policy sets a maximum index size of 50GB, with rollover enabled.
Note that the index refresh interval is 30 seconds.
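For reference, an Elasticsearch index lifecycle policy matching the rollover setting above might look like the following. The policy structure is standard ILM; treat this as a sketch rather than our exact policy:

```json
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_size": "50gb"
          }
        }
      }
    }
  }
}
```

The 30-second refresh interval corresponds to the index setting `index.refresh_interval: 30s`.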
To be implemented in #17281.
This section is a placeholder, since we may or may not use this service.