Commit
2.0 Docs Compendium (#870)
Signed-off-by: Michael Dresser <[email protected]>
Co-authored-by: Steven Weber <[email protected]>
Co-authored-by: Mike Murphy <[email protected]>
Co-authored-by: Mike Murphy <mike@kubecost>
Co-authored-by: Jason Charcalla <[email protected]>
Co-authored-by: Michael Dresser <[email protected]>
Co-authored-by: Thomas Nguyen <[email protected]>
Co-authored-by: thomasvn <[email protected]>
Co-authored-by: jesse goodier <[email protected]>
9 people authored Jan 30, 2024
1 parent c91c945 commit 75bf70f
Showing 46 changed files with 821 additions and 305 deletions.
8 changes: 7 additions & 1 deletion SUMMARY.md
@@ -25,6 +25,7 @@
* [Multi-Cluster](install-and-configure/install/multi-cluster/multi-cluster.md)
* [ETL Federation (preferred)](install-and-configure/install/multi-cluster/federated-etl/federated-etl.md)
* [Kubecost Aggregator](install-and-configure/install/multi-cluster/federated-etl/aggregator.md)
* [Migration Guide from Thanos to Kubecost 2.0 (Aggregator)](install-and-configure/install/multi-cluster/federated-etl/thanos-migration-guide.md)
* [Backups and Alerting](install-and-configure/install/multi-cluster/federated-etl/federated-etl-backups-alerting.md)
* [Thanos Federation](install-and-configure/install/multi-cluster/thanos-setup/thanos-setup.md)
* [Configuring Thanos](install-and-configure/install/multi-cluster/thanos-setup/configuring-thanos.md)
@@ -34,8 +35,9 @@
* [AWS Thanos IAM Policy](install-and-configure/install/multi-cluster/long-term-storage-configuration/aws-service-account-thanos.md)
* [Azure Long-Term Storage](install-and-configure/install/multi-cluster/long-term-storage-configuration/long-term-storage-azure.md)
* [GCP Long-Term Storage](install-and-configure/install/multi-cluster/long-term-storage-configuration/long-term-storage-gcp.md)
* [Mulit-Cluster Diagnostics](install-and-configure/install/multi-cluster/multi-cluster-diagnostics.md)
* [Multi-Cluster Diagnostics](install-and-configure/install/multi-cluster/multi-cluster-diagnostics.md)
* [Secondary Clusters Guide](install-and-configure/install/multi-cluster/secondary-clusters.md)
* [Kubecost 2.0 Install/Upgrade](install-and-configure/install/kubecostv2.md)
* [ETL Backup](install-and-configure/install/etl-backup/etl-backup.md)
* [Sharing ETL Backups](install-and-configure/install/etl-backup/sharing-etl-backups.md)
* [Query Service Replicas](install-and-configure/install/etl-backup/query-service-replicas.md)
@@ -89,6 +91,8 @@
* [Assets Dashboard](using-kubecost/navigating-the-kubecost-ui/assets.md)
* [Clusters Dashboard](using-kubecost/navigating-the-kubecost-ui/clusters-dashboard.md)
* [Cloud Cost Explorer](using-kubecost/navigating-the-kubecost-ui/cloud-costs-explorer.md)
* [Network Monitoring](using-kubecost/navigating-the-kubecost-ui/network-monitoring.md)
* [Collections](using-kubecost/navigating-the-kubecost-ui/collections.md)
* [Reports](using-kubecost/navigating-the-kubecost-ui/saved-reports/reports.md)
* [Advanced Reporting](using-kubecost/navigating-the-kubecost-ui/saved-reports/advanced-reports.md)
* [Cost Center Report](using-kubecost/navigating-the-kubecost-ui/saved-reports/cost-center-report.md)
@@ -108,6 +112,8 @@
* [Cluster Health Score](using-kubecost/navigating-the-kubecost-ui/cluster-health-score.md)
* [Budgets](using-kubecost/navigating-the-kubecost-ui/budgets.md)
* [Audits](using-kubecost/navigating-the-kubecost-ui/audits.md)
* [Anomaly Detection](using-kubecost/navigating-the-kubecost-ui/anomaly-detection.md)
* [Teams](using-kubecost/navigating-the-kubecost-ui/teams.md)
* [Contexts](using-kubecost/context-switcher.md)
* [Kubecost Data Audit](using-kubecost/kubecost-data-audit/README.md)
* [AWS/Kubecost Data Audit](using-kubecost/kubecost-data-audit/aws-kubecost-data-audit.md)
Binary file added images/aggregator/aggregator-diagram.png
Binary file added images/anomalydetection.png
Binary file added images/collections.png
Binary file added images/crss.png
Binary file added images/forecasting.png
Binary file removed images/leader-follower.png
Binary file added images/networkmonitoring.png
Binary file added images/networkmonitoring2.png
Binary file added images/newcollection.png
120 changes: 61 additions & 59 deletions install-and-configure/advanced-configuration/high-availability.md
@@ -1,59 +1,61 @@
# High Availability Kubecost

{% hint style="info" %}
High availability mode is only officially supported on Kubecost Enterprise plans.
{% endhint %}

Running Kubecost in high availability (HA) mode is a feature that relies on multiple Kubecost replica pods implementing the [ETL Bucket Backup](/install-and-configure/install/etl-backup/etl-backup.md) feature combined with a Leader/Follower implementation which ensures that there always exists exactly one leader across all replicas.

## Leader + Follower

The Leader/Follower implementation leverages a `coordination.k8s.io/v1` `Lease` resource to manage the election of a leader when necessary. To control access of the backup from the ETL pipelines, a `RWStorageController` is implemented to ensure the following:

* Followers block on all backup reads, and poll bucket storage for any backup reads every 30 seconds.
* Followers no-op on any backup writes.
* Followers who receive Queries in a backup store will not stack on pending reads, preventing external queries from blocking.
* Followers promoted to Leader will drop all locks and receive write privileges.
* Leaders behave identically to a single Kubecost install.

![Leader/Follower](/images/leader-follower.png)

## Configuring high availability

In order to enable the leader/follower and HA features, the following must also be configured:

* Replicas are set to a value greater than 1
* ETL FileStore is Enabled (enabled by default)
* [ETL Bucket Backup](/install-and-configure/install/etl-backup/etl-backup.md) is configured

For example, using our Helm chart, the following is an acceptable configuration:

```bash
helm install kubecost kubecost/cost-analyzer --namespace kubecost \
--set kubecostDeployment.leaderFollower.enabled=true \
--set kubecostDeployment.replicas=5 \
--set kubecostModel.etlBucketConfigSecret=kubecost-bucket-secret
```

This can also be done in the `values.yaml` file within the chart:

```yaml
kubecostModel:
  image: "gcr.io/kubecost1/cost-model"
  imagePullPolicy: Always
  # ...
  # ETL should be enabled with etlFileStoreEnabled: true
  etl: true
  etlFileStoreEnabled: true
  # ...
  # ETL Bucket Backup should be configured by passing the configuration secret name
  etlBucketConfigSecret: kubecost-bucket-secret

# Used for HA mode in Enterprise tier
kubecostDeployment:
  # Select a number of replicas of Kubecost pods to run
  replicas: 5
  # Enable Leader/Follower Election
  leaderFollower:
    enabled: true
```
# High Availability Kubecost

{% hint style="warning" %}
High availability mode is no longer supported as of Kubecost 2.0.
{% endhint %}

{% hint style="info" %}
High availability mode is only officially supported on Kubecost Enterprise plans.
{% endhint %}

Running Kubecost in high availability (HA) mode is a feature that relies on multiple Kubecost replica pods implementing the [ETL Bucket Backup](/install-and-configure/install/etl-backup/etl-backup.md) feature combined with a Leader/Follower implementation which ensures that there always exists exactly one leader across all replicas.

## Leader + Follower

The Leader/Follower implementation leverages a `coordination.k8s.io/v1` `Lease` resource to manage the election of a leader when necessary. To control access of the backup from the ETL pipelines, a `RWStorageController` is implemented to ensure the following:

* Followers block on all backup reads, and poll bucket storage for any backup reads every 30 seconds.
* Followers no-op on any backup writes.
* Followers who receive Queries in a backup store will not stack on pending reads, preventing external queries from blocking.
* Followers promoted to Leader will drop all locks and receive write privileges.
* Leaders behave identically to a single Kubecost install.
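
Because the election is implemented with a standard Kubernetes `Lease`, you can inspect the current leadership state directly with `kubectl`. A minimal sketch; the exact lease name Kubecost creates may vary by install:

```sh
# List Lease objects in the kubecost namespace and check who holds them.
kubectl get lease -n kubecost
kubectl describe lease -n kubecost | grep -i 'holder'
```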

## Configuring high availability

In order to enable the leader/follower and HA features, the following must also be configured:

* Replicas are set to a value greater than 1
* ETL FileStore is Enabled (enabled by default)
* [ETL Bucket Backup](/install-and-configure/install/etl-backup/etl-backup.md) is configured

For example, using our Helm chart, the following is an acceptable configuration:

```bash
helm install kubecost kubecost/cost-analyzer --namespace kubecost \
--set kubecostDeployment.leaderFollower.enabled=true \
--set kubecostDeployment.replicas=5 \
--set kubecostModel.etlBucketConfigSecret=kubecost-bucket-secret
```

This can also be done in the `values.yaml` file within the chart:

```yaml
kubecostModel:
  image: "gcr.io/kubecost1/cost-model"
  imagePullPolicy: Always
  # ...
  # ETL should be enabled with etlFileStoreEnabled: true
  etl: true
  etlFileStoreEnabled: true
  # ...
  # ETL Bucket Backup should be configured by passing the configuration secret name
  etlBucketConfigSecret: kubecost-bucket-secret

# Used for HA mode in Enterprise tier
kubecostDeployment:
  # Select a number of replicas of Kubecost pods to run
  replicas: 5
  # Enable Leader/Follower Election
  leaderFollower:
    enabled: true
```
@@ -40,12 +40,12 @@ Lowering query resolution will reduce memory consumption but will cause short ru

Fewer data points scraped from Prometheus means less data to collect and store, at the cost of Kubecost making estimations that possibly miss spikes of usage or short running pods. The default value is: `60s`. This can be tuned in our [Helm values](https://github.com/kubecost/cost-analyzer-helm-chart/blob/v1.93.2/cost-analyzer/values.yaml#L389) for the Prometheus scrape job.
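
For example, to double the interval to `120s` via Helm (a sketch; the value path follows the bundled Prometheus subchart, so confirm it against the values.yaml of the chart version you run):

```sh
# Scrape every 120s instead of the 60s default.
helm upgrade kubecost kubecost/cost-analyzer --namespace kubecost \
  --reuse-values \
  --set prometheus.server.global.scrape_interval=120s
```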

## Disable or stop scraping node exporter
## Keep node exporter disabled

Node-exporter is optional. Some health alerts will be disabled if node-exporter is disabled, but savings recommendations and core cost allocation will function normally. This can be disabled with the following [Helm values](https://github.com/kubecost/cost-analyzer-helm-chart/blob/v1.93.2/cost-analyzer/values.yaml#L442):
Node-exporter is disabled by default and is optional. Some health alerts will be disabled if node-exporter is disabled, but savings recommendations and core cost allocation will function normally. You can enable node-exporter with the following [Helm values](https://github.com/kubecost/cost-analyzer-helm-chart/blob/v1.93.2/cost-analyzer/values.yaml#L442):

* `--set prometheus.server.nodeExporter.enabled=false`
* `--set prometheus.serviceAccounts.nodeExporter.create=false`
* `--set prometheus.server.nodeExporter.enabled=true`
* `--set prometheus.serviceAccounts.nodeExporter.create=true`
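
Applied with `helm upgrade`, that looks like the following (a sketch assuming a release named `kubecost` in the `kubecost` namespace, as in the install examples above):

```sh
# Enable node-exporter and its service account on an existing release.
helm upgrade kubecost kubecost/cost-analyzer --namespace kubecost \
  --reuse-values \
  --set prometheus.server.nodeExporter.enabled=true \
  --set prometheus.serviceAccounts.nodeExporter.create=true
```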

## Soft memory limit field

@@ -116,7 +116,15 @@ Use [your browser's devtools](https://developer.chrome.com/docs/devtools/network

### Option 2: Review logs, and decode your JWT tokens

If `kubecostAggregator.enabled` is `true` or unspecified in `values.yaml`:
```sh
kubectl logs statefulsets/kubecost-aggregator
kubectl logs deploy/kubecost-cost-analyzer
```

If `kubecostAggregator.enabled` is `false` in `values.yaml`:
```sh
kubectl logs services/kubecost-aggregator
kubectl logs deploy/kubecost-cost-analyzer
```
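
To inspect the claims inside a JWT pulled from the logs or from your browser, you can decode its payload segment. A rough sketch, assuming `jq` is installed locally:

```sh
# Decode the payload (second dot-separated segment) of a JWT.
TOKEN='<paste your JWT here>'
# JWTs are base64url-encoded; translate to standard base64 characters.
PAYLOAD=$(echo "$TOKEN" | cut -d '.' -f 2 | tr '_-' '/+')
# Pad to a multiple of 4 characters so base64 accepts it.
echo "${PAYLOAD}==" | base64 -d 2>/dev/null | jq .
```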

@@ -133,6 +141,10 @@
```yaml
kubecostModel:
  extraEnv:
    - name: LOG_LEVEL
      value: debug
kubecostAggregator:
  extraEnv:
    - name: LOG_LEVEL
      value: debug
```
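
The same environment variables can also be set from the command line rather than editing `values.yaml`. A sketch using Helm's array-index `--set` syntax:

```sh
helm upgrade kubecost kubecost/cost-analyzer --namespace kubecost \
  --reuse-values \
  --set 'kubecostModel.extraEnv[0].name=LOG_LEVEL' \
  --set 'kubecostModel.extraEnv[0].value=debug' \
  --set 'kubecostAggregator.extraEnv[0].name=LOG_LEVEL' \
  --set 'kubecostAggregator.extraEnv[0].value=debug'
```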

For further assistance, reach out to [email protected] and provide both logs and a [HAR file](https://support.google.com/admanager/answer/10358597?hl=en).
@@ -57,7 +57,9 @@ All SAML 2.0 providers also work. The above guides can be used as templates for
## Using the Kubecost API
When SAML SSO is enabled in Kubecost, ports 9090 and 9003 of `service/kubecost-cost-analyzer` will require authentication. Therefore user API requests will need to be authenticated with a token. The token can be obtained by logging into the Kubecost UI and copying the token from the browser’s local storage. Alternatively, a long-term token can be issued to users from your identity provider.
When SAML SSO is enabled in Kubecost, the following ports will require authentication:
- `service/kubecost-cost-analyzer`: ports 9003 and 9090
- `statefulset/kubecost-aggregator` or `service/kubecost-aggregator`: port 9004

@@ -66,11 +68,18 @@
{% code overflow="wrap" %}
```sh
curl -L 'http://kubecost.mycompany.com/model/allocation?window=1d' \
  -H "Authorization: Bearer <token>"
```
{% endcode %}

For admins, Kubecost additionally exposes an unauthenticated API on port 9004 of `service/kubecost-cost-analyzer`.
For admins, Kubecost additionally exposes unauthenticated APIs:

`service/kubecost-cost-analyzer`: port 9007
```sh
kubectl port-forward service/kubecost-cost-analyzer 9004:9004
curl -L 'localhost:9004/allocation?window=1d'
kubectl port-forward service/kubecost-cost-analyzer 9007:9007
curl -L 'localhost:9007/allocation?window=1d'
```

`service/kubecost-aggregator`: port 9008
```sh
kubectl port-forward service/kubecost-aggregator 9008:9008
curl -L 'localhost:9008/allocation?window=1d'
```

## View your SAML Group
@@ -79,12 +88,12 @@ You will be able to view your current SAML Group in the Kubecost UI by selecting

## SAML troubleshooting guide

1. Disable SAML and confirm that the `cost-analyzer` pod starts.
2. If step 1 is successful, but the pod is crashing or never enters the ready state when SAML is added, it is likely that there is panic loading or parsing SAML data.

`kubectl logs deployment/kubecost-cost-analyzer -c cost-model -n kubecost`
1. Disable SAML and confirm the `cost-analyzer` pod starts. If `kubecostAggregator.enabled` is unspecified or `true` in the _values.yaml_ file, confirm that the `aggregator` pod starts.
2. If Step 1 is successful, but the pod is crashing or never enters the ready state when SAML is added, it is likely there is panic when loading or parsing SAML data.
- If `kubecostAggregator.enabled` is `true` or unspecified in _values.yaml_, run `kubectl logs statefulsets/kubecost-aggregator` and `kubectl logs deploy/kubecost-cost-analyzer`
- If `kubecostAggregator.enabled` is `false` in _values.yaml_, run `kubectl logs services/kubecost-aggregator` and `kubectl logs deploy/kubecost-cost-analyzer`

If you’re supplying the SAML from the address of an Identity Provider Server, `curl` the SAML metadata endpoint from within the Kubecost pod and ensure that a valid XML EntityDescriptor is being returned and downloaded. The response should be in this format:
If you’re supplying the SAML metadata from the address of an Identity Provider server, `curl` the SAML metadata endpoint from within the `kubecost` pod and ensure that a valid XML EntityDescriptor is being returned and downloaded.
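
A minimal way to run that check from inside the pod (a sketch assuming `curl` is available in the container; the metadata URL is a placeholder for your IdP's actual endpoint):

```sh
# Fetch the IdP metadata from within the cost-analyzer pod.
# Replace the URL with your identity provider's metadata endpoint.
kubectl exec -n kubecost deploy/kubecost-cost-analyzer -c cost-model -- \
  curl -sL 'https://idp.example.com/app/metadata'
```

The response should be in this format: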

{% code overflow="wrap" %}
```bash
@@ -154,14 +154,24 @@ kubectl delete configmap -n kubecost group-filters && kubectl create configmap -
## Troubleshooting
You can look at the logs on the cost-model container. This script is currently a work in progress.
You can look at the logs on the aggregator and cost-model containers. This script is currently a work in progress.
If `kubecostAggregator.enabled` is `true` or unspecified in _values.yaml_:
{% code overflow="wrap" %}
```
kubectl logs deployment/kubecost-cost-analyzer -c cost-model --follow |grep -v -E 'resourceGroup|prometheus-server'|grep -i -E 'group|xmlname|saml|login|audience'
```
{% endcode %}
If `kubecostAggregator.enabled` is `false` in _values.yaml_:
{% code overflow="wrap" %}
```
kubectl logs services/kubecost-aggregator --follow |grep -v -E 'resourceGroup|prometheus-server'|grep -i -E 'group|xmlname|saml|login|audience'
```
{% endcode %}
When the group has been matched, you will see:
```
@@ -181,11 +181,32 @@ saml:

## Troubleshooting

You can view the logs on the cost-model container. In this example, the assumption is that the prefix for Kubecost groups is `kubecost_`. This command is currently a work in progress.

You can look at the logs on the aggregator and cost-model containers. In this example, the assumption is that the prefix for Kubecost groups is `kubecost_`. This script is currently a work in progress.

`kubectl logs deployment/kubecost-cost-analyzer -c cost-model --follow |grep -v -E 'resourceGroup|prometheus-server'|grep -i -E 'group|xmlname|saml|login|audience|kubecost_'`

{% code overflow="wrap" %}
```
kubectl logs deployment/kubecost-cost-analyzer -c cost-model --follow |grep -v -E 'resourceGroup|prometheus-server'|grep -i -E 'group|xmlname|saml|login|audience|kubecost_'
```
{% endcode %}
If `kubecostAggregator.enabled` is `true` or unspecified in _values.yaml_:
{% code overflow="wrap" %}
```
kubectl logs statefulsets/kubecost-aggregator --follow |grep -v -E 'resourceGroup|prometheus-server'|grep -i -E 'group|xmlname|saml|login|audience|kubecost_'
```
{% endcode %}
If `kubecostAggregator.enabled` is `false` in _values.yaml_:
{% code overflow="wrap" %}
```
kubectl logs services/kubecost-aggregator --follow |grep -v -E 'resourceGroup|prometheus-server'|grep -i -E 'group|xmlname|saml|login|audience|kubecost_'
```
{% endcode %}
When the group has been matched, you will see:
```
@@ -216,4 +237,4 @@ I0330 14:48:20.702125 1 log.go:47] [Info] Attempting to authenticate saml.
I0330 14:48:20.702229 1 costmodel.go:813] Authenticated saml
...
I0330 14:48:21.011787 1 auth.go:167] AUDIENCE: [admin group:admin@kubecost.com]
```
@@ -462,7 +462,9 @@ eksctl utils associate-iam-oidc-provider \
**Step 4: Create required IAM service accounts**
**Note:** Remember to replace `1234567890` with your AWS account ID number.
{% hint style="info" %}
Remember to replace `1234567890` with your AWS account ID number.
{% endhint %}
{% code overflow="wrap" %}
```
