DMVP-3094: Adot log retention #92

Merged: 7 commits, Dec 13, 2023

8 changes: 4 additions & 4 deletions README.md
@@ -235,7 +235,7 @@ worker_groups = {
| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
| <a name="input_account_id"></a> [account\_id](#input\_account\_id) | AWS Account Id to apply changes into | `string` | `null` | no |
- | <a name="input_adot_config"></a> [adot\_config](#input\_adot\_config) | n/a | `any` | <pre>{<br> "accept_namespace_regex": "(default|kube-system)",<br> "additional_metrics": {},<br> "log_group_name": "adot_log_group"<br>}</pre> | no |
+ | <a name="input_adot_config"></a> [adot\_config](#input\_adot\_config) | Adot configs | <pre>object({<br> accept_namespace_regex = optional(string, "(default|kube-system)")<br> additional_metrics = optional(list(string), [])<br> log_group_name = optional(string, "adot")<br> log_retention = optional(number, 14)<br> helm_values = optional(any, null)<br> })</pre> | <pre>{<br> "accept_namespace_regex": "(default|kube-system)",<br> "additional_metrics": [],<br> "log_group_name": "adot",<br> "log_retention": 14<br>}</pre> | no |
| <a name="input_adot_version"></a> [adot\_version](#input\_adot\_version) | The version of the AWS Distro for OpenTelemetry addon to use. | `string` | `"v0.78.0-eksbuild.1"` | no |
| <a name="input_alarms"></a> [alarms](#input\_alarms) | Alarms enabled by default you need set sns topic name for send alarms for customize alarms threshold use custom\_values | <pre>object({<br> enabled = optional(bool, true)<br> sns_topic = string<br> custom_values = optional(any, {})<br> })</pre> | n/a | yes |
| <a name="input_alb_log_bucket_name"></a> [alb\_log\_bucket\_name](#input\_alb\_log\_bucket\_name) | n/a | `string` | `""` | no |
@@ -269,10 +269,10 @@ worker_groups = {
| <a name="input_fluent_bit_configs"></a> [fluent\_bit\_configs](#input\_fluent\_bit\_configs) | Fluent Bit configs | <pre>object({<br> fluent_bit_name = optional(string, "")<br> log_group_name = optional(string, "")<br> system_log_group_name = optional(string, "")<br> log_retention_days = optional(number, 90)<br> values_yaml = optional(string, "")<br> configs = optional(object({<br> inputs = optional(string, "")<br> filters = optional(string, "")<br> outputs = optional(string, "")<br> }), {})<br> drop_namespaces = optional(list(string), [])<br> log_filters = optional(list(string), [])<br> additional_log_filters = optional(list(string), [])<br> kube_namespaces = optional(list(string), [])<br> })</pre> | <pre>{<br> "additional_log_filters": [<br> "ELB-HealthChecker",<br> "Amazon-Route53-Health-Check-Service"<br> ],<br> "configs": {<br> "filters": "",<br> "inputs": "",<br> "outputs": ""<br> },<br> "drop_namespaces": [<br> "kube-system",<br> "opentelemetry-operator-system",<br> "adot",<br> "cert-manager",<br> "opentelemetry.*",<br> "meta.*"<br> ],<br> "fluent_bit_name": "",<br> "kube_namespaces": [<br> "kube.*",<br> "meta.*",<br> "adot.*",<br> "devops.*",<br> "cert-manager.*",<br> "git.*",<br> "opentelemetry.*",<br> "stakater.*",<br> "renovate.*"<br> ],<br> "log_filters": [<br> "kube-probe",<br> "health",<br> "prometheus",<br> "liveness"<br> ],<br> "log_group_name": "",<br> "log_retention_days": 90,<br> "system_log_group_name": "",<br> "values_yaml": ""<br>}</pre> | no |
| <a name="input_manage_aws_auth"></a> [manage\_aws\_auth](#input\_manage\_aws\_auth) | n/a | `bool` | `true` | no |
| <a name="input_map_roles"></a> [map\_roles](#input\_map\_roles) | Additional IAM roles to add to the aws-auth configmap. | <pre>list(object({<br> rolearn = string<br> username = string<br> groups = list(string)<br> }))</pre> | `[]` | no |
- | <a name="input_metrics_exporter"></a> [metrics\_exporter](#input\_metrics\_exporter) | Metrics Exporter, can use cloudwatch or adot | `string` | `"cloudwatch"` | no |
+ | <a name="input_metrics_exporter"></a> [metrics\_exporter](#input\_metrics\_exporter) | Metrics Exporter, can use cloudwatch or adot | `string` | `"adot"` | no |
| <a name="input_metrics_server_name"></a> [metrics\_server\_name](#input\_metrics\_server\_name) | n/a | `string` | `"metrics-server"` | no |
- | <a name="input_node_groups"></a> [node\_groups](#input\_node\_groups) | Map of EKS managed node group definitions to create | `any` | <pre>{<br> "default": {<br> "desired_size": 2,<br> "iam_role_additional_policies": [<br> "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"<br> ],<br> "instance_types": [<br> "t3.medium"<br> ],<br> "max_size": 4,<br> "min_size": 2<br> }<br>}</pre> | no |
- | <a name="input_node_groups_default"></a> [node\_groups\_default](#input\_node\_groups\_default) | Map of EKS managed node group default configurations | `any` | <pre>{<br> "disk_size": 50,<br> "iam_role_additional_policies": [<br> "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"<br> ],<br> "instance_types": [<br> "t3.medium"<br> ]<br>}</pre> | no |
+ | <a name="input_node_groups"></a> [node\_groups](#input\_node\_groups) | Map of EKS managed node group definitions to create | `any` | <pre>{<br> "default": {<br> "desired_size": 2,<br> "iam_role_additional_policies": [<br> "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"<br> ],<br> "instance_types": [<br> "t3.large"<br> ],<br> "max_size": 4,<br> "min_size": 2<br> }<br>}</pre> | no |
+ | <a name="input_node_groups_default"></a> [node\_groups\_default](#input\_node\_groups\_default) | Map of EKS managed node group default configurations | `any` | <pre>{<br> "disk_size": 50,<br> "iam_role_additional_policies": [<br> "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"<br> ],<br> "instance_types": [<br> "t3.large"<br> ]<br>}</pre> | no |
| <a name="input_node_security_group_additional_rules"></a> [node\_security\_group\_additional\_rules](#input\_node\_security\_group\_additional\_rules) | n/a | `any` | <pre>{<br> "ingress_cluster_10250": {<br> "description": "Metric server to node groups",<br> "from_port": 10250,<br> "protocol": "tcp",<br> "self": true,<br> "to_port": 10250,<br> "type": "ingress"<br> },<br> "ingress_cluster_8443": {<br> "description": "Metric server to node groups",<br> "from_port": 8443,<br> "protocol": "tcp",<br> "source_cluster_security_group": true,<br> "to_port": 8443,<br> "type": "ingress"<br> }<br>}</pre> | no |
| <a name="input_portainer_config"></a> [portainer\_config](#input\_portainer\_config) | Portainer hostname and ingress config. | <pre>object({<br> host = optional(string, "portainer.dasmeta.com")<br> enable_ingress = optional(bool, true)<br> })</pre> | `{}` | no |
| <a name="input_prometheus_metrics"></a> [prometheus\_metrics](#input\_prometheus\_metrics) | Prometheus Metrics | `any` | `[]` | no |
2 changes: 1 addition & 1 deletion examples/spot-instance/1-example.tf
@@ -25,7 +25,7 @@ module "cluster_min" {
}

node_groups_default = {
- instance_types = ["t3.medium"]
+ instance_types = ["t3.large"]
capacity_type = "SPOT"
}

2 changes: 1 addition & 1 deletion examples/spot-instance/README.md
@@ -12,7 +12,7 @@

| Name | Version |
|------|---------|
- | <a name="provider_aws"></a> [aws](#provider\_aws) | >= 3.41 |
+ | <a name="provider_aws"></a> [aws](#provider\_aws) | 4.67.0 |

## Modules

2 changes: 2 additions & 0 deletions main.tf
@@ -310,6 +310,8 @@ resource "helm_release" "cert-manager" {
chart = "cert-manager"
repository = "https://charts.jetstack.io"
atomic = true
+ version = "v1.13.1"
+
set {
name = "installCRDs"
value = "true"
5 changes: 3 additions & 2 deletions modules/adot/README.md
@@ -58,6 +58,7 @@ No modules.
| Name | Type |
|------|------|
| [aws_eks_addon.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/eks_addon) | resource |
+ | [aws_iam_policy.adot](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_policy) | resource |
| [aws_iam_role.adot_collector](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role) | resource |
| [aws_iam_role_policy_attachment.CloudWatchAgentServerPolicy](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/iam_role_policy_attachment) | resource |
| [helm_release.adot-collector](https://registry.terraform.io/providers/hashicorp/helm/latest/docs/resources/release) | resource |
@@ -73,8 +74,8 @@ No modules.

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
- | <a name="input_adot_collector_policy_arns"></a> [adot\_collector\_policy\_arns](#input\_adot\_collector\_policy\_arns) | List of IAM policy ARNs to attach to the ADOT collector service account. | `list(string)` | <pre>[<br> "arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess",<br> "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy",<br> "arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess"<br>]</pre> | no |
- | <a name="input_adot_config"></a> [adot\_config](#input\_adot\_config) | accept\_namespace\_regex defines the list of namespaces from which metrics will be exported, and additional\_metrics defines additional metrics to export. | `any` | <pre>{<br> "accept_namespace_regex": "(default|kube-system)",<br> "additional_metrics": [],<br> "helm_values": null,<br> "log_group_name": "adot_log_group"<br>}</pre> | no |
+ | <a name="input_adot_collector_policy_arns"></a> [adot\_collector\_policy\_arns](#input\_adot\_collector\_policy\_arns) | List of IAM policy ARNs to attach to the ADOT collector service account. | `list(string)` | `[]` | no |
+ | <a name="input_adot_config"></a> [adot\_config](#input\_adot\_config) | accept\_namespace\_regex defines the list of namespaces from which metrics will be exported, and additional\_metrics defines additional metrics to export. | <pre>object({<br> accept_namespace_regex = optional(string, "(default|kube-system)")<br> additional_metrics = optional(list(string), [])<br> log_group_name = optional(string, "adot")<br> log_retention = optional(number, 14)<br> helm_values = optional(any, null)<br> })</pre> | <pre>{<br> "accept_namespace_regex": "(default|kube-system)",<br> "additional_metrics": [],<br> "helm_values": null,<br> "log_group_name": "adot",<br> "log_retention": 14<br>}</pre> | no |
| <a name="input_adot_log_group_name"></a> [adot\_log\_group\_name](#input\_adot\_log\_group\_name) | ADOT log group name | `string` | `"adot_log_group_name"` | no |
| <a name="input_adot_version"></a> [adot\_version](#input\_adot\_version) | The version of the AWS Distro for OpenTelemetry addon to use. | `string` | `"v0.78.0-eksbuild.1"` | no |
| <a name="input_cluster_name"></a> [cluster\_name](#input\_cluster\_name) | K8s cluster name. | `string` | n/a | yes |
8 changes: 8 additions & 0 deletions modules/adot/locals.tf
@@ -27,4 +27,12 @@ locals {

merged_metrics = concat(local.default_metrics, lookup(var.adot_config, "additional_metrics", []))
merged_namespace_specific = concat(local.default_metrics_namespace_specific, lookup(var.adot_config, "namespace_specific_metrics", []))
+
+ adot_policies = concat([
+ "${aws_iam_policy.adot.arn}",
+ "arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess",
+ "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy",
+ "arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess"
+ ], var.adot_collector_policy_arns)
+
}
1 change: 1 addition & 0 deletions modules/adot/main.tf
@@ -27,6 +27,7 @@ resource "helm_release" "adot-collector" {
cluster_name = var.cluster_name
accept_namespace_regex = var.adot_config.accept_namespace_regex
log_group_name = var.adot_config.log_group_name
+ log_retention = var.adot_config.log_retention
metrics = local.merged_metrics
metrics_namespace_specific = local.merged_namespace_specific
prometheus_metrics = var.prometheus_metrics
39 changes: 37 additions & 2 deletions modules/adot/role.tf
@@ -30,8 +30,43 @@ resource "aws_iam_role" "adot_collector" {
POLICY
}

+ resource "aws_iam_policy" "adot" {
+ name = "adot_policy"
+ path = "/"
+ description = "Adot Policy"
+
+ policy = jsonencode({
+ "Version" : "2012-10-17",
+ "Statement" : [
+ {
+ "Effect" : "Allow",
+ "Action" : [
+ "logs:PutLogEvents",
+ "logs:CreateLogGroup",
+ "logs:CreateLogStream",
+ "logs:DescribeLogStreams",
+ "logs:DescribeLogGroups",
+ "logs:PutRetentionPolicy",
+ "xray:PutTraceSegments",
+ "xray:PutTelemetryRecords",
+ "xray:GetSamplingRules",
+ "xray:GetSamplingTargets",
+ "xray:GetSamplingStatisticSummaries",
+ "ssm:GetParameters"
+ ],
+ "Resource" : "*"
+ }
+ ]
+ })
+ }
+
resource "aws_iam_role_policy_attachment" "CloudWatchAgentServerPolicy" {
- for_each = toset(var.adot_collector_policy_arns)
- policy_arn = each.key
+ count = length(local.adot_policies)
+
+ policy_arn = local.adot_policies[count.index]
role = aws_iam_role.adot_collector.name
+
+ depends_on = [
+ aws_iam_policy.adot
+ ]
}
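
Note on the `for_each` to `count` switch above: `for_each = toset(...)` cannot be used once the list contains `aws_iam_policy.adot.arn`, because that value is unknown at plan time and Terraform rejects unknown `for_each` keys. Indexing with `count` avoids that, but reordering the list can then re-create attachments. A minimal sketch of one possible alternative, keyed by static names; the local `adot_policy_map`, the key names, and the resource name `adot` are hypothetical and not part of this PR (renaming the existing resource would also need a state move):

```hcl
locals {
  # Hypothetical: keys are static strings, so for_each never depends on
  # values that are only known after apply.
  adot_policy_map = merge(
    {
      adot       = aws_iam_policy.adot.arn
      prometheus = "arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess"
      cloudwatch = "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"
      xray       = "arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess"
    },
    # User-supplied ARNs get index-based keys ("extra-0", "extra-1", ...).
    { for idx, arn in var.adot_collector_policy_arns : "extra-${idx}" => arn }
  )
}

resource "aws_iam_role_policy_attachment" "adot" {
  for_each = local.adot_policy_map

  policy_arn = each.value
  role       = aws_iam_role.adot_collector.name
}
```

In this form the reference to `aws_iam_policy.adot.arn` already gives Terraform the dependency, so an explicit `depends_on` would likely be unnecessary.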
2 changes: 2 additions & 0 deletions modules/adot/templates/adot-values.yaml.tpl
@@ -128,6 +128,7 @@ adotCollector:
dimension_rollup_option: NoDimensionRollup
log_group_name: "${log_group_name}"
log_stream_name: "adot-metrics-prometheus"
+ log_retention: "${log_retention}"
metric_declarations:
- dimensions:
- - Namespace
@@ -150,6 +151,7 @@ adotCollector:
namespace: "ContainerInsights"
log_group_name: "${log_group_name}"
log_stream_name: "adot-metrics"
+ log_retention: "${log_retention}"
region: "${region}"
dimension_rollup_option: "NoDimensionRollup"
resource_to_telemetry_conversion:
17 changes: 10 additions & 7 deletions modules/adot/variables.tf
@@ -34,20 +34,23 @@ variable "create_namespace" {
variable "adot_collector_policy_arns" {
description = "List of IAM policy ARNs to attach to the ADOT collector service account."
type = list(string)
- default = [
- "arn:aws:iam::aws:policy/AmazonPrometheusRemoteWriteAccess",
- "arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy",
- "arn:aws:iam::aws:policy/AWSXrayWriteOnlyAccess"
- ]
+ default = []
}

variable "adot_config" {
description = "accept_namespace_regex defines the list of namespaces from which metrics will be exported, and additional_metrics defines additional metrics to export."
- type = any
+ type = object({
+ accept_namespace_regex = optional(string, "(default|kube-system)")
+ additional_metrics = optional(list(string), [])
+ log_group_name = optional(string, "adot")
+ log_retention = optional(number, 14)
+ helm_values = optional(any, null)
+ })
default = {
accept_namespace_regex = "(default|kube-system)"
additional_metrics = []
- log_group_name = "adot_log_group"
+ log_group_name = "adot"
+ log_retention = 14
# ADOT helm chart values.yaml, if you don't use variable adot will be deployed with module default values file
helm_values = null
}
2 changes: 1 addition & 1 deletion modules/autoscaler/kube-resoures.tf
@@ -216,7 +216,7 @@ resource "kubernetes_deployment" "cluster-autoscaler" {
}
service_account_name = "cluster-autoscaler"
container {
- image = "k8s.gcr.io/autoscaling/cluster-autoscaler:v${var.eks_version}.${var.autoscaler_image_patch}"
+ image = "registry.k8s.io/autoscaling/cluster-autoscaler:v${var.eks_version}.${var.autoscaler_image_patch}"
name = "cluster-autoscaler"

resources {
20 changes: 14 additions & 6 deletions variables.tf
@@ -29,7 +29,7 @@ variable "node_groups" {
min_size = 2
max_size = 4
desired_size = 2
- instance_types = ["t3.medium"]
+ instance_types = ["t3.large"]
iam_role_additional_policies = ["arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"]
}
}
@@ -62,7 +62,7 @@ variable "node_groups_default" {
type = any
default = {
disk_size = 50
- instance_types = ["t3.medium"]
+ instance_types = ["t3.large"]
iam_role_additional_policies = ["arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy"]
}
}
@@ -306,16 +306,24 @@ variable "vpc" {

variable "metrics_exporter" {
type = string
- default = "cloudwatch"
+ default = "adot"
description = "Metrics Exporter, can use cloudwatch or adot"
}

variable "adot_config" {
- type = any
+ type = object({
+ accept_namespace_regex = optional(string, "(default|kube-system)")
+ additional_metrics = optional(list(string), [])
+ log_group_name = optional(string, "adot")
+ log_retention = optional(number, 14)
+ helm_values = optional(any, null)
+ })
+ description = "Adot configs"
default = {
accept_namespace_regex = "(default|kube-system)"
- additional_metrics = {}
- log_group_name = "adot_log_group"
+ additional_metrics = []
+ log_group_name = "adot"
+ log_retention = 14
}
}
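
Usage note: with the typed `adot_config` object, a caller should be able to override a single attribute and let `optional()` fill in the rest. A minimal sketch, assuming a hypothetical module label and source path, with all other inputs omitted; `optional(type, default)` requires Terraform 1.3 or later:

```hcl
module "cluster" {
  # Hypothetical source path; point it at wherever this module lives.
  source = "../.."

  # Other inputs (for example the required `alarms` object) are omitted here.

  metrics_exporter = "adot"

  adot_config = {
    # Keep ADOT CloudWatch logs for 30 days instead of the default 14.
    log_retention = 30
    # accept_namespace_regex, additional_metrics, log_group_name and
    # helm_values fall back to the optional() defaults declared in the type.
  }
}
```

Because the variable is now an `object(...)` rather than `any`, attributes not declared in the type are dropped during type conversion, so callers that previously passed extra keys may need to check their inputs.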
