feat: [PAYMCLOUD-255] Add Prometheus Managed for AKS and enhance monitoring setup (#2837)

Add Prometheus and monitoring enhancements

Introduced Prometheus managed add-on for AKS in non-production environments and enhanced monitoring setup with workspace and private endpoints. Updated Terraform modules and configurations to support these improvements, ensuring better observability and integration.

Signed-off-by: Fabio Felici <[email protected]>
ffppa authored Feb 27, 2025
1 parent c5fe0c4 commit 0cbe69d
Showing 12 changed files with 155 additions and 37 deletions.
50 changes: 21 additions & 29 deletions src/aks-leonardo/.terraform.lock.hcl


9 changes: 8 additions & 1 deletion src/aks-leonardo/03_aks_0.tf
@@ -6,7 +6,7 @@ resource "azurerm_resource_group" "rg_aks" {
 }

 module "aks_leonardo" {
-  source = "git::https://github.com/pagopa/terraform-azurerm-v3.git//kubernetes_cluster?ref=v8.58.0"
+  source = "git::https://github.com/pagopa/terraform-azurerm-v3.git//kubernetes_cluster?ref=v8.80.0"

   name = local.aks_cluster_name
   location = var.location
@@ -16,6 +16,13 @@ module "aks_leonardo" {
   log_analytics_workspace_id = var.env_short != "d" ? data.azurerm_log_analytics_workspace.log_analytics_italy.id : data.azurerm_log_analytics_workspace.log_analytics.id
   sku_tier = var.aks_sku_tier

+  ## Prometheus managed
+  # ff: enabled on DEV/UAT
+  enable_prometheus_monitor_metrics = var.env_short != "p" ? true : false
+
+  # ff: Enabled cost analysis on UAT/PROD
+  # cost_analysis_enabled = var.env_short != "d" ? true : false
+
   #
   # 🤖 System node pool
   #
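
Review note on the `enable_prometheus_monitor_metrics` line in the hunk above: a comparison in Terraform already evaluates to a bool, so the `? true : false` ternary is redundant. The following is a minimal sketch of an equivalent form, not the committed code. The local name `is_non_prod` is hypothetical, and the meaning of the `env_short` codes (`d`, `u`, `p`) is inferred from the comments in the diff.

```hcl
variable "env_short" {
  type        = string
  description = "Short environment code (assumed: d = dev, u = uat, p = prod)."
}

locals {
  # Hypothetical helper, not part of this commit: true everywhere except prod.
  is_non_prod = var.env_short != "p"
}

output "enable_prometheus_monitor_metrics" {
  # Evaluates to the same bool as `var.env_short != "p" ? true : false`.
  value = local.is_non_prod
}
```

Passing `local.is_non_prod` (or the bare comparison) to the module input would behave identically to the committed expression.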
37 changes: 37 additions & 0 deletions src/aks-leonardo/03_monitoring.tf
@@ -64,3 +64,40 @@ data "azurerm_key_vault_secret" "opsgenie_kubexporter_api_key" {

 // TODO put the quickstart-es-elastic-user secret in the kv via sops

+
+## PROMETHEUS MANAGED ON AKS
+# Reference: resource created in next-core 02_monitor.tf
+data "azurerm_monitor_workspace" "workspace" {
+  count               = var.env != "prod" ? 1 : 0
+  name                = "pagopa-${var.env_short}-${var.location}-monitor-workspace"
+  resource_group_name = "pagopa-${var.env_short}-monitor-rg"
+}
+
+module "prometheus_managed_addon" {
+  count                  = var.env != "prod" ? 1 : 0
+  source                 = "git::https://github.com/pagopa/terraform-azurerm-v3.git//kubernetes_prometheus_managed?ref=v8.84.0"
+  cluster_name           = module.aks_leonardo.name
+  resource_group_name    = module.aks_leonardo.aks_resource_group_name
+  location               = var.location
+  custom_gf_location     = "westeurope"
+  location_short         = var.location_short
+  monitor_workspace_name = data.azurerm_monitor_workspace.workspace.0.name
+  monitor_workspace_rg   = data.azurerm_monitor_workspace.workspace.0.resource_group_name
+  grafana_name           = "pagopa-${var.env_short}-weu-grafana"    # Integrate with weu grafana
+  grafana_resource_group = "pagopa-${var.env_short}-weu-grafana-rg" # Integrate with weu grafana
+
+  # flatten takes a list and replaces any elements that are lists with a
+  # flattened sequence of the list contents.
+  # In this case, OpsGenie is enabled only in the prod env.
+  action_groups_id = flatten([
+    [
+      data.azurerm_monitor_action_group.slack.id,
+      data.azurerm_monitor_action_group.email.id
+    ],
+    (var.env == "prod" ? [
+      data.azurerm_monitor_action_group.opsgenie.0.id
+    ] : [])
+  ])
+
+  tags = var.tags
+}
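
Review note on the `action_groups_id` expression above: the `flatten` plus conditional-list pattern can be tried in isolation. The sketch below is illustrative only. The three ID strings are placeholders standing in for the `azurerm_monitor_action_group` data sources, and the `env` default is an assumption.

```hcl
variable "env" {
  type    = string
  default = "dev" # assumed default, for illustration only
}

locals {
  # Placeholder values, not real action group IDs.
  slack_id    = "slack-action-group-id"
  email_id    = "email-action-group-id"
  opsgenie_id = "opsgenie-action-group-id"

  # The inner conditional yields a one-element list on prod and an empty
  # list otherwise; flatten merges the nested lists into a single flat list.
  action_groups_id = flatten([
    [local.slack_id, local.email_id],
    var.env == "prod" ? [local.opsgenie_id] : [],
  ])
}

output "action_groups_id" {
  value = local.action_groups_id
}
```

With `env = "dev"` the output contains only the Slack and email entries; with `env = "prod"` the OpsGenie entry is appended. Note that, as committed, the module itself has `count = var.env != "prod" ? 1 : 0`, so the OpsGenie branch appears to be unreachable until the module is also enabled on prod.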
4 changes: 3 additions & 1 deletion src/aks-leonardo/README.md
@@ -40,13 +40,14 @@ Re-enable all the resource, commented before to complete the procedure

| Name | Source | Version |
|------|--------|---------|
-| <a name="module_aks_leonardo"></a> [aks\_leonardo](#module\_aks\_leonardo) | git::https://github.com/pagopa/terraform-azurerm-v3.git//kubernetes_cluster | v8.58.0 |
+| <a name="module_aks_leonardo"></a> [aks\_leonardo](#module\_aks\_leonardo) | git::https://github.com/pagopa/terraform-azurerm-v3.git//kubernetes_cluster | v8.80.0 |
| <a name="module_aks_prometheus_install"></a> [aks\_prometheus\_install](#module\_aks\_prometheus\_install) | git::https://github.com/pagopa/terraform-azurerm-v3.git//kubernetes_prometheus_install | v8.78.1 |
| <a name="module_aks_storage_class"></a> [aks\_storage\_class](#module\_aks\_storage\_class) | git::https://github.com/pagopa/terraform-azurerm-v3.git//kubernetes_storage_class | v8.17.1 |
| <a name="module_elastic_agent"></a> [elastic\_agent](#module\_elastic\_agent) | git::https://github.com/pagopa/terraform-azurerm-v3.git//elastic_agent | v8.50.0 |
| <a name="module_keda_pod_identity"></a> [keda\_pod\_identity](#module\_keda\_pod\_identity) | git::https://github.com/pagopa/terraform-azurerm-v3.git//kubernetes_pod_identity | v8.17.1 |
| <a name="module_kubernetes_event_exporter"></a> [kubernetes\_event\_exporter](#module\_kubernetes\_event\_exporter) | git::https://github.com/pagopa/terraform-azurerm-v3.git//kubernetes_event_exporter | v8.76.0 |
| <a name="module_nginx_ingress"></a> [nginx\_ingress](#module\_nginx\_ingress) | terraform-module/release/helm | 2.7.0 |
+| <a name="module_prometheus_managed_addon"></a> [prometheus\_managed\_addon](#module\_prometheus\_managed\_addon) | git::https://github.com/pagopa/terraform-azurerm-v3.git//kubernetes_prometheus_managed | v8.84.0 |

## Resources

@@ -94,6 +95,7 @@ Re-enable all the resource, commented before to complete the procedure
| [azurerm_monitor_action_group.email](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/monitor_action_group) | data source |
| [azurerm_monitor_action_group.opsgenie](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/monitor_action_group) | data source |
| [azurerm_monitor_action_group.slack](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/monitor_action_group) | data source |
+| [azurerm_monitor_workspace.workspace](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/monitor_workspace) | data source |
| [azurerm_public_ip.pip_aks_outboud](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/public_ip) | data source |
| [azurerm_resource_group.monitor_italy_rg](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/resource_group) | data source |
| [azurerm_resource_group.monitor_rg](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/resource_group) | data source |
1 change: 0 additions & 1 deletion src/aks-platform/05_monitoring.tf
@@ -112,7 +112,6 @@ resource "kubernetes_manifest" "service_monitor" {
         "app.kubernetes.io/instance" : "prometheus"
         "app.kubernetes.io/part-of" : "kube-prometheus-stack"
         "app" : "kube-prometheus-stack-operator"
-        "release" : helm_release.kube_prometheus_stack.name
       }
     }
     "spec" : {
14 changes: 14 additions & 0 deletions src/core-itn/01_network.tf
@@ -91,3 +91,17 @@ resource "azurerm_subnet" "subnet_container_app_tools" {
   virtual_network_name = module.vnet_italy[0].name
   address_prefixes     = var.cidr_subnet_tools_cae
 }
+
+# subnet for common private endpoints
+module "common_private_endpoint_snet" {
+  source               = "git::https://github.com/pagopa/terraform-azurerm-v3.git//subnet?ref=v8.83.0"
+  name                 = "${local.product}-common-private-endpoint-snet"
+  address_prefixes     = var.cidr_common_private_endpoint_snet
+  resource_group_name  = azurerm_resource_group.rg_ita_vnet.name
+  virtual_network_name = module.vnet_italy.0.name
+
+  private_link_service_network_policies_enabled = true
+
+
+  service_endpoints = var.env_short == "p" ? ["Microsoft.Storage"] : []
+}
52 changes: 52 additions & 0 deletions src/core-itn/70_monitoring.tf
@@ -24,6 +24,58 @@ resource "azurerm_log_analytics_workspace" "log_analytics_workspace" {
   }
 }

+# Azure Monitor Workspace
+resource "azurerm_monitor_workspace" "monitor_workspace" {
+  count                         = var.env != "prod" ? 1 : 0
+  name                          = "${var.prefix}-${var.env_short}-${var.location}-monitor-workspace"
+  resource_group_name           = "${var.prefix}-${var.env_short}-monitor-rg"
+  location                      = var.location
+  public_network_access_enabled = false
+  tags                          = var.tags
+}
+
+# Create workspace private DNS zone
+resource "azurerm_private_dns_zone" "prometheus_dns_zone" {
+  count               = var.env != "prod" ? 1 : 0
+  name                = "privatelink.${var.location}.prometheus.monitor.azure.com"
+  resource_group_name = module.vnet_italy.0.resource_group_name
+}
+
+# Create virtual network link for workspace private dns zone
+resource "azurerm_private_dns_zone_virtual_network_link" "prometheus_dns_zone_vnet_link" {
+  count                 = var.env != "prod" ? 1 : 0
+  name                  = module.vnet_italy.0.name
+  resource_group_name   = module.vnet_italy.0.resource_group_name
+  virtual_network_id    = module.vnet_italy.0.id
+  private_dns_zone_name = azurerm_private_dns_zone.prometheus_dns_zone.0.name
+}
+
+resource "azurerm_private_endpoint" "monitor_workspace_private_endpoint" {
+  count               = var.env != "prod" ? 1 : 0
+  name                = "${var.prefix}-${var.location}-monitor-workspace-pe"
+  location            = azurerm_monitor_workspace.monitor_workspace.0.location
+  resource_group_name = azurerm_monitor_workspace.monitor_workspace.0.resource_group_name
+  subnet_id           = module.common_private_endpoint_snet.id
+
+  private_service_connection {
+    name                           = "monitorworkspaceconnection"
+    private_connection_resource_id = azurerm_monitor_workspace.monitor_workspace[0].id
+    is_manual_connection           = false
+    subresource_names              = ["prometheusMetrics"]
+  }
+
+  private_dns_zone_group {
+    name                 = "${var.prefix}-workspace-zone-group"
+    private_dns_zone_ids = [azurerm_private_dns_zone.prometheus_dns_zone.0.id]
+  }
+
+
+  depends_on = [azurerm_monitor_workspace.monitor_workspace]
+
+  tags = var.tags
+}
+
+
 # Application insights
 resource "azurerm_application_insights" "application_insights" {
   name = "${local.project}-appinsights"
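
Review note on the count-guarded resources above: they are read with an explicit index, written both as `.0` and as `[0]`, which only resolves while `count` is 1. The sketch below shows one way to expose such a resource without indexing, using a splat expression with the built-in `one()` function. It is not part of this commit. It uses the built-in `terraform_data` resource (assumes Terraform 1.4 or later) as a provider-free stand-in for `azurerm_monitor_workspace`, and the output name is hypothetical.

```hcl
variable "env" {
  type    = string
  default = "dev" # assumed default, for illustration only
}

# Stand-in for a count-guarded resource such as azurerm_monitor_workspace.
resource "terraform_data" "workspace" {
  count = var.env != "prod" ? 1 : 0
  input = "monitor-workspace"
}

output "workspace_id" {
  # one() returns the single element of the list, or null when it is empty,
  # so this output also evaluates cleanly when count is 0.
  value = one(terraform_data.workspace[*].id)
}
```

The same `one(resource[*].attr)` shape could replace the mixed `.0` and `[0]` references if a consumer ever needs to tolerate the prod case where the resource is absent.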
5 changes: 5 additions & 0 deletions src/core-itn/99_variables.tf
@@ -108,6 +108,11 @@ variable "cidr_subnet_tools_cae" {
   description = "Address prefixes for container apps Tools in italy."
 }

+variable "cidr_common_private_endpoint_snet" {
+  type        = list(string)
+  description = "Common Private Endpoint network address space."
+}
+
 ### External resources

 variable "monitor_resource_group_name" {
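
Review note on the new `cidr_common_private_endpoint_snet` variable above: it has no default, so every environment has to supply a value. A minimal sketch of a matching tfvars entry follows. The CIDR shown is a placeholder and not the address space used by the project.

```hcl
# Hypothetical tfvars entry; replace the placeholder CIDR with the range
# actually reserved for private endpoints in the target virtual network.
cidr_common_private_endpoint_snet = ["10.0.10.0/24"]
```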
6 changes: 6 additions & 0 deletions src/core-itn/README.md
@@ -108,6 +108,7 @@ No outputs.

| Name | Source | Version |
|------|--------|---------|
+| <a name="module_common_private_endpoint_snet"></a> [common\_private\_endpoint\_snet](#module\_common\_private\_endpoint\_snet) | git::https://github.com/pagopa/terraform-azurerm-v3.git//subnet | v8.83.0 |
| <a name="module_container_registry_ita"></a> [container\_registry\_ita](#module\_container\_registry\_ita) | git::https://github.com/pagopa/terraform-azurerm-v3.git//container_registry | v8.13.0 |
| <a name="module_domain_key_vault_secrets_query"></a> [domain\_key\_vault\_secrets\_query](#module\_domain\_key\_vault\_secrets\_query) | git::https://github.com/pagopa/terraform-azurerm-v3.git//key_vault_secrets_query | v8.13.0 |
| <a name="module_key_vault"></a> [key\_vault](#module\_key\_vault) | git::https://github.com/pagopa/terraform-azurerm-v3.git//key_vault | v8.13.0 |
@@ -129,6 +130,8 @@ No outputs.
| [azurerm_log_analytics_workspace.log_analytics_workspace](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/log_analytics_workspace) | resource |
| [azurerm_monitor_action_group.email](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/monitor_action_group) | resource |
| [azurerm_monitor_action_group.slack](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/monitor_action_group) | resource |
+| [azurerm_monitor_workspace.monitor_workspace](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/monitor_workspace) | resource |
+| [azurerm_private_dns_zone.prometheus_dns_zone](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/private_dns_zone) | resource |
| [azurerm_private_dns_zone_virtual_network_link.db_nodo_pagamenti_com_vnet_link](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/private_dns_zone_virtual_network_link) | resource |
| [azurerm_private_dns_zone_virtual_network_link.env_platform_pagopa_it_vnet_link](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/private_dns_zone_virtual_network_link) | resource |
| [azurerm_private_dns_zone_virtual_network_link.internal_env_platform_pagopa_it_vnet_link](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/private_dns_zone_virtual_network_link) | resource |
@@ -143,6 +146,8 @@ No outputs.
| [azurerm_private_dns_zone_virtual_network_link.privatelink_servicebus_windows_net_vnet_link](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/private_dns_zone_virtual_network_link) | resource |
| [azurerm_private_dns_zone_virtual_network_link.privatelink_table_core_windows_net_vnet_link](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/private_dns_zone_virtual_network_link) | resource |
| [azurerm_private_dns_zone_virtual_network_link.privatelink_table_cosmos_azure_com_vnet_link](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/private_dns_zone_virtual_network_link) | resource |
+| [azurerm_private_dns_zone_virtual_network_link.prometheus_dns_zone_vnet_link](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/private_dns_zone_virtual_network_link) | resource |
+| [azurerm_private_endpoint.monitor_workspace_private_endpoint](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/private_endpoint) | resource |
| [azurerm_public_ip.aks_leonardo_public_ip](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/public_ip) | resource |
| [azurerm_resource_group.acr_ita_rg](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/resource_group) | resource |
| [azurerm_resource_group.monitor_rg](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/resource_group) | resource |
@@ -184,6 +189,7 @@ No outputs.

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|:--------:|
+| <a name="input_cidr_common_private_endpoint_snet"></a> [cidr\_common\_private\_endpoint\_snet](#input\_cidr\_common\_private\_endpoint\_snet) | Common Private Endpoint network address space. | `list(string)` | n/a | yes |
| <a name="input_cidr_eventhubs_italy"></a> [cidr\_eventhubs\_italy](#input\_cidr\_eventhubs\_italy) | Address prefixes for all evenhubs in italy. | `list(string)` | n/a | yes |
| <a name="input_cidr_subnet_pdf_engine_app_service"></a> [cidr\_subnet\_pdf\_engine\_app\_service](#input\_cidr\_subnet\_pdf\_engine\_app\_service) | CIDR subnet for App Service | `list(string)` | `null` | no |
| <a name="input_cidr_subnet_tools_cae"></a> [cidr\_subnet\_tools\_cae](#input\_cidr\_subnet\_tools\_cae) | Address prefixes for container apps Tools in italy. | `list(string)` | n/a | yes |