terraform, azure and utoronto: fixes and aligning files with terraform state #3578

Merged
Changes from 9 commits
6 changes: 3 additions & 3 deletions docs/topic/infrastructure/cluster-design.md
@@ -122,9 +122,9 @@ The three machine types based on the cloud provider are the following:
- r5.4xlarge
- r5.16xlarge
- [AKS](https://learn.microsoft.com/en-us/azure/virtual-machines/eav4-easv4-series)
- Standard_E4a_v4
- Standard_E16_v4
- Standard_E64_v4
- Standard_E4s_v5
- Standard_E16s_v5
- Standard_E64s_v5

## Network Policy

12 changes: 7 additions & 5 deletions terraform/azure/main.tf
@@ -8,20 +8,23 @@ terraform {
# FIXME: v3 has been released and we are still at v2, see release notes:
# https://github.com/hashicorp/terraform-provider-azurerm/releases/tag/v3.0.0
#
# We may need to remove old state and then import it according to
# https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/3.0-upgrade-guide#migrating-to-new--renamed-resources.
#
source = "hashicorp/azurerm"
version = "~> 2.99"
}

azuread = {
# ref: https://registry.terraform.io/providers/hashicorp/azuread/latest
source = "hashicorp/azuread"
version = "~> 2.35"
version = "~> 2.47"
}

kubernetes = {
# ref: https://registry.terraform.io/providers/hashicorp/kubernetes/latest
source = "hashicorp/kubernetes"
version = "~> 2.18"
version = "~> 2.25"
}

}
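
The provider pins above all use Terraform's pessimistic constraint operator `~>`, which lets only the rightmost stated component increase. That is why `~> 2.99` keeps azurerm on v2 while `~> 2.47` and `~> 2.25` still accept newer 2.x releases. The following is a minimal Python sketch of that rule, not Terraform's own implementation; the helper names `pessimistic_bounds` and `satisfies` are hypothetical, and pre-release suffixes are deliberately not handled.

```python
def _parse(version: str) -> tuple:
    """Split a dotted version such as '2.47' into a tuple of ints."""
    return tuple(int(part) for part in version.strip().split("."))


def _pad(parts: tuple, width: int = 3) -> tuple:
    """Right-pad with zeros so that 2.47 compares equal to 2.47.0."""
    return parts + (0,) * (width - len(parts))


def pessimistic_bounds(constraint: str) -> tuple:
    """Return (inclusive lower, exclusive upper) for a '~> X.Y[.Z]' constraint.

    Hypothetical helper: only the rightmost stated component may grow,
    so the upper bound bumps the component just before it.
    """
    text = constraint.strip()
    if not text.startswith("~>"):
        raise ValueError("only '~>' constraints are handled in this sketch")
    base = _parse(text[2:])
    if len(base) < 2:
        raise ValueError("'~>' needs at least two components in this sketch")
    upper = base[:-2] + (base[-2] + 1,)
    return _pad(base), _pad(upper)


def satisfies(version: str, constraint: str) -> bool:
    """Check whether a plain X.Y.Z version falls inside the constraint."""
    lower, upper = pessimistic_bounds(constraint)
    return lower <= _pad(_parse(version)) < upper


if __name__ == "__main__":
    # The three pins from terraform/azure/main.tf after this change.
    for pin in ("~> 2.99", "~> 2.47", "~> 2.25"):
        print(pin, pessimistic_bounds(pin))
```

Under this reading, moving azurerm to v3 requires editing the constraint itself, which matches the FIXME about the 3.0 upgrade guide.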
@@ -92,9 +95,8 @@ resource "azurerm_kubernetes_cluster" "jupyterhub" {

# Core node-pool
default_node_pool {
name = "core"
node_count = 1
# Unfortunately, changing anything about VM type / size recreates the *whole* cluster
name = "core"
vm_size = var.core_node_vm_size
os_disk_size_gb = 40
enable_auto_scaling = true
@@ -197,7 +199,7 @@ resource "azurerm_container_registry" "container_registry" {
name = var.global_container_registry_name
resource_group_name = azurerm_resource_group.jupyterhub.name
location = azurerm_resource_group.jupyterhub.location
sku = "premium"
sku = "Premium"
admin_enabled = true
}

29 changes: 16 additions & 13 deletions terraform/azure/projects/utoronto.tfvars
@@ -1,23 +1,26 @@
tenant_id = "78aac226-2f03-4b4d-9037-b46d56c55210"
subscription_id = "ead3521a-d994-4a44-a68d-b16e35642d5b"
resourcegroup_name = "2i2c-utoronto-cluster"


kubernetes_version = "1.26.3"
tenant_id = "78aac226-2f03-4b4d-9037-b46d56c55210"
subscription_id = "ead3521a-d994-4a44-a68d-b16e35642d5b"
resourcegroup_name = "2i2c-utoronto-cluster"
global_container_registry_name = "2i2cutorontohubregistry"
global_storage_account_name = "2i2cutorontohubstorage"
location = "canadacentral"

storage_size = 8192
ssh_pub_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQJ4h39UYNi1wybxAH+jCFkNK2aqRcuhDkQSMx0Hak5xkbt3KnT3cOwAgUP1Vt/SjhltSTuxpOHxiAKCRnjwRk60SxKhUNzPHih2nkfYTmBBjmLfdepDPSke/E0VWvTDIEXz/L8vW8aI0QGPXnXyqzEDO9+U1buheBlxB0diFAD3vEp2SqBOw+z7UgrGxXPdP+2b3AV+X6sOtd6uSzpV8Qvdh+QAkd4r7h9JrkFvkrUzNFAGMjlTb0Lz7qAlo4ynjEwzVN2I1i7cVDKgsGz9ZG/8yZfXXx+INr9jYtYogNZ63ajKR/dfjNPovydhuz5zQvQyxpokJNsTqt1CiWEUNj georgiana@georgiana"

ssh_pub_key = "ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDQJ4h39UYNi1wybxAH+jCFkNK2aqRcuhDkQSMx0Hak5xkbt3KnT3cOwAgUP1Vt/SjhltSTuxpOHxiAKCRnjwRk60SxKhUNzPHih2nkfYTmBBjmLfdepDPSke/E0VWvTDIEXz/L8vW8aI0QGPXnXyqzEDO9+U1buheBlxB0diFAD3vEp2SqBOw+z7UgrGxXPdP+2b3AV+X6sOtd6uSzpV8Qvdh+QAkd4r7h9JrkFvkrUzNFAGMjlTb0Lz7qAlo4ynjEwzVN2I1i7cVDKgsGz9ZG/8yZfXXx+INr9jYtYogNZ63ajKR/dfjNPovydhuz5zQvQyxpokJNsTqt1CiWEUNj georgiana@georgiana"

global_container_registry_name = "2i2cutorontohubregistry"
global_storage_account_name = "2i2cutorontohubstorage"
# FIXME: upgrade to 1.27.7, and then 1.28.3, based on the latest versions
# available via: az aks get-versions --location westus2 -o table
#
kubernetes_version = "1.26.3"

location = "canadacentral"
# FIXME: upgrade core_node_vm_size to Standard_E4s_v5
core_node_vm_size = "Standard_E4s_v3"

notebook_nodes = {
"default" : {
min : 1,
max : 100,
min : 0,
max : 86,
Member

Is there a specific reason 86 was chosen?

Member Author

@consideRatio consideRatio Jan 5, 2024

I updated those values to align with the current configuration, but I don't know the history on why they are configured to 0 - 86.

Member

@consideRatio ah, ok. Can you just add a comment about that here? Once that's done, this is good to go. Thanks for cleaning this up!

# FIXME: upgrade user nodes vm_size to Standard_E8s_v5
vm_size : "Standard_E8s_v3",
}
}
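
The `kubernetes_version` FIXME in this file plans two hops (1.26.3 to 1.27.7, then 1.28.3) rather than one, because AKS does not allow an upgrade to skip a minor version. A rough Python sketch of that planning step follows. The function name `plan_upgrades` is hypothetical, and the list of available versions is illustrative sample data, not real output of `az aks get-versions`.

```python
def _parse(version: str) -> tuple:
    """Turn '1.27.7' into (1, 27, 7) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def plan_upgrades(current: str, target: str, available: list) -> list:
    """List the versions to apply, one minor version at a time.

    Hypothetical helper: for each intermediate minor version it picks the
    highest available patch, and it finishes on the exact target version.
    """
    cur, tgt = _parse(current), _parse(target)
    offered = [_parse(v) for v in available]
    if tgt not in offered:
        raise ValueError(f"target {target} is not in the available list")
    if cur[0] != tgt[0] or tgt < cur:
        raise ValueError("this sketch only handles same-major upgrades")

    steps = []
    while cur[:2] < tgt[:2]:
        next_minor = (cur[0], cur[1] + 1)
        if next_minor == tgt[:2]:
            hop = tgt
        else:
            candidates = [v for v in offered if v[:2] == next_minor]
            if not candidates:
                raise ValueError(f"no version offered for minor {next_minor}")
            hop = max(candidates)
        steps.append(hop)
        cur = hop
    if cur != tgt:
        # Same minor version, newer patch: a single patch upgrade suffices.
        steps.append(tgt)
    return [".".join(str(part) for part in step) for step in steps]


if __name__ == "__main__":
    # Illustrative sample data only; check the real list for your region.
    sample_available = ["1.26.3", "1.26.10", "1.27.3", "1.27.7", "1.28.0", "1.28.3"]
    print(plan_upgrades("1.26.3", "1.28.3", sample_available))
```

Each returned version would be set as `kubernetes_version` and applied in turn, with the cluster verified between hops.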
7 changes: 7 additions & 0 deletions terraform/azure/proxycommand.py
@@ -1,4 +1,11 @@
#!/usr/bin/env python3
"""
This script can be used to migrate Azure Files storage from one cluster to
another.

Learn more at https://infrastructure.2i2c.org/hub-deployment-guide/hubs/other-hub-ops/move-hubs/across-clusters/#azure-files.
"""

import subprocess
import sys
import time
8 changes: 1 addition & 7 deletions terraform/azure/storage.tf
@@ -23,7 +23,7 @@ resource "azurerm_storage_share" "homes" {
name = "homes"
storage_account_name = azurerm_storage_account.homes.name
quota = var.storage_size
enabled_protocol = var.storage_protocol
enabled_protocol = "NFS"
Member

In particular, this is great! Let's rip out any other Azure Files mounting-related stuff we may have too (we now use NFS for mounting that as well).

lifecycle {
# Additional safeguard against deleting the share
# as this causes irreversible data loss!
@@ -34,9 +34,3 @@ resource "azurerm_storage_share" "homes" {
output "azure_fileshare_url" {
value = azurerm_storage_share.homes.url
}

resource "kubernetes_namespace" "homes" {
metadata {
name = "azure-file"
}
}