
Add LEAP hub #1074

Merged 14 commits into 2i2c-org:master on Mar 24, 2022
Conversation

yuvipanda (Member):
Ref #1050

sgibson91 (Member) left a comment:

One typo to fix and one non-blocking comment. Otherwise, LGTM!

@@ -0,0 +1,29 @@
name: pangeo-hubs

sgibson91 (Member):

Suggested change:
-name: pangeo-hubs
+name: leap

I think this should be leap, right?

yuvipanda (Member, Author):

Ah yes! Fixed.

GitHubOAuthenticator:
client_id: ENC[AES256_GCM,data:auopWrLSGIBDtQ4PrZZMYe5XFWM=,iv:+tzBIkE6R3PfJm7oYyJOq84yyD6tB3GXeQ++sYPU7S8=,tag:vQrhczBhRRaFoqqwRWeGHg==,type:str]
client_secret: ENC[AES256_GCM,data:xLL5GJTKSnucssmIQjVhCUwwXyZaYl54/+QzXPFx0dJpX63kaeJufw==,iv:2cyHZvDaoNQtlKiPKf2ACoNuvlww6WE7vcGG6jVXISI=,tag:IRcogW5ZDZWA6Pv8DyVcPg==,type:str]
oauth_callback_url: ENC[AES256_GCM,data:d+/oCcmELV7Tvfe86P4YH8DCnLHI0yid0WoUWH0IKT022b9Ba/rnptYp,iv:SVHQK5yK26JOHV6uWycsLYUk42g6Kl8RahOf5oMbqxc=,tag:HjT+HYo30qqP5yCKcgVZCQ==,type:str]
sgibson91 (Member):

I'm in two minds about having oauth_callback_url here. I know it makes the number of files you need to pass around nicer, but this value doesn't need to be encrypted really. And I did lose some time thinking that the uwhackweek hubs were misconfigured because of this pattern before.

No specific action to take in this PR, but I'd like to discuss as a team some guidelines around this for readability's sake.
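To make the suggestion concrete, here is a hedged sketch of the split being discussed. File names, the hub domain, and the placeholder values are all illustrative, not taken from this repository; only the OAuth client credentials stay in the sops-encrypted file, while the callback URL moves to plain-text config:

```yaml
# enc-staging.secret.yaml -- sops-encrypted; values shown as placeholders
jupyterhub:
  hub:
    config:
      GitHubOAuthenticator:
        client_id: <client-id>
        client_secret: <client-secret>
---
# staging.yaml -- plain text; the URL is illustrative, not the real hub domain
jupyterhub:
  hub:
    config:
      GitHubOAuthenticator:
        oauth_callback_url: https://staging.example.org/hub/oauth_callback
```

The callback URL is public by construction (GitHub redirects the browser to it), so keeping it readable avoids exactly the kind of misconfiguration hunt described above.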

yuvipanda (Member, Author):

@sgibson91 yeah I was thinking about that too. While keeping the number of files down helps, I think keeping it separate works alright for me too. But in that case I want to split it into three files - common, staging and prod, rather than just staging and prod. How does that sound?

sgibson91 (Member):

Sounds reasonable. The deployer will work regardless, so long as they're listed in the cluster.yaml file! 🎉

yuvipanda (Member, Author):

@sgibson91 I've split this now into two non-secret files, staging.yaml and prod.yaml.

yuvipanda mentioned this pull request Mar 10, 2022
yuvipanda (Member, Author):

@sgibson91 can you take another look with the changes I've made? It's all deployed now.

filestore_capacity_gb = 1024

user_buckets = [
  "pangeo-scratch"
]
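For context, a hedged sketch of how the user_buckets list presumably feeds into terraform. The refresh log later in this PR shows google_storage_bucket.user_buckets["pangeo-scratch"] with id leap-pangeo-scratch, which suggests a per-cluster prefix is prepended; the variable names other than user_buckets are guesses, not taken from this repository:

```hcl
variable "user_buckets" {
  type    = list(string)
  default = ["pangeo-scratch"]
}

# "prefix" and "region" are assumed variable names, not the actual ones.
resource "google_storage_bucket" "user_buckets" {
  for_each = toset(var.user_buckets)
  name     = "${var.prefix}-${each.key}" # yields "leap-pangeo-scratch" in the refresh log
  location = var.region
}
```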
sgibson91 (Member):

Didn't we establish elsewhere that this will create a bucket called pangeo-scratch, but JupyterHub is configured to look for a bucket called <cluster_name>-<hub_name>, and so this bucket will never get connected to the hub?

sgibson91 (Member):
See slack message: https://2i2c.slack.com/archives/C028WU9PFBN/p1637755725001500?thread_ts=1637754043.001100&cid=C028WU9PFBN

And what JupyterHub is expecting:

bucket_name = f'{project_id}-{release}-scratch-bucket'
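To make the mismatch concrete, a small sketch (the helper name is made up, and the release value is illustrative) of the two naming schemes side by side:

```python
# Hypothetical helper mirroring the pattern JupyterHub uses above.
def expected_scratch_bucket(project_id: str, release: str) -> str:
    return f"{project_id}-{release}-scratch-bucket"

# What JupyterHub would look for on this cluster (project "leap-pangeo",
# an illustrative release name "staging"):
print(expected_scratch_bucket("leap-pangeo", "staging"))
# -> leap-pangeo-staging-scratch-bucket

# ...while terraform created the bucket "leap-pangeo-scratch", so the two
# names never line up and the hub can't find its scratch bucket.
```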

yuvipanda (Member, Author):

@sgibson91 ah, good catch. See d83461a.

sgibson91 (Member):

Perfect!

sgibson91 (Member) left a comment:

LGTM! I have one question about the scratch buckets, but the rest is great.

yuvipanda (Member, Author):

@sgibson91 aha, looks like scratch bucket access doesn't actually work. Boo. I'll investigate.

sgibson91 (Member):

@yuvipanda this issue may be relevant #1046

- Assign one user per node, using https://learnk8s.io/kubernetes-instance-calculator
  to calculate how big to make guarantees so pods stick one to a node. This
  provides a reasonable tradeoff for research use cases, I think, although it
  should still be discussed.
- We aren't using config-connector anymore, so these need to be set explicitly.
- GCS storage protocol is gs, not gcs.
- Just hardcode the bucket name here, as env var substitution relies on
  ordering of env vars and that is just a bit messy.
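The last two commit messages can be illustrated with a hedged config sketch. The env var name, the exact value, and its placement under singleuser.extraEnv are assumptions inferred from the commits, not the actual file contents:

```yaml
jupyterhub:
  singleuser:
    extraEnv:
      # The GCS protocol is gs://, not gcs://, and the bucket name is written
      # out literally rather than assembled from other env vars, because
      # Kubernetes $(VAR) substitution depends on env var ordering.
      SCRATCH_BUCKET: gs://leap-pangeo-scratch/$(JUPYTERHUB_USER)
```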
yuvipanda merged commit 70b07c5 into 2i2c-org:master Mar 24, 2022
yuvipanda (Member, Author):

Thanks for the review, @damianavila and @sgibson91! This also doesn't have fully functional scratch buckets, something that'll be fixed in #1130.

yuvipanda (Member, Author):

I also ran a terraform apply and verified it is as clean as it can be. There are still changes see-sawing (with no actual effect) that will be fixed in #1130.

google_artifact_registry_repository.registry: Refreshing state... [id=projects/leap-pangeo/locations/us-central1/repositories/leap-registry]
google_service_account.cluster_sa: Refreshing state... [id=projects/leap-pangeo/serviceAccounts/[email protected]]
google_service_account.cd_sa: Refreshing state... [id=projects/leap-pangeo/serviceAccounts/[email protected]]
google_project_iam_custom_role.identify_project_role: Refreshing state... [id=projects/leap-pangeo/roles/leap_user_sa_role]
google_storage_bucket.user_buckets["pangeo-scratch"]: Refreshing state... [id=leap-pangeo-scratch]
google_project_iam_member.cd_sa_roles["roles/container.admin"]: Refreshing state... [id=leap-pangeo/roles/container.admin/serviceAccount:[email protected]]
google_project_iam_member.cd_sa_roles["roles/artifactregistry.writer"]: Refreshing state... [id=leap-pangeo/roles/artifactregistry.writer/serviceAccount:[email protected]]
google_project_iam_member.cluster_sa_roles["roles/monitoring.metricWriter"]: Refreshing state... [id=leap-pangeo/roles/monitoring.metricWriter/serviceAccount:[email protected]]
google_project_iam_member.cluster_sa_roles["roles/logging.logWriter"]: Refreshing state... [id=leap-pangeo/roles/logging.logWriter/serviceAccount:[email protected]]
google_service_account_key.cd_sa: Refreshing state... [id=projects/leap-pangeo/serviceAccounts/[email protected]/keys/8192f1f77328895d53a3c3c2cb53cefa78cb4cc5]
google_project_iam_member.cluster_sa_roles["roles/monitoring.viewer"]: Refreshing state... [id=leap-pangeo/roles/monitoring.viewer/serviceAccount:[email protected]]
google_project_iam_member.cluster_sa_roles["roles/stackdriver.resourceMetadata.writer"]: Refreshing state... [id=leap-pangeo/roles/stackdriver.resourceMetadata.writer/serviceAccount:[email protected]]
google_project_iam_member.cluster_sa_roles["roles/artifactregistry.reader"]: Refreshing state... [id=leap-pangeo/roles/artifactregistry.reader/serviceAccount:[email protected]]
google_storage_bucket_iam_member.member["pangeo-scratch"]: Refreshing state... [id=b/leap-pangeo-scratch/roles/storage.admin/serviceAccount:[email protected]]
google_filestore_instance.homedirs[0]: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/instances/leap-homedirs]
google_container_cluster.cluster: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster]
google_project_iam_member.identify_project_binding: Refreshing state... [id=leap-pangeo/projects/leap-pangeo/roles/leap_user_sa_role/serviceAccount:[email protected]]
google_container_node_pool.dask_worker["huge"]: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/dask-huge]
google_container_node_pool.notebook["large"]: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/nb-large]
google_container_node_pool.dask_worker["medium"]: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/dask-medium]
google_container_node_pool.dask_worker["large"]: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/dask-large]
google_container_node_pool.notebook["huge"]: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/nb-huge]
google_container_node_pool.dask_worker["small"]: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/dask-small]
google_container_node_pool.notebook["medium"]: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/nb-medium]
google_container_node_pool.core: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/core-pool]
google_container_node_pool.notebook["small"]: Refreshing state... [id=projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/nb-small]

Note: Objects have changed outside of Terraform

Terraform detected the following changes made outside of Terraform since the
last "terraform apply":

  # google_container_cluster.cluster has changed
  ~ resource "google_container_cluster" "cluster" {
        id   = "projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster"
        name = "leap-cluster"
        # (26 unchanged attributes hidden)

      ~ node_pool {
            name                        = "nb-small"
          ~ node_count                  = 0 -> 1
            # (6 unchanged attributes hidden)

            # (4 unchanged blocks hidden)
        }

        # (22 unchanged blocks hidden)
    }

  # google_container_node_pool.notebook["small"] has changed
  ~ resource "google_container_node_pool" "notebook" {
        id         = "projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/nb-small"
        name       = "nb-small"
      ~ node_count = 0 -> 1
        # (8 unchanged attributes hidden)

        # (4 unchanged blocks hidden)
    }

Unless you have made equivalent changes to your configuration, or ignored the
relevant attributes using ignore_changes, the following plan may include
actions to undo or respond to these changes.

─────────────────────────────────────────────────────────────────────────────

Terraform used the selected providers to generate the following execution
plan. Resource actions are indicated with the following symbols:
~ update in-place

Terraform will perform the following actions:

  # google_container_node_pool.dask_worker["huge"] will be updated in-place
  ~ resource "google_container_node_pool" "dask_worker" {
        id   = "projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/dask-huge"
        name = "dask-huge"
        # (9 unchanged attributes hidden)

      ~ node_config {
            tags              = []
            # (12 unchanged attributes hidden)

          ~ workload_metadata_config {
              ~ mode = "GKE_METADATA" -> "MODE_UNSPECIFIED"
            }
            # (1 unchanged block hidden)
        }

        # (3 unchanged blocks hidden)
    }

  # google_container_node_pool.dask_worker["large"] will be updated in-place
  ~ resource "google_container_node_pool" "dask_worker" {
        id   = "projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/dask-large"
        name = "dask-large"
        # (9 unchanged attributes hidden)

      ~ node_config {
            tags              = []
            # (12 unchanged attributes hidden)

          ~ workload_metadata_config {
              ~ mode = "GKE_METADATA" -> "MODE_UNSPECIFIED"
            }
            # (1 unchanged block hidden)
        }

        # (3 unchanged blocks hidden)
    }

  # google_container_node_pool.dask_worker["medium"] will be updated in-place
  ~ resource "google_container_node_pool" "dask_worker" {
        id   = "projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/dask-medium"
        name = "dask-medium"
        # (9 unchanged attributes hidden)

      ~ node_config {
            tags              = []
            # (12 unchanged attributes hidden)

          ~ workload_metadata_config {
              ~ mode = "GKE_METADATA" -> "MODE_UNSPECIFIED"
            }
            # (1 unchanged block hidden)
        }

        # (3 unchanged blocks hidden)
    }

  # google_container_node_pool.dask_worker["small"] will be updated in-place
  ~ resource "google_container_node_pool" "dask_worker" {
        id   = "projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/dask-small"
        name = "dask-small"
        # (9 unchanged attributes hidden)

      ~ node_config {
            tags              = []
            # (12 unchanged attributes hidden)

          ~ workload_metadata_config {
              ~ mode = "GKE_METADATA" -> "MODE_UNSPECIFIED"
            }
            # (1 unchanged block hidden)
        }

        # (3 unchanged blocks hidden)
    }

  # google_container_node_pool.notebook["huge"] will be updated in-place
  ~ resource "google_container_node_pool" "notebook" {
        id   = "projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/nb-huge"
        name = "nb-huge"
        # (9 unchanged attributes hidden)

      ~ node_config {
            tags              = []
            # (12 unchanged attributes hidden)

          ~ workload_metadata_config {
              ~ mode = "GKE_METADATA" -> "MODE_UNSPECIFIED"
            }
            # (1 unchanged block hidden)
        }

        # (3 unchanged blocks hidden)
    }

  # google_container_node_pool.notebook["large"] will be updated in-place
  ~ resource "google_container_node_pool" "notebook" {
        id   = "projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/nb-large"
        name = "nb-large"
        # (9 unchanged attributes hidden)

      ~ node_config {
            tags              = []
            # (12 unchanged attributes hidden)

          ~ workload_metadata_config {
              ~ mode = "GKE_METADATA" -> "MODE_UNSPECIFIED"
            }
            # (1 unchanged block hidden)
        }

        # (3 unchanged blocks hidden)
    }

  # google_container_node_pool.notebook["medium"] will be updated in-place
  ~ resource "google_container_node_pool" "notebook" {
        id   = "projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/nb-medium"
        name = "nb-medium"
        # (9 unchanged attributes hidden)

      ~ node_config {
            tags              = []
            # (12 unchanged attributes hidden)

          ~ workload_metadata_config {
              ~ mode = "GKE_METADATA" -> "MODE_UNSPECIFIED"
            }
            # (1 unchanged block hidden)
        }

        # (3 unchanged blocks hidden)
    }

  # google_container_node_pool.notebook["small"] will be updated in-place
  ~ resource "google_container_node_pool" "notebook" {
        id   = "projects/leap-pangeo/locations/us-central1-b/clusters/leap-cluster/nodePools/nb-small"
        name = "nb-small"
        # (9 unchanged attributes hidden)

      ~ node_config {
            tags              = []
            # (12 unchanged attributes hidden)

          ~ workload_metadata_config {
              ~ mode = "GKE_METADATA" -> "MODE_UNSPECIFIED"
            }
            # (1 unchanged block hidden)
        }

        # (3 unchanged blocks hidden)
    }

Plan: 0 to add, 8 to change, 0 to destroy.

Warning: Argument is deprecated

with google_filestore_instance.homedirs,
on storage.tf line 4, in resource "google_filestore_instance" "homedirs":
4: zone = var.zone

Deprecated in favor of location.

(and one more similar warning elsewhere)
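The deprecation warning points at a one-line fix in storage.tf. A sketch of what that change would likely look like (surrounding attributes omitted; only the argument name changes, the value stays the same):

```hcl
resource "google_filestore_instance" "homedirs" {
  # "zone" is deprecated on this resource; newer google provider versions
  # accept the same value via "location".
  location = var.zone
  # ...
}
```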

─────────────────────────────────────────────────────────────────────────────

Note: You didn't use the -out option to save this plan, so Terraform can't
guarantee to take exactly these actions if you run "terraform apply" now.
