Debug why helm chart/config connector doesn't create a new service account #669
Comments
I tried deploying this in two-eye-two-see-sandbox and the [email protected] service account is not created there either. So this is a general problem, not Pangeo-specific.
Looking quickly at the pangeo yaml files:
I do not see
in those YAML files, as I do, for instance, in the 2i2c cluster one: https://github.com/2i2c-org/pilot-hubs/blob/master/config/hubs/2i2c.cluster.yaml#L63-L70
And that "key" is expected to be
Am I missing some other Pangeo config file where
Sorry, it is not yet in the config file in master. But when I DO deploy with that config present, it doesn't work. I will share the config I deployed to set up the test cluster in
Config file I used to deploy a test hub onto the cluster - you can see the
I left this up overnight as I saw in the config connector docs
which led me to wonder if I was just being impatient again 😉 However, you can see here that there is no service account with the expected name
There should be pods in the
On the Pangeo hubs, I manually created the
Hopefully, once I have fought #653 and #741, I will have some more time to come back to this.
I have deployed the cluster and hub again. In the namespace
Log output grepped for `serviceaccount`
Other things to think about to get this working on Pangeo cluster:
Update: plan to use Terraform
In a team meeting, we discussed our options here and decided that using Terraform, rather than the GCP "ConfigConnector" service, is the right way to go, because this will be easier to do quickly.
The GKE config connector was helpful in letting us deploy Google Cloud Service Accounts with permissions for cloud storage directly from helm. However, it has been difficult to debug, and in 2i2c-org#669 we decided to move away from it and towards creating these cloud resources via Terraform.

This commit adds:

- Terraform code that will create a Google Service Account and bind it to a given Kubernetes Service Account, for each hub namespace in a list that is passed in. This means that some initial hub deployments now *cannot be done just with CD*, but need manual work with Terraform. I think this would be any hub that wants to use requester pays or scratch buckets. This would need to be documented.
- Move meom-ige to use this new scheme. Metadata concealment (https://cloud.google.com/kubernetes-engine/docs/how-to/protecting-cluster-metadata#concealment), which is what we were using earlier as an alternative to config-connector + workload identity, is no longer supported by the Terraform Google provider. In b7b42ce, we changed the default from 'SECURE' to 'UNSPECIFIED', but it looks like 'UNSPECIFIED' really means 'use workload identity' haha. When 2i2c-org#1124 was deployed to meom-ige yesterday, it seems to have enabled workload identity, causing cloud access to stop working, leading to https://2i2c.freshdesk.com/a/tickets/107. Further investigation on what happened here is needed, but I've currently fixed it by just deploying this change for meom-ige.
- All hubs are given access to all buckets we create. This is inadequate, and needs to be more fine-grained.

Ref 2i2c-org#669
Ref 2i2c-org#1046
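To make the shape of that binding concrete, here is a minimal Python sketch (not the Terraform code from the commit) of the names involved for each hub namespace. The `<namespace>-user-sa` pattern follows the `staging-user-sa` account named in this issue; the Kubernetes Service Account name `user-sa` and the helper names `hub_binding` and `plan_bindings` are assumptions made purely for illustration. The `PROJECT.svc.id.goog[NAMESPACE/KSA]` member format and the `iam.gke.io/gcp-service-account` annotation are the standard GKE workload identity conventions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HubBinding:
    namespace: str
    gsa_email: str        # Google Service Account to create
    wi_member: str        # IAM member granted roles/iam.workloadIdentityUser on the GSA
    ksa_annotation: dict  # annotation expected on the Kubernetes Service Account


def hub_binding(project_id: str, namespace: str, ksa_name: str = "user-sa") -> HubBinding:
    """Compute the identities that tie one hub namespace to its GSA.

    Assumption: the GSA is named "<namespace>-user-sa" and the KSA is
    called "user-sa"; both names are illustrative, not taken from the repo.
    """
    gsa_email = f"{namespace}-user-sa@{project_id}.iam.gserviceaccount.com"
    wi_member = f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"
    annotation = {"iam.gke.io/gcp-service-account": gsa_email}
    return HubBinding(namespace, gsa_email, wi_member, annotation)


def plan_bindings(project_id: str, namespaces: list[str]) -> list[HubBinding]:
    """One binding per hub namespace, mirroring the list passed to Terraform."""
    return [hub_binding(project_id, ns) for ns in namespaces]


if __name__ == "__main__":
    for binding in plan_bindings("my-project", ["staging", "prod"]):
        print(binding.gsa_email, "<-", binding.wi_member)
```

If the annotation on the Kubernetes Service Account and the IAM member on the Google Service Account do not both match these strings, pods in that namespace will typically fail to obtain cloud credentials, which is one thing to check when cloud access stops working.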
This is done except for #1153! \o/ Thank you for blazing the trail here, @sgibson91!
Description
While working on #662, it came to light that this should already be automated as part of our helm chart by including the appropriate config in the hub config file. However, when I tried to enable this for the Pangeo cluster, the service account (staging-user-sa@{{ pangeo project ID }}.iam.gserviceaccount.com) doesn't get created as expected. We need to work out why so we can automate this.
Annoyingly, there's no error message to help debug. I know it's not working because I can see from the Google Cloud IAM page that the service account is not being created.
For now, requester pays access for Pangeo is manually enabled so this isn't a blocking issue.
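Since nothing reports an error, the check against the IAM page could also be scripted. The following is a rough sketch that shells out to `gcloud iam service-accounts describe`; it assumes `gcloud` is installed and authenticated, and the function names and the injectable `run` parameter are illustrative choices only.

```python
import subprocess


def describe_command(email: str, project_id: str) -> list[str]:
    """Build the gcloud invocation that looks up a single service account."""
    return ["gcloud", "iam", "service-accounts", "describe", email,
            "--project", project_id]


def service_account_exists(email: str, project_id: str, run=subprocess.run) -> bool:
    """Return True if gcloud can describe the account.

    A non-zero exit code is treated as "not found", but it can also mean
    missing credentials or permissions, so a False result is only a hint.
    """
    result = run(describe_command(email, project_id),
                 capture_output=True, text=True)
    return result.returncode == 0
```

Calling `service_account_exists` with the expected `staging-user-sa` address and the project ID after each deploy would give a quick yes/no answer without opening the console.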
Value / benefit
Automating this deployment will save us a lot of manual busywork :)
Implementation details
No response
Tasks to complete
two-eye-two-see-sandbox. This will help us figure out if it's a bug due to the special-casing of Pangeo or if it's a more general broken piece of config
Updates
two-eye-two-see-sandbox and the [email protected] service account is not created there either. So this is a general problem.