This repository has been archived by the owner on Jul 5, 2021. It is now read-only.

Leverage GKE Workload Identity as Authentication for SecretStores #149

james-mcgoodwin opened this issue Jan 13, 2021 · 2 comments

james-mcgoodwin commented Jan 13, 2021

Describe the solution you'd like
Google Kubernetes Engine incorporates an authentication mechanism called Workload Identity (WID) that seamlessly allows pods running under a Kubernetes Service Account (KSA) to interact with GCP as a Google Service Account (GSA).

This is useful because it removes the need to pre-populate a namespace with a secret containing a GSA token as authentication.

Instead, the k8s application can do its work (such as fetching secrets) in GCP through the WID binding configured between that app's KSA and GSA.

What is the added value?
The benefit is that sensitive Google service account tokens no longer need to be managed. The SecretStore instead relies on the GCP WID authentication method to fetch only the secrets it is authorized to access.

In the case of my company, we have multi-stage automation: a first stage produces the GSA, KSA, WID binding, and secrets (via Terraform), and a second stage produces the k8s namespace and app deployment (via Argo CD and Helm).

That automation makes it difficult to extract the GSA token and provide it as a Kubernetes Secret inline with the deployment of an application.

I'm investigating GoDaddy Kubernetes External Secrets as an alternative because the requirement to provide tokens is prohibitive in our case.

Aside: My findings for KES are less than satisfying, since it seems easy to accidentally provide access to ALL secrets, whereas ESO can be secured on a more granular basis.

Give us examples of the outcome
For example, if we have three microservices, MS_A, MS_B, MS_C, each with:

  • its own namespace (ms_a_namespace, etc.)
  • its own KSA (ms_a_sa)
  • its own GSA (ms_a)
  • a role binding between GSA and KSA via WID
  • its own secret in Google Secret Manager (GSM)

then I would want to deploy a SecretStore into each NS and rely on Workload Identity to authenticate and fetch the secret for MS_A into its namespace, without granting it access to the secret for MS_B and without the need to deploy the GSA token.

So in this example, for MS_A, I would have the following:

  • IN GCP
    • A secret named 'ms-a-password'
    • A GCP role binding on the above GSA granting 'roles/iam.workloadIdentityUser' to the member: serviceAccount:myproject.svc.id.goog[ms_a_namespace/ms_a_sa]
  • IN GKE
    • A namespace named 'ms_a_namespace'
    • A KSA named 'ms_a_sa'
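
The member string in the role binding above follows a fixed pattern, so it could be generated per microservice. A minimal Python sketch, assuming the names from this example; `workload_identity_member` and `workload_identity_binding` are hypothetical helpers, not part of ESO or any Google library:

```python
def workload_identity_member(project_id: str, namespace: str, ksa_name: str) -> str:
    """Build the IAM member that identifies a KSA through Workload Identity."""
    return f"serviceAccount:{project_id}.svc.id.goog[{namespace}/{ksa_name}]"


def workload_identity_binding(project_id: str, namespace: str, ksa_name: str) -> dict:
    """Return an IAM policy binding (role plus members) to attach to the GSA."""
    return {
        "role": "roles/iam.workloadIdentityUser",
        "members": [workload_identity_member(project_id, namespace, ksa_name)],
    }


if __name__ == "__main__":
    # Names taken from the MS_A example; swap in the real project and KSA.
    print(workload_identity_binding("myproject", "ms_a_namespace", "ms_a_sa"))
```

The same call with the MS_B names would yield the binding for the second microservice, which keeps the two GSAs bound to exactly one KSA each.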

And with this in place, the ideal is that I would be able to submit a SecretStore manifest like this:

apiVersion: store.externalsecret-operator.container-solutions.com/v1alpha1
kind: SecretStore
metadata:
  name: ms-a-secretstore
  namespace: ms-a-namespace
spec:
  controller: staging
  store:
    type: gsm
    auth:
      workloadIdentity:
        serviceaccount: ms_a_sa
    parameters:
      projectID: myproject

And I would no longer provide the GSA token as a Kubernetes Secret.

But nothing would change with the ExternalSecret manifest. I would simply request the external secret named 'ms-a-password' from GSM.

I could then have exactly the same above setup, except for microservice_b instead, and deploy this SecretStore into the 'ms_b_namespace':

apiVersion: store.externalsecret-operator.container-solutions.com/v1alpha1
kind: SecretStore
metadata:
  name: ms-b-secretstore
  namespace: ms-b-namespace
spec:
  controller: staging
  store:
    type: gsm
    auth:
      workloadIdentity:
        serviceaccount: ms_b_sa
    parameters:
      projectID: myproject

Which would permit that namespace to access the secret 'ms-b-password'.

But I would NOT be able to request the secret 'ms-a-password' within this namespace because the ms-b GSA is not permitted to read it.
And vice versa for ms-a.
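
The isolation described here would come from IAM on the GSM side, not from anything in the SecretStore itself. A toy Python model of that check, assuming each GSA is granted `roles/secretmanager.secretAccessor` only on its own secret; the `ACCESSORS` table, the GSA name `ms_b`, and `can_read` are hypothetical and only stand in for real IAM policy evaluation:

```python
# Secret name -> GSAs assumed to hold roles/secretmanager.secretAccessor on it.
# This table is illustrative; real grants live in the IAM policy of each secret.
ACCESSORS = {
    "ms-a-password": {"ms_a"},
    "ms-b-password": {"ms_b"},
}


def can_read(gsa: str, secret: str) -> bool:
    """Return True if the GSA is listed as an accessor of the secret."""
    return gsa in ACCESSORS.get(secret, set())


if __name__ == "__main__":
    for gsa in ("ms_a", "ms_b"):
        for secret in ACCESSORS:
            print(gsa, secret, can_read(gsa, secret))
```

Under this model each namespace resolves only its own secret, and a request for any unlisted secret is denied by default.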

Observations (Constraints, Context, etc):

Obviously this would only provide value in a GCP environment, but I hope it's a compelling idea.
One concern that strikes me here is which namespace the actual calls to GCP come from. ESO creates its own NS for its controller, 'externalsecret-operator-system'. WID breaks down if the API calls are made from that namespace, unless the origin namespace can use WID to fetch its key and then provide that to the controller... hmm.

@knelasevero
Contributor

Thanks a lot for submitting this issue! I will bring it to our internal discussions to brainstorm some of its pros and cons. But in general I really like the idea. As you say, there are already some blockers to using WID in the way that you propose, because the controller lives in the 'externalsecret-operator-system' namespace, but with enough thought something interesting can probably come out of this premise!

I have a question for you: Would you have the time and would you be willing to help us implement/test/validate this? That would be awesome.

@james-mcgoodwin
Author

I'd be willing to help test and validate, sure!

That said, I'm not sure how well I'd be able to chip in on implementation. I've basically only edited/hacked one other Go program to interact with GSM before, and that only downloads to files in a pod. It doesn't interact with K8s controllers or APIs.

I don't know enough about how to write a controller in k8s. And my ignorance means I have thoughts about how to do this, but no real intuition about what would/would-not work.

For example, I don't know if the API call to GSM can be spoofed to appear to come from a different namespace. I know that I hope that it cannot be spoofed. Otherwise what's the point of the NS isolation model?

The only other notion I had was a pod that lives in each namespace wishing to leverage ESO. This pod would use WID in that namespace to pull the GSA's token and write it into the NS as the secret that's already being ingested by the controller. That would keep NS isolation intact. But I have concerns about the lifespan of that fetched WID-enabled GSA token and how frequently it would have to be re-fetched.
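
On the lifespan concern: on GKE with WID enabled, a pod can ask the metadata server for a short-lived access token for its bound GSA, and the JSON response carries an `expires_in` field in seconds. A Python sketch of the bookkeeping such a per-namespace pod might need; `parse_token_response` and `needs_refresh` are hypothetical helpers, the 300-second margin is an arbitrary assumption, and whether the controller could consume such a token at all is not established here:

```python
import json
import urllib.request
from typing import Tuple

# GCE/GKE metadata server endpoint for the default service account token.
TOKEN_URL = (
    "http://metadata.google.internal/computeMetadata/v1/"
    "instance/service-accounts/default/token"
)


def fetch_token_response() -> str:
    """Fetch the raw token JSON; only reachable from inside GCE/GKE."""
    req = urllib.request.Request(TOKEN_URL, headers={"Metadata-Flavor": "Google"})
    with urllib.request.urlopen(req, timeout=5) as resp:
        return resp.read().decode("utf-8")


def parse_token_response(body: str, now: float) -> Tuple[str, float]:
    """Return (access_token, absolute expiry time) from the token JSON."""
    data = json.loads(body)
    return data["access_token"], now + float(data["expires_in"])


def needs_refresh(expires_at: float, now: float, margin: float = 300.0) -> bool:
    """True once the token is within `margin` seconds of expiring."""
    return now >= expires_at - margin
```

Tokens from this endpoint are typically valid for about an hour, so a pod following this approach would have to rewrite the namespace secret well inside that window, which is the re-fetch frequency in question.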

amouat pushed a commit that referenced this issue May 26, 2021
feat(docs): add basic docs for vault