This repo provides a way to add support for many different DNS providers to Tanzu Platform. This method is not officially supported by Tanzu; it falls under the "bring your own DNS provider" category.
This is a wrapper around the OSS project external-dns. It allows DNS (GSLB) entries to be created in any of the external-dns providers, using domain bindings as the source. There are two components. The first is external-dns itself, deployed as a capability and configured for whichever provider is required. Its config is set to watch only CRDs, so it will not watch services, ingresses, etc. The second is a controller that runs in a space and requires the external-dns capability. This controller watches spaces for domain bindings and creates DNSEndpoint CRs in the space based on those domain bindings. external-dns then watches those CRs and handles creating, updating, and deleting the entries in the provider.
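To make the flow concrete, the sketch below prints the kind of DNSEndpoint resource the controller might create for a domain binding. The apiVersion, kind, and spec fields follow the external-dns CRD source. The function name, resource name, FQDN, TTL, and IP address are hypothetical, and the exact fields this controller sets are an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a DNSEndpoint manifest for one FQDN and one or
# more target IPs. Names and values are illustrative, not taken from this repo.
render_dnsendpoint() {
  local name="$1" fqdn="$2"
  shift 2
  cat <<EOF
apiVersion: externaldns.k8s.io/v1alpha1
kind: DNSEndpoint
metadata:
  name: ${name}
spec:
  endpoints:
    - dnsName: ${fqdn}
      recordType: A
      recordTTL: 300
      targets:
EOF
  # Append one list entry per remaining argument (the target addresses).
  local target
  for target in "$@"; do
    printf '        - %s\n' "$target"
  done
}

render_dnsendpoint demo-binding app.example.com 203.0.113.10
```

A resource of this shape, once present in the space, is what external-dns would pick up through its CRD source and translate into a record at the provider.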
IMPORTANT: This is only meant to be used in a space with 1 replica and with a cluster group that has only 1 cluster. See the FAQ for more details.
Due to limitations in TPK8s today, a workaround is needed to add a custom package repo that contains custom capabilities. These instructions outline that workaround. The platform engineer performs it only once per cluster group; after the repo is added, everything can be done through TPK8s normally. The workaround allows the pkgr to be installed automatically on the cluster group, which compensates for the lack of package repo syncing today and saves the user from manually creating pkgrs on individual clusters. The workaround does the following:
- adds the package repo to the TPK8s project
- adds the package repo to the TPK8s cluster group
This sets up the project to have access to the custom pkgr so that it will show up in the UI when looking for capabilities.
- add the package repo to the project
tanzu project use <your-proj>
export KUBECONFIG=~/.config/tanzu/kube/config
kubectl apply -f tpk8s-resources/tpk8s-dns-repo.yml
This is needed because the cluster group must have access to the pkgr.
**Make sure your cluster group only contains 1 cluster.**
- add pkgr to the cluster group
tanzu ops clustergroup use <your-cg>
export KUBECONFIG=~/.config/tanzu/kube/config
kubectl apply -f tpk8s-resources/tpk8s-dns-repo.yml
In this example we will use Azure. Full steps are found here.
Copy values-example.yml to capability-values.yml and update the contents. This values file is made to allow any of the supported providers in the format shown here and here, with Azure as the example. Then run the command below.
tanzu ops clustergroup use <your-cg>
export KUBECONFIG=~/.config/tanzu/kube/config
ytt -f templated-resources/external-dns-values.yml --data-values-file capability-values.yml | kubectl apply -f-
This can be done through the UI or the API. The steps below use the CLI/API so that they can be easily reproduced. This assumes you already have an availability target.
- install the capability on the cluster group
tanzu ops clustergroup use <your-cg>
export KUBECONFIG=~/.config/tanzu/kube/config
kubectl apply -f tpk8s-resources/dns-capability.yml
- create a profile for the tpk8s-dns controller
tanzu project use <your-project>
export KUBECONFIG=~/.config/tanzu/kube/config
kubectl apply -f tpk8s-resources/profile.yml
- create a space using the profile. You will need to update the availability target in the YAML below, as well as any profiles you need
tanzu project use <your-project>
export KUBECONFIG=~/.config/tanzu/kube/config
kubectl apply -f tpk8s-resources/space.yml
- add egress. Be sure to update this if you are using TPSM
tanzu space use <your-space>
export KUBECONFIG=~/.config/tanzu/kube/config
kubectl apply -f tpk8s-resources/egress.yml
- connect to your project and the space that was previously created
tanzu project use <project>
tanzu space use tpk8s-dns-controller-space
- copy templated-resources/secret-example.yml into the .tanzu/config directory and rename it secret.yml
- update all of the values in secret.yml. If you are running TPSM, be sure to update the TPSM-specific field and remove the SaaS ones from the secret
- deploy the app
tanzu deploy
You can check the logs of the controller pod in the cluster, along with the external-dns logs, to make sure it is working.
When running self-managed, this can be deployed in the self-managed control plane cluster. If this approach is not preferred, the other approach can still be used with TPSM. This approach could also be used for a generic install on a non-managed cluster.
All of the steps below should be run against the TPSM cluster.
Copy values-example.yml to capability-values.yml and update the contents. This values file is made to allow any of the supported providers in the format shown here and here, with Azure as the example. Then run the command below.
ytt -f templated-resources/external-dns-values.yml --data-values-file capability-values.yml | yq '.stringData.["values.yml"]' > helm-values.yml
kubectl create ns external-dns
helm install external-dns oci://registry-1.docker.io/bitnamicharts/external-dns -f helm-values.yml -n external-dns
Unlike the on-platform install, this install simply uses a k8s deployment to run the app.
- copy templated-resources/secret-example.yml into the controller-deploy directory and rename it secret.yml
- update all of the values in secret.yml
- deploy the app
This is due to a limitation in external-dns and the way TPK8s expects to work. external-dns uses a txt-owner-id to determine which records it owns. When deployed into a cluster group as a capability, this owner is the same on every cluster, which means every cluster in the cluster group is trying to update the same domain. When a cluster is running external-dns but does not have a space replica on it, it will think there are no DNS entries and remove them. This creates a constant conflict between clusters trying to delete and update records. You could have the same number of space replicas as clusters and avoid this issue, but that could lead to other issues if a space is rescheduled, etc. For this reason the safest way to deploy this is to ensure the cluster group has only 1 cluster and the space has only 1 replica.
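The conflict can be illustrated with a toy model, which is not external-dns code. A registry file lists each record with the owner id that created it, and each cluster plans to delete any record it owns that is missing from its own desired list. The function name, file layout, and owner id are all hypothetical.

```shell
#!/usr/bin/env bash
# Toy model of ownership-based cleanup (not external-dns code).
# registry: lines of "<owner> <fqdn>"; desired: one fqdn per line.
# Prints the records an instance with the given owner id would delete.
plan_deletes() {
  local owner="$1" registry="$2" desired="$3"
  local rec_owner fqdn
  while read -r rec_owner fqdn; do
    [ "$rec_owner" = "$owner" ] || continue      # not ours, leave it alone
    grep -qxF "$fqdn" "$desired" || echo "$fqdn" # ours but not wanted here
  done < "$registry"
}

workdir="$(mktemp -d)"
printf 'shared-id app.example.com\n' > "$workdir/registry"
printf 'app.example.com\n' > "$workdir/cluster-a-desired"  # has the space replica
: > "$workdir/cluster-b-desired"                            # no replica, no records

echo "cluster A deletes: $(plan_deletes shared-id "$workdir/registry" "$workdir/cluster-a-desired")"
echo "cluster B deletes: $(plan_deletes shared-id "$workdir/registry" "$workdir/cluster-b-desired")"
```

In this model cluster A plans no deletes, while cluster B, which shares the owner id but has no replica, plans to delete app.example.com. With a distinct owner id cluster B would leave the record alone, but the capability uses the same id across the cluster group.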
This takes a careful approach in order to not cause DNS downtime. First, edit your values file for the external-dns capability and set the policy to upsert-only; this will prevent records from being deleted. Next, create a new cluster in your cluster group. Because it's in upsert-only mode, the new cluster's external-dns controller will not try to delete records. Next, scale your space to 2 replicas. At this point both clusters should be syncing the records and there should be no errors. You can now cordon and drain the old cluster. Lastly, update the policy back to sync.
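For the cordon and drain step, a small helper along these lines may be useful. It is only a sketch: it assumes kubectl's current context points at the old cluster, the node names are placeholders, and the function names and the DRY_RUN variable are hypothetical. It defaults to printing the commands instead of running them.

```shell
#!/usr/bin/env bash
# Hypothetical helper for the "cordon and drain the old cluster" step.
# Assumes kubectl's current context points at the OLD cluster.
# DRY_RUN defaults to 1, so commands are printed rather than executed.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

drain_nodes() {
  local node
  for node in "$@"; do
    run kubectl cordon "$node"
    run kubectl drain "$node" --ignore-daemonsets --delete-emptydir-data
  done
}

# Placeholder node names; list the real ones with `kubectl get nodes`.
drain_nodes old-node-1 old-node-2
```

Setting DRY_RUN=0 runs the commands for real. Reviewing the printed plan first fits the careful approach described here, since draining before both clusters are syncing records could cause the downtime this procedure is meant to avoid.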
This is an underlying limitation of external-dns. It does not support running multiple replicas, so running this with multiple replicas is not recommended.
It is recommended to run this with rollingUpdate as the strategy for spaces. If this is deployed with external-dns using policy: sync, there is a possibility of a record being deleted and then recreated, causing a brief period of downtime. If this is run with policy: upsert-only, it will not delete records, so there is no risk of record recreation; however, stale records will not be removed and deletes will need to be handled manually.
Certain cloud providers like EKS use hostnames for their address when creating a service of type LB. This is an issue because it is not possible to create a single record that has both hostnames and IP addresses. In some DNS providers this can be worked around by using CNAMEs; however, some providers do not allow multiple entries with the same DNS FQDN (Azure, for example). Due to this limitation and the lack of consistency between providers, this controller can only work across providers that use the same underlying format for their addresses. Mostly this is an issue when combining EKS and any other provider in the same space. If you are using EKS, it's better to use the native Route 53 integration.
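The mixed-address problem can be sketched as a simple check over a list of load balancer addresses. This is an illustration only, not logic taken from the controller: the function names are hypothetical, the IPv4 test is a loose pattern match, and IPv6 is not handled.

```shell
#!/usr/bin/env bash
# Hypothetical check: decide which record type a list of load balancer
# addresses would need, and flag lists that mix IPs with hostnames.
target_type() {
  # IPv4-looking literal -> A record; anything else is treated as a hostname.
  if printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$'; then
    echo "A"
  else
    echo "CNAME"
  fi
}

record_type_for() {
  local first other target
  first="$(target_type "$1")"
  for target in "$@"; do
    other="$(target_type "$target")"
    if [ "$other" != "$first" ]; then
      echo "MIXED"   # IPs and hostnames cannot share one record
      return 1
    fi
  done
  echo "$first"
}

record_type_for 203.0.113.10 203.0.113.11
record_type_for 203.0.113.10 abc123.elb.amazonaws.com || echo "cannot publish as one record"
```

The first call reports a single A record. The second mixes an IP with an ELB-style hostname, which is the situation that arises when EKS is combined with another provider in the same space.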