feat: Use CAPI for provisioning clusters #424

Merged 338 commits on Sep 26, 2023

maciaszczykm (Member) commented on Aug 1, 2023

Summary

Labels

Test Plan

Checklist

  • If required, I have updated the Plural documentation accordingly.
  • I have added tests to cover my changes.
  • I have added a meaningful title and summary to convey the impact of this PR to a user.
  • I have added relevant labels to this PR to help with categorization for release notes.

maciaszczykm and others added 30 commits July 20, 2023 13:57
zreigz added the enhancement label and removed the breaking-change and roadmap labels on Sep 26, 2023
maciaszczykm added the breaking-change and roadmap labels on Sep 26, 2023
zreigz merged commit afbd9e6 into main on Sep 26, 2023
10 of 16 checks passed
maciaszczykm deleted the cluster-api-simple-test branch on September 26, 2023 at 07:23
michaeljguarino added a commit that referenced this pull request on Aug 28, 2024
* Bump cluster-api-migration

* bump migrator

* Bump cluster-api-migration

* bump migrator

* bump migrator

* bump migrator

* Update GCP migration config

* optimize imports

* Remove --cluster-api flag

* update google bootstrap flags

* Fix deploy logic

* bump migrator

* update destroy bootstrap flags for google provider

* Check if cluster exists

* update destroy steps

* Fix deploy

* Add logging

* Add missing new line

* Fix log types

* Add client ID and secret to init survey

* remove cluster resources during destroy

* Fix wait command

* Remove plural clusters watch command

* Run go mod tidy

* Fix unit tests

* Print step numbers for bootstrap and migration

* Remove plural cluster watch command and some unused code

* Remove build step and update descriptions for CAPI deploy

* Refactor deploy and migration steps

* Refactor destroy steps

* Add destroy logs

* Refactor

* Move CAPI related logic from cmd to pkg

* Extract common code

* Move checks

* Fix minor import issue

* Cleanup

* Remove unused flag
Remove duplicated command

* Minor improvements

* Add TODO

* add post install step

* Update cluster readiness check

* Fix merge conflicts

* Update migration configuration for gcp

* Export execute steps function

* Refactor

* Refactor migration

* Add tests for common functions

* Improve GCP preflight checks

* Add tests for migration functions

* Update messaging

* add kind provider

* Raise destroy timeout

* Refactor cilium.go

* Fix resource group and storage account name validation

* Add command to check if chart is installed

* save kubeconfig

* add kind configuration

* fix kind configuration

* fix docker destroy

* normalize kind

* update e2e test

* update github action

* bump kind action

* create bootstrap namespace

* create bootstrap namespace

* add extra debug

* do not run migrate when cluster already migrated

* read sa email from credentials file

* add vendor dir to gitignore

* fix import cycle

* add PLURAL_DISABLE_MP_TABLE_VIEW env for machine pools view

* remove bootstrap operator dependencies

* cilium update

* refactor

* split e2e tests

* change name

* Refactor e2e workflows

* distinguish between regular and cluster api

* distinguish between regular and cluster api - fix

* distinguish between regular and cluster api - improvement

* distinguish between regular and cluster api - improvement

* distinguish between regular and cluster api - improvement

* add e2e test for cluster api

* enable list view for destroy

* add e2e test to check installed packages

* fix linter

* Update github.com/gin-gonic/gin to avoid CVE

* Bump dependencies

* Read Go version from go.mod in CI

* Bump dependencies

* improve error handling for deploy/destroy cluster

* e2e update machine pool

* Refactor storage account code

* Fixes

* Fix unit tests

* Fix kind delete

* remove role permissions check for gcp SA and use local CLI ADC for migration and bootstrapping

* fetch AvailabilityZones

* fix unit test

* Use Microsoft Graph SDK to create service principal and get client ID and secret

* set bootstrapMode flag for gcp during the bootstrap phase

* fix fetching zones

* Add proper role assignment to Azure service principal

* fix execute not showing error and add workaround for tf value templating issue

* Minor improvements

* read gcp credentials from adc file

* Fix client ID

* change migrate to run deploy at the end and run gcp in bootstrapMode during migrate

* update gcp permissions check

* set azure bootstrap mode flag

Signed-off-by: David van der Spek <[email protected]>

* Add commit flag at the end of running migrate (#436)

It's very likely that a large number of users will forget to manage their git state, so this removes that possibility.

* do not use bootstrap mode for the gcp migration

* improve gcp permissions check messaging

* Fix typo

* Enable OIDC issuer for Azure clusters

* add some todo comments

Signed-off-by: David van der Spek <[email protected]>

* Create temporary service principal with password during deploy and destroy

* Refactor

* e2e update machine pool version

* Fix destroy

* update bootstrap step building logic

* add plural build-values REPO

* init bubbletea tui

* revert bootstrap step changes

* Resolve Helm issue

* Extract methods from bootstrap steps

* Fix destroy

* Modify aws auth configmap manually to solve migration chicken-egg (#437)

* Modify aws auth configmap manually to solve migration chicken-egg

This lets us modify the aws-auth ConfigMap for EKS from the client in a reusable way, which should help resolve some migration-time issues.

* add to migrate steps

* Add secret list and create funcs

* Add kube initializer with context

* add feature flag for CAPI stuff

* fix build

* Add kube initializer with context

* set aws credentials

* cleanup build values command

* use dynamic credentials for GCP without storing them on the repo

* lint fix

* Refactor

* Rename file

* allow overriding enable field of helm modules

* Fix var name

* Simplify migration

* Restore uninstall azure-identity package step

* update gcp permissions check name

* fix nil pointer error when listing uninstalled package

* improve fetching AZs

* bump migrator version (#440)

Signed-off-by: David van der Spek <[email protected]>

* fix gcp provider name

* remove credentials

* Properly normalize Google -> GCP provider name and add migration step to update google provider name to gcp

* update go.sum

* make genmock

* Fix executor println (#443)

This was always saying "actionName <app>" instead of the passed action name.

* bump migrator

* small refactor

* Bump migrator version

* fix null replacement

* Deprecate values.yaml migration

* bump migrator

* Fix Azure destroy after migration

* Refactor step filtering

* Fix Azure identity bug

* add posthog feature call timeout and fix caching

* cleanup some steps

* Switch google to gcp during init

* Update messaging for GCP

* bump migrator

* update go.sum

* fix linters

* ci: ensure docker buildx removes the running nodes (#448)

Signed-off-by: David van der Spek <[email protected]>

* Add semver validation for required bootstrap tf/helm modules on migration (#445)

Performing a migration now has some requirements tied to our helm/tf modules. This will at least guarantee they're installed at migrate time.

* remove default values from migration values.yaml

* go mod tidy

* update AZs during migration

* disable external-dns and plural-certmanager-webhook

* Do not delete bootstrap cluster on failed deploy

* fix disabling plural-certmanager-webhook

Signed-off-by: David van der Spek <[email protected]>

* also disable external dns on gcp and azure

Signed-off-by: David van der Spek <[email protected]>

* Update step handling

* Add retry mechanism

* Fix step numbering

* Fix unit tests

* Further improvements

* Use map to store provider tags

* add move state backup and restore to capi deploy

* Further improvements

* Fix OIDC issuer step

* Fix typo

* add initial step confirm support

* move capi backup to .plural dir and add multi-cluster backup support

* Remove tui package

* add conditional recovery steps when cluster issues are detected
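
The executor println fix (#443) listed above describes a message that always printed the literal text "actionName" instead of the action passed in. A minimal Go sketch of that class of bug follows. The function names `describeBuggy` and `describeFixed` and the sample values are hypothetical illustrations, not code taken from this repository.

```go
package main

import "fmt"

// describeBuggy shows the mistake in its simplest form (hypothetical helper):
// the word "actionName" is part of the format string, so the actionName
// argument is never used in the output.
func describeBuggy(actionName, app string) string {
	return fmt.Sprintf("actionName %s", app)
}

// describeFixed passes the action name as a formatting argument instead.
func describeFixed(actionName, app string) string {
	return fmt.Sprintf("%s %s", actionName, app)
}

func main() {
	// "deploy" and "console" are made-up example values.
	fmt.Println(describeBuggy("deploy", "console"))
	fmt.Println(describeFixed("deploy", "console"))
}
```

The compiler does not flag this kind of mistake because unused function parameters are legal in Go, which is why it can survive review unnoticed.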

---------

Signed-off-by: David van der Spek <[email protected]>
Co-authored-by: Lukasz Zajaczkowski <[email protected]>
Co-authored-by: Sebastian Florek <[email protected]>
Co-authored-by: David van der Spek <[email protected]>
Co-authored-by: michaeljguarino <[email protected]>
Co-authored-by: David van der Spek <[email protected]>
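
The semver validation entry (#445) in the commit message above says the migration requires certain bootstrap tf/helm modules to be installed at a sufficient version. One way such a check could look is sketched below, assuming plain `major.minor.patch` versions with an optional `v` prefix. The helpers `parseVersion` and `meetsMinimum` and the version strings are hypothetical; the actual implementation in the PR may use a dedicated semver library and handle pre-release tags, which this sketch does not.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits "v1.2.3" or "1.2.3" into three integers.
// Pre-release and build metadata are deliberately not supported here.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(v, "v"), ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, fmt.Errorf("invalid version %q", v)
		}
		out[i] = n
	}
	return out, nil
}

// meetsMinimum reports whether installed >= required, comparing
// major, then minor, then patch.
func meetsMinimum(installed, required string) (bool, error) {
	a, err := parseVersion(installed)
	if err != nil {
		return false, err
	}
	b, err := parseVersion(required)
	if err != nil {
		return false, err
	}
	for i := 0; i < 3; i++ {
		if a[i] != b[i] {
			return a[i] > b[i], nil
		}
	}
	return true, nil // versions are equal
}

func main() {
	// Made-up versions for an installed module and its required minimum.
	ok, err := meetsMinimum("v0.2.0", "0.1.9")
	fmt.Println(ok, err)
}
```

Returning an error for malformed input, rather than treating it as "too old", keeps a typo in a version string from being reported as a missing upgrade.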
Labels: breaking-change, enhancement, roadmap
6 participants