Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Bring your own control plane ELB #2787

Merged

Conversation

dlipovetsky
Copy link
Contributor

@dlipovetsky dlipovetsky commented Sep 22, 2021

What type of PR is this?
/kind feature

What this PR does / why we need it:
Lets the user define the control plane ELB name, so that the AWSCluster controller can use an existing ELB.

If the ELB exists and does not have a tag that says it is owned by the cluster, then the controller treats the ELB as unmanaged: it does not reconcile the ELB's attributes, tags, or attached subnets.

If the ELB does not exist, the controller behaves as has in the past: It creates the ELB, marks the ELB as owned by the cluster, and treats it as managed: it reconciles the ELB's attributes, tags, and attached subnets.

If the user does not provide a name, the controller generates a name from the cluster name and writes it to the AWSCluster Spec.

Something else to note: If the user defines the name, but the ELB does not exist, then the controller creates an ELB with the user-defined name, and treats it as managed.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #2774

Special notes for your reviewer:
I have tried to keep changes to a minimum, but despite that, the PR is large. I find it easier to review the commits one-by-one.

Checklist:

  • squashed commits
  • includes documentation
  • adds unit tests
  • adds or updates e2e tests

Release note:

Adds support for using an existing Classic ELB as the control plane load balancer. This is an advanced feature.

@k8s-ci-robot
Copy link
Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-priority size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Sep 22, 2021
@dlipovetsky
Copy link
Contributor Author

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Sep 22, 2021
@dlipovetsky
Copy link
Contributor Author

/retest

@dlipovetsky
Copy link
Contributor Author

/retest

@dlipovetsky dlipovetsky changed the title User defined elb name feat: Bring your own control plane ELB Sep 23, 2021
api/v1alpha3/conversion.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 24, 2021
api/v1alpha3/conversion.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 28, 2021
Copy link
Member

@richardcase richardcase left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! A few minor comments

api/v1alpha4/awsloadbalancerspec.go Outdated Show resolved Hide resolved
pkg/cloud/scope/cluster.go Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 29, 2021
@dlipovetsky
Copy link
Contributor Author

Another point of discussion:

CAPA has to be able to update the ELB's instances. I think it's also ok the let CAPA make other updates to the ELB.

However, as with other bring-your-own infrastructure, CAPA should not delete the ELB.

@dlipovetsky
Copy link
Contributor Author

In light of @richardcase's comments, I'd like to change the approach here: The AWSCluster controller should generate the ELB name and store it in the Spec. I considered this earlier, but found it awkward to implement with the existing ReconcileLoadbalancers abstraction.

@richardcase
Copy link
Member

In light of @richardcase's comments, I'd like to change the approach here: The AWSCluster controller should generate the ELB name and store it in the Spec. I considered this earlier, but found it awkward to implement with the existing ReconcileLoadbalancers abstraction.

This would be a similar approach to what we have taken elsewhere, i think this would be good.

@dlipovetsky
Copy link
Contributor Author

dlipovetsky commented Oct 4, 2021

I've identified a new problem: The AWSCluster controller deletes the control plane ELB when the it deletes the cluster resources. It should not delete a user-created control-plane ELB. The controller needs to know whether the ELB is user-created or CAPA-created.

If the new Name field is defined if and only if the ELB is user-created, the controller can infer that the ELB is user-created from the field. If, to allow future changes to the way we generate the name, we write the generated name to this field, then the controller can no longer infer whether the ELB is user-created.

I think we'll need to store the generated name and user-defined name separately, or add a boolean that indicates the name is generated. Maybe I can repeat the approach we used with the VPC: store the ID, but determine whether it's CAPA-created by reading the tags.

@richardcase
Copy link
Member

The controller needs to know whether the ELB is user-created or CAPA-create

I think we should be able to use the owned tag?

@dlipovetsky dlipovetsky force-pushed the user-defined-elb-name branch from 76d4b1d to cdd9ac6 Compare October 8, 2021 00:01
@richardcase
Copy link
Member

Good job on fixing the e2e failures so quick 🥇

I have no further comments on this change and the e2e are passing (here and here), so from my part:

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 29, 2021
For the API server ELB, the RegisterInstanceWithAPIServerELB function is
used.
Previously, the function did not populate the API server ELB tags. The
tags are now necessary to decide whether the API server ELB is managed
or not.
The function was already specialized to list ELBs owned by the AWS cloud provider;
the name is changed to reflect this.
Make the name reflect what the function does. Also add an explanatory comment.
@dlipovetsky
Copy link
Contributor Author

Unfortunately, I just got another merge conflict, I think with #2900. It's a bit silly, because it appears to be just whitespace.

I'm rebasing on main and force-pushing.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Nov 2, 2021
@dlipovetsky dlipovetsky force-pushed the user-defined-elb-name branch from b7185b9 to 138bcac Compare November 2, 2021 17:38
@k8s-ci-robot k8s-ci-robot removed lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Nov 2, 2021
@dlipovetsky
Copy link
Contributor Author

Re-running because I resolved the merge conflict and force-pushed:

/test pull-cluster-api-provider-aws-e2e
/test pull-cluster-api-provider-aws-e2e-eks

@randomvariable
Copy link
Member

/hold
for when the tests pass

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 2, 2021
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 2, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: randomvariable

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 2, 2021
@k8s-ci-robot
Copy link
Contributor

@dlipovetsky: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
pull-cluster-api-provider-aws-e2e 138bcac link false /test pull-cluster-api-provider-aws-e2e

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@randomvariable
Copy link
Member

randomvariable commented Nov 2, 2021

GPU test is flaking #2905

@randomvariable
Copy link
Member

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 2, 2021
@k8s-ci-robot k8s-ci-robot merged commit 461ef72 into kubernetes-sigs:main Nov 2, 2021
@k8s-ci-robot k8s-ci-robot modified the milestones: v1.1.0, v1.x Nov 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/networking Issues or PRs related to networking cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bring your own control plane ELB
6 participants