Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] add support support for VMC on AWS #491

Closed
wants to merge 3 commits into from

Conversation

yastij
Copy link
Member

@yastij yastij commented Aug 7, 2019

What this PR does / why we need it: this brings support of VMC on AWS to capv and enhances our HA support when running VMC

Which issue(s) this PR fixes: Fixes #468

Special notes for your reviewer:

what is still missing:

  • updating the manifest generation to include the AWS credentials secret (secret object + volumeMount for the manager)
  • add an e2e test coverage for this

Release note:

* Support HA clusters on VMware Cloud (VMC) on AWS

@k8s-ci-robot
Copy link
Contributor

Welcome @yastij!

It looks like this is your first PR to kubernetes-sigs/cluster-api-provider-vsphere 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/cluster-api-provider-vsphere has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 7, 2019
@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Aug 7, 2019
@yastij
Copy link
Member Author

yastij commented Aug 7, 2019

/assign @andrewsykim @akutz

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: yastij
To complete the pull request process, please assign akutz
You can assign the PR to them by writing /assign @akutz in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@yastij
Copy link
Member Author

yastij commented Aug 7, 2019

commits should be reviewable separately

// VmwareCloudSpec describes the supported cloud providers
type VmwareCloudSpec struct {
// AwsProvider specifies the information needed to run VMC on AWS
AwsProvider *AwsProviderSpec `json:"awsProviderSpec"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i could be the only me think that name of AwsProviderSpec is confusing, because there is VsphereProviderSpec, but those two are not on the same level, AwsProviderSpec is a sub-configure spec, i would like to hear how others think, whether name like AwsSpec make sense, thanks.

@@ -77,6 +77,27 @@ type VsphereClusterProviderSpec struct {
// ClusterConfiguration holds the cluster-wide information used during a
// kubeadm init call.
ClusterConfiguration kubeadmv1beta1.ClusterConfiguration `json:"clusterConfiguration,omitempty"`

// VmwareCloud describes the supported cloud providers to run VMC
VmwareCloud *VmwareCloudSpec `json:"vmwareCloud,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will be nice to consist with VmwareCloud-> VMwareCloud

@akutz
Copy link
Contributor

akutz commented Aug 7, 2019

Hi @yastij,

I haven't had time for a full review, but I'd like to see the PR's types be refactored to align with PR #471 -- where the types are in a new package, similar to https://github.com/akutz/cluster-api-provider-vsphere/tree/feature/cloud-provider-config/pkg/apis/vsphere/v1alpha1/cloud. You can then just deepcopygen the new package. For example:

./apis/vpshere/v1alpha1/cloud
|
|-- vmc
     |
     |-- aws

What are your thoughts on this? I think keeping the root API folder from getting cluttered would be a good thing, but maybe I'm missing something here.

@yastij
Copy link
Member Author

yastij commented Aug 8, 2019

@akutz - I'm fine with this

@yastij
Copy link
Member Author

yastij commented Aug 8, 2019

/retest

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 9, 2019
yastij added 3 commits August 12, 2019 14:53
Signed-off-by: Yassine TIJANI <[email protected]>
Signed-off-by: Yassine TIJANI <[email protected]>
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 12, 2019
@yastij
Copy link
Member Author

yastij commented Aug 12, 2019

@akutz - rebased and updated types location/deepcopygen, commits should be reviewable independently

@yastij
Copy link
Member Author

yastij commented Aug 12, 2019

/retest

@k8s-ci-robot
Copy link
Contributor

@yastij: The following test failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-cluster-api-provider-vsphere-e2e 2a4b3d4 link /test pull-cluster-api-provider-vsphere-e2e

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@moshloop
Copy link

/hold
I am opposed to merging this PR in its current design.

VMC on AWS is not the primary use case for CAPV, or even a use case at all. People who are running in AWS will use CAPA, not CAPV on VMC.

CAPV's primary use case is running inside a non-public cloud on vSphere and that is what it should be designed to support.

This needs to be redesigned to cater for arbitrary load balancers via plugin or extension and/or support multiple load balancer types natively of which F5 is the most common.

/cc @thebsdbox

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 12, 2019
@k8s-ci-robot
Copy link
Contributor

@moshloop: GitHub didn't allow me to request PR reviews from the following users: thebsdbox.

Note that only kubernetes-sigs members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/hold
I am opposed to merging this PR in its current design.

VMC on AWS is not the primary use case for CAPV, or even a use case at all. People who are running in AWS will use CAPA, not CAPV on VMC.

CAPV's primary use case is running inside a non-public cloud on vSphere and that is what it should be designed to support.

This needs to be redesigned to cater for arbitrary load balancers via plugin or extension and/or support multiple load balancer types natively of which F5 is the most common.

/cc @thebsdbox

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@moshloop
Copy link

@thebsdbox

@yastij
Copy link
Member Author

yastij commented Aug 12, 2019

VMC on AWS is not the primary use case for CAPV, or even a use case at all. People who are running in AWS will use CAPA, not CAPV on VMC.

How does CAPA support provisioning machines against vsphere ?

CAPV's primary use case is running inside a non-public cloud on vSphere and that is what it should be designed to support.

Is there any specifics that makes you think that we should not use CAPV to provision Kubernetes clusters on VMC ?

This needs to be redesigned to cater for arbitrary load balancers via plugin or extension and/or support multiple load balancer types natively of which F5 is the most common.

I'm not against getting the API right, the plugin mechanism can get be done once we start extending the support for various load balancers

@akutz
Copy link
Contributor

akutz commented Aug 12, 2019

VMC on AWS is not the primary use case for CAPV, or even a use case at all.

I'm not sure I understand. VMC on AWS is vanilla vSphere with access to a few additional features. Datacenters are able to leverage VMC on AWS as "hosted on-prem".

People who are running in AWS will use CAPA, not CAPV on VMC.

Running in AWS has no relationship to running in VMC on AWS. VMC on AWS is vSphere on bare-metal servers in AWS cages.

This needs to be redesigned to cater for arbitrary load balancers via plugin or extension and/or support multiple load balancer types natively of which F5 is the most common.

The difference is that VMC on AWS is a service offered by VMware and thus has a relationship to CAPV, whereas F5 does not.

VMware also funds the test infrastructure used to validate CAPV -- test infrastructure that happens to run on VMC on AWS. This provides a funded test bed with SLA-guaranteed uptime against which this PR can be developed and tested.

Not to mention this PR finally enables e2e tests against CAPV-deployed clusters in an HA configuration.

Finally, this PR will actually simplify the test workflow for CAPV as jobs running on Google's Prow clusters will be able to bootstrap Kubernetes clusters with CAPV directly from the Prow cluster, eliminating the need for a jump host.

As for a general LB solution for HA clusters, I'm not opposed to a more pluggable framework -- sk8 supports LVS and VMC ELB for HA and external access. However, this PR is low-hanging fruit and solves several problems today while still able to be refactored into a pluggable model tomorrow.

To that end, I disagree that this PR is out-of-scope, and in fact I commend @yastij for taking the time to work on this. I am not going to remove the hold, but that will be my recommendation to the community.

@timothysc
Copy link
Member

VMC on AWS is not the primary use case for CAPV, or even a use case at all. People who are running in AWS will use CAPA, not CAPV on VMC. CAPV's primary use case is running inside a non-public cloud on vSphere and that is what it should be designed to support. This needs to be redesigned to cater for arbitrary load balancers via plugin or extension and/or support multiple load balancer types natively of which F5 is the most common.

So there is definitely a use case for CAPV on VMC or else the folks here wouldn't be doing it.

This needs to be redesigned to cater for arbitrary load balancers via plugin or extension and/or support multiple load balancer types natively of which F5 is the most common.

I think that's a fine idea to bring your own LB(F5) and we should probably enable the workflow. Is there an issue open for it? That being said, I don't think the VMC case should be conflated unless it weirdly affects the api, but TBH the PR was so large it was difficult to sort through quickly.

@moshloop
Copy link

As discussed in the CAPV meeting yesterday:

  • The mutually agreed use-case for VMC ELB today is to eliminate the need for a bastion host and enable HA testing
  • This PR will have some preliminary work at the API / data model to make the load balancer concept more abstract
  • This PR can be merged once ready to enable testing of control plane HA functionality with the understanding that it may be reverted/refactored as part of the longer-term issue

@yastij @akutz - Please correct me if I missed/misunderstood anything?

@akutz
Copy link
Contributor

akutz commented Aug 22, 2019

Hi @yastij,

Let's hold this for post-v1a2. I have another item, #514, that has higher priority and needs someone to take it if @codenrhoden would like to divide up some of the issues currently assigned to him.

@yastij
Copy link
Member Author

yastij commented Aug 22, 2019

@akutz - let me know, if you need help with something for v1a2 migration

@akutz
Copy link
Contributor

akutz commented Aug 27, 2019

Hi @yastij,

What is the status of this PR?

@yastij
Copy link
Member Author

yastij commented Aug 27, 2019

I'm migrating this to v1a2, this should be done by the end of the week

@yastij
Copy link
Member Author

yastij commented Aug 29, 2019

closing this in favor of #538

@yastij yastij closed this Aug 29, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add support for VMC
7 participants