
Cluster Autoscaler / Cluster API Integration #609

Closed
DirectXMan12 opened this issue Aug 1, 2018 · 43 comments
Labels: kind/feature, sig/autoscaling, stage/alpha, tracked/no

Comments

@DirectXMan12
Contributor

DirectXMan12 commented Aug 1, 2018

Feature Description

  • One-line feature description (can be used as a release note): Convert the cluster autoscaler to make use of the cluster API for controlling node creation/deletion.
  • Primary contact (assignee): @enxebre
  • Responsible SIGs: SIG Autoscaling
  • Design proposal link (community repo):
  • Link to e2e and/or unit tests:
  • Reviewer(s) - (for LGTM) recommend having 2+ reviewers (at least one from code-area OWNERS file) agreed to review. Reviewers from multiple companies preferred:
  • Approver (likely from SIG/area to which feature belongs):
  • Feature target (which target equals to which milestone):
    • Alpha release target (x.y)
    • Beta release target (x.y)
    • Stable release target (x.y)
@DirectXMan12
Contributor Author

DirectXMan12 commented Aug 1, 2018

@kubernetes/sig-autoscaling-feature-requests it's not clear to me whether or not we need to precisely track this with the Kubernetes release, since the cluster autoscaler doesn't always release at the exact same cadence as Kubernetes proper (EDIT: @MaciekPytel correctly pointed out that the releases do match -- I think I was confused by some minor releases), but I wanted to get this filed just in case, and so that we can easily track it.

cc @derekwaynecarr

See also https://github.com/kubernetes/autoscaler/releases

I'm ultimately willing to sponsor this, but I'd like to get exact names from the CA subproject for reviewers/approvers (we discussed this a bit at the last meeting and there was approval to begin looking in this direction).

@k8s-ci-robot added the sig/autoscaling and kind/feature labels Aug 1, 2018
@MaciekPytel

Actually, Cluster Autoscaler minor releases match Kubernetes releases 1-1. Cluster Autoscaler 1.4 will go out with k8s 1.12, 1.5 with 1.13, etc.

From Cluster Autoscaler side the approver should probably be @mwielgus and reviewers @mwielgus and me.

We probably need an approver/reviewer from the Cluster API side as well. As discussed at the SIG meeting, integration with CA would require changes to Cluster API (most importantly the ability to delete a specific node, as opposed to just resizing a MachineSet; more changes would be required for additional features like scale-to-0, but those are not critical for the initial implementation).
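As a rough sketch of what that "delete a specific node" capability could look like (the package name, annotation key, and client wiring are illustrative assumptions, not settled Cluster API behavior at the time of this comment): flag one Machine for removal, then shrink the owning MachineSet so the scale-down takes out exactly that node.

```go
package capiscaledown // illustrative package name

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/runtime/schema"
	"k8s.io/apimachinery/pkg/types"
	"k8s.io/client-go/dynamic"
)

// Group/version follow the early cluster.k8s.io/v1alpha1 API; newer
// Cluster API releases use the cluster.x-k8s.io group instead.
var (
	machineGVR    = schema.GroupVersionResource{Group: "cluster.k8s.io", Version: "v1alpha1", Resource: "machines"}
	machineSetGVR = schema.GroupVersionResource{Group: "cluster.k8s.io", Version: "v1alpha1", Resource: "machinesets"}
)

// DeleteSpecificNode marks one Machine for removal and then lowers the owning
// MachineSet's replica count, so the controller deletes exactly that Machine
// rather than an arbitrary one. The annotation key here is hypothetical.
func DeleteSpecificNode(ctx context.Context, c dynamic.Interface, ns, machine, machineSet string, newReplicas int64) error {
	// 1. Flag the Machine we want the MachineSet controller to remove first.
	mark := []byte(`{"metadata":{"annotations":{"cluster.k8s.io/delete-machine":"yes"}}}`)
	if _, err := c.Resource(machineGVR).Namespace(ns).Patch(ctx, machine, types.MergePatchType, mark, metav1.PatchOptions{}); err != nil {
		return fmt.Errorf("marking machine %s for deletion: %w", machine, err)
	}

	// 2. Scale the MachineSet down; the flagged Machine is the one removed.
	scale := []byte(fmt.Sprintf(`{"spec":{"replicas":%d}}`, newReplicas))
	_, err := c.Resource(machineSetGVR).Namespace(ns).Patch(ctx, machineSet, types.MergePatchType, scale, metav1.PatchOptions{})
	return err
}
```

Cluster API did later adopt a delete-machine annotation along these lines; the point of the sketch is only the two-step flow the autoscaler needs.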

@justaugustus added this to the v1.12 milestone Aug 4, 2018
@justaugustus
Member

Thanks for the update. I've added this to the 1.12 tracking sheet.

/assign @enxebre @DirectXMan12
/stage alpha
cc: @kacole2 @wadadli @robertsandoval @rajendar38

@k8s-ci-robot added the stage/alpha label Aug 4, 2018
@k8s-ci-robot
Contributor

@justaugustus: GitHub didn't allow me to assign the following users: enxebre.

Note that only kubernetes members and repo collaborators can be assigned.
For more information please see the contributor guide

In response to this:

Thanks for the update. I've added this to the 1.12 tracking sheet.

/assign @enxebre @DirectXMan12
/stage alpha
cc: @kacole2 @wadadli @robertsandoval @rajendar38

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@justaugustus added the tracked/yes label Aug 4, 2018
@zparnold
Member

Hey there @DirectXMan12! I'm the Docs wrangler for this release. Is there any chance I could have you open up a docs PR against the release-1.12 branch as a placeholder? That gives us more confidence in the feature shipping in this release and gives me something to work with when we start doing reviews/edits. Thanks! If this feature does not require docs, could you please update the features tracking spreadsheet to reflect that?

@DirectXMan12
Contributor Author

@enxebre is the right person to ask, as the primary contact

@zparnold
Member

Thanks Solly! @enxebre Could you let me know what the docs status is?

@derekwaynecarr
Member

The feature is still under design/development and will have to track post 1.12.

@justaugustus
Member

justaugustus commented Aug 29, 2018

@derekwaynecarr -- thanks for the update. Pulling this from the 1.12 milestone.

@justaugustus added the tracked/no label and removed the tracked/yes label Aug 29, 2018
@justaugustus removed this from the v1.12 milestone Aug 29, 2018
@ingvagabund
Contributor

WIP proposal: kubernetes/community#2653
Definitely not the final version. Still collecting and documenting all the findings and ideas.

@vikaschoudhary16

cc @vikaschoudhary16

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Dec 13, 2018
@aleksandra-malinowska

/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Dec 13, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Mar 13, 2019
@kron4eg

kron4eg commented Apr 4, 2019

/remove-lifecycle stale

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Dec 31, 2019
@mitchellmaler

/remove-lifecycle stale

The cluster autoscaler and cluster API integration still needs to be completed. It seems like this was initially looked at with a sponsor but then dropped off. Maybe this needs a bit more planning, since it has been skipped over each release.

@k8s-ci-robot removed the lifecycle/stale label Dec 31, 2019
@kikisdeliveryservice
Member

Hey there @mitchellmaler @MaciekPytel -- 1.18 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating to [alpha|beta|stable] in 1.18?
The current release schedule is:
Monday, January 6th - Release Cycle Begins
Tuesday, January 28th EOD PST - Enhancements Freeze
Thursday, March 5th, EOD PST - Code Freeze
Monday, March 16th - Docs must be completed and reviewed
Tuesday, March 24th - Kubernetes 1.18.0 Released
To be included in the release, this enhancement must have a merged KEP in the implementable status. The KEP must also have graduation criteria and a Test Plan defined.
If you would like to include this enhancement, once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍
We'll be tracking enhancements here: http://bit.ly/k8s-1-18-enhancements
Thanks!

@kikisdeliveryservice
Member

As a reminder @mitchellmaler @MaciekPytel :

Tuesday, January 28th EOD PST - Enhancements Freeze

Enhancements Freeze is in 7 days. If you seek inclusion in 1.18 please update as requested above.

Thanks!

@elmiko
Contributor

elmiko commented Feb 13, 2020

hi, i brought up this issue at the last cluster-api meeting as i would like to help drive it forward. i am still coming up to speed on the progress, but my understanding is there is a documentation effort that still needs to be addressed?

any updates are greatly appreciated =)

@mooperd

mooperd commented Feb 29, 2020

+1

@elmiko
Contributor

elmiko commented Apr 6, 2020

just wanted to add an update here: the initial work to integrate cluster-api into the autoscaler has been completed. see kubernetes/autoscaler#1866

we are now working to add more unit test coverage and improve the end-to-end tests. we also have plans for code improvements and clean ups, as well as landing a few early bug fixes.

@kikisdeliveryservice
Member

Hi @DirectXMan12 @mitchellmaler @MaciekPytel,

1.19 Enhancements shadow here. I wanted to check in and see if you think this Enhancement will be graduating in 1.19?

In order to have this be part of the release:

The KEP PR must be merged in an implementable state
The KEP must have test plans
The KEP must have graduation criteria.

The current release schedule is:

Monday, April 13: Week 1 - Release cycle begins
Tuesday, May 19: Week 6 - Enhancements Freeze
Thursday, June 25: Week 11 - Code Freeze
Thursday, July 9: Week 14 - Docs must be completed and reviewed
Tuesday, August 4: Week 17 - Kubernetes v1.19.0 released

Please let me know and I'll add it to the 1.19 tracking sheet (http://bit.ly/k8s-1-19-enhancements). Once coding begins please list all relevant k/k PRs in this issue so they can be tracked properly. 👍

Thanks!

@kikisdeliveryservice
Member

As a reminder, enhancements freeze is tomorrow, May 19th EOD PST. In order to be included in 1.19, all KEPs must be implementable with graduation criteria and a test plan.

Thanks.

@elmiko
Contributor

elmiko commented May 18, 2020

@mwielgus @MaciekPytel i'm curious if there is anything i need to do here?

i'm not sure if we have a written plan for the testing portion of this; we do have unit tests in place and i have a plan to improve e2e around this. should i record this information somewhere?

@kikisdeliveryservice
Member

Unfortunately the deadline for the 1.19 Enhancement freeze has passed. For now this is being removed from the milestone and 1.19 tracking sheet. If there is a need to get this in, please file an enhancement exception.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label Aug 18, 2020
@elmiko
Contributor

elmiko commented Aug 18, 2020

the initial integration has happened and we are now working on improvements. happy to take any action to help close this issue.
/remove-lifecycle stale

@k8s-ci-robot removed the lifecycle/stale label Aug 18, 2020
@kikisdeliveryservice
Member

Hi @mwielgus @MaciekPytel

Enhancements Lead here. Any plans for this in 1.20?

Thanks,
Kirsten

@kikisdeliveryservice
Member

Hi @mwielgus @MaciekPytel

Following up: 1.20 Enhancements Freeze is October 6th. Could you let us know if you have plans for 1.20? I don't see a KEP linked.

To be included in the milestone:
The KEP must be merged in an implementable state
The KEP must have test plans
The KEP must have graduation criteria

Best,
Kirsten

@rptaylor

@elmiko is there a KEP written up for this?

@elmiko
Contributor

elmiko commented Sep 29, 2020

@rptaylor i don't think there is. i was introduced to this topic from this issue and the work we've done to integrate capi/autoscaler. should we write a kep to help close out this issue?

@MaciekPytel

I don't think we should keep track of this effort here at all. This is a feature of Cluster Autoscaler and not Kubernetes. While Cluster Autoscaler generally tracks Kubernetes releases, it doesn't follow the same release process or release schedule. All past proposals/designs for Cluster Autoscaler were discussed via issues in the kubernetes/autoscaler repo and the designs were merged to https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler/proposals.

Finally, provider integration with CA is just a matter of implementing an interface defined by Cluster Autoscaler. We haven't required any kep-like document prior to implementing any other provider integration and I don't think there is any particular need to do so (unless the integration would require some changes in Cluster Autoscaler itself).
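For reference, the interface meant here lives under cluster-autoscaler/cloudprovider; a trimmed, approximate sketch of its shape (the upstream definitions carry more methods) is below. A Cluster API provider would back IncreaseSize by raising MachineSet/MachineDeployment replicas and DeleteNodes by removing the corresponding Machines.

```go
package cloudprovider

import apiv1 "k8s.io/api/core/v1"

// CloudProvider is (roughly) the entry point a provider such as Cluster API
// implements so the autoscaler can discover and resize its node groups.
// Abbreviated sketch; the real interface has additional methods.
type CloudProvider interface {
	Name() string
	NodeGroups() []NodeGroup
	NodeGroupForNode(node *apiv1.Node) (NodeGroup, error)
}

// NodeGroup corresponds to one resizable set of machines -- for Cluster API,
// a MachineSet or MachineDeployment.
type NodeGroup interface {
	Id() string
	MinSize() int
	MaxSize() int
	TargetSize() (int, error)
	IncreaseSize(delta int) error          // scale up by delta nodes
	DeleteNodes(nodes []*apiv1.Node) error // remove these specific nodes
}
```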

I think we should just close this issue.

@elmiko
Contributor

elmiko commented Sep 29, 2020

Finally, provider integration with CA is just a matter of implementing an interface defined by Cluster Autoscaler. We haven't required any kep-like document prior to implementing any other provider integration and I don't think there is any particular need to do so (unless the integration would require some changes in Cluster Autoscaler itself).

I think we should just close this issue.

+1

@kikisdeliveryservice
Member

Any update on closing this issue?

@elmiko
Contributor

elmiko commented Oct 7, 2020

it seems like we have agreement about closing this issue.
/close

@k8s-ci-robot
Contributor

@elmiko: Closing this issue.

In response to this:

it seems like we have agreement about closing this issue.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
