This repository was archived by the owner on Jan 9, 2020. It is now read-only.

Upstreaming and pull request strategy for Spark on Kubernetes #441

Open
erikerlandson opened this issue Aug 17, 2017 · 9 comments

Comments

@erikerlandson
Member

We'll want a plan for presenting this project as PRs that are as digestible as possible for the upstream Apache Spark reviewers. We can use this issue to formulate a strategy.

@erikerlandson
Member Author

The resource staging server (RSS) seems like one good candidate for a PR, to be presented after the kube-spark core. If so, RSS-related config might need to be temporarily disabled for the initial core PR.

@erikerlandson
Member Author

The Hadoop repo seems like another.

@erikerlandson
Member Author

Shuffle service / dynamic allocation?

@foxish
Member

foxish commented Sep 2, 2017

The candidate set of distinct and separable components, and a potential order in which we could upstream them, is:

  1. Test infrastructure (minikube on jenkins)
  2. Scheduler Backend + Submission Client
  3. Dynamic allocation
  4. Resource staging server
  5. Kerberos support

(2) is the most complex of the lot. It mostly makes sense together, but we could break it down further if we absolutely must.

@foxish
Member

foxish commented Sep 6, 2017

Following up on the discussion today in the meeting, the questions we need to answer are:

Test infrastructure

  • Discovery - What's the right place to run our integration tests? AMPLab Jenkins? Externally? - @ssuchter
  • Current resource requirements & running tests in parallel. - @mccheah
  • Can we run tests on an external k8s cluster (is this okay from the upstream perspective?) @ssuchter/@foxish to follow up.

Tag 2.2 branch

  • Merge fixes
  • Cut a bug-fix release and announce. (@kimoonkim)

Scheduler Backend + Submission Client

  • Steps
    • Submission orchestrator
    • Submission steps
    • Scheduler Backend
    • (@mccheah to assess whether this makes sense for now, or if we need more of a split)

cc @felixcheung

@mccheah

mccheah commented Sep 7, 2017

It's also good to document the changes that we want to include upstream but haven't yet merged into branch-2.2-kubernetes, including:

Any other ones I'm missing?

@foxish
Member

foxish commented Sep 20, 2017

@mccheah all of those have merged, I think.

For the 2.3 code-freeze, this seems to me like a reasonable (maybe slightly ambitious) target -
I expect we can speed up after the first PR.

Thoughts? If we all agree, I'll post this on the JIRA.

@erikerlandson
Member Author

Per the 9/20 SIG meeting, move Docker images up to just after PR2. Also, note that PR3 (submission steps) will be partly a component of PR2.

In general, I'd expect that specific submission steps would be added along with whatever components they're associated with. The basic submission step architecture would land with PR2.
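
As a rough illustration of that architecture (a minimal sketch; all class and method names here are hypothetical, not the actual Spark classes), each submission step could be a small transformation over the driver pod spec, and the orchestrator simply chains whichever steps apply:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical mini-model of the step-based submission architecture.
// DriverSpec stands in for the Kubernetes driver pod spec being built up.
class DriverSpec {
    final Map<String, String> labels = new LinkedHashMap<>();
    final List<String> volumes = new ArrayList<>();
}

// Each step applies one concern (labels, volumes, credentials, ...) to the spec.
interface DriverConfigurationStep {
    DriverSpec configure(DriverSpec spec);
}

class BaseLabelsStep implements DriverConfigurationStep {
    public DriverSpec configure(DriverSpec spec) {
        spec.labels.put("spark-role", "driver");
        return spec;
    }
}

class MountConfVolumeStep implements DriverConfigurationStep {
    public DriverSpec configure(DriverSpec spec) {
        spec.volumes.add("spark-conf-volume");
        return spec;
    }
}

// The orchestrator decides which steps apply and runs them in order, so a
// later PR can contribute new steps without touching the core submission flow.
public class Orchestrator {
    public static DriverSpec run(List<DriverConfigurationStep> steps) {
        DriverSpec spec = new DriverSpec();
        for (DriverConfigurationStep step : steps) {
            spec = step.configure(spec);
        }
        return spec;
    }

    public static void main(String[] args) {
        DriverSpec spec = run(List.of(new BaseLabelsStep(), new MountConfVolumeStep()));
        System.out.println(spec.labels + " " + spec.volumes);
    }
}
```

This is why splitting steps across PRs is workable: a component PR ships its step plus a one-line addition to the orchestrator's step list.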

@felixcheung

Re: #441 (comment)
How about SparkR support? PR #507

asfgit pushed a commit to apache/spark that referenced this issue Nov 29, 2017
## What changes were proposed in this pull request?

This is a stripped down version of the `KubernetesClusterSchedulerBackend` for Spark with the following components:
- Static Allocation of Executors
- Executor Pod Factory
- Executor Recovery Semantics

It's step 1 from the step-wise plan documented [here](apache-spark-on-k8s#441 (comment)).
This addition is covered by the [SPIP vote](http://apache-spark-developers-list.1001551.n3.nabble.com/SPIP-Spark-on-Kubernetes-td22147.html) which passed on Aug 31.
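
A loose sketch of what "static allocation" plus an "executor pod factory" amounts to (all names hypothetical; the real backend creates pods through the Kubernetes API server): the scheduler backend asks the factory for a fixed number of executor pods at startup and re-requests any that are lost:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: a factory builds executor "pods" (just names here),
// and a backend with static allocation keeps a fixed target count alive.
class ExecutorPodFactory {
    private int counter = 0;

    String createExecutorPod(String appId) {
        counter++;
        return appId + "-exec-" + counter;
    }
}

public class StaticAllocationBackend {
    private final ExecutorPodFactory factory = new ExecutorPodFactory();
    private final List<String> runningExecutors = new ArrayList<>();
    private final String appId;
    private final int targetExecutors;

    StaticAllocationBackend(String appId, int targetExecutors) {
        this.appId = appId;
        this.targetExecutors = targetExecutors;
    }

    // Static allocation: bring the pool up to the fixed target count.
    void reconcile() {
        while (runningExecutors.size() < targetExecutors) {
            runningExecutors.add(factory.createExecutorPod(appId));
        }
    }

    // Recovery semantics: a lost executor is replaced by a fresh pod.
    void onExecutorLost(String pod) {
        runningExecutors.remove(pod);
        reconcile();
    }

    List<String> running() {
        return runningExecutors;
    }

    public static void main(String[] args) {
        StaticAllocationBackend backend = new StaticAllocationBackend("spark-app", 3);
        backend.reconcile();
        backend.onExecutorLost(backend.running().get(0));
        System.out.println(backend.running().size()); // still 3 after recovery
    }
}
```

Dynamic allocation (step 3 in the plan above) would replace the fixed target with one that grows and shrinks with the workload, which is why it can land as a separate PR.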

## How was this patch tested?

- The patch contains unit tests which are passing.
- Manual testing: `./build/mvn -Pkubernetes clean package` succeeded.
- It is a **subset** of the entire changelist hosted in http://github.com/apache-spark-on-k8s/spark which is in active use in several organizations.
- There is integration testing enabled in the fork currently [hosted by PepperData](spark-k8s-jenkins.pepperdata.org:8080) which is being moved over to RiseLAB CI.
- Detailed documentation on trying out the patch in its entirety is in: https://apache-spark-on-k8s.github.io/userdocs/running-on-kubernetes.html

cc rxin felixcheung mateiz (shepherd)
k8s-big-data SIG members & contributors: mccheah ash211 ssuchter varunkatta kimoonkim erikerlandson liyinan926 tnachen ifilonenko

Author: Yinan Li <[email protected]>
Author: foxish <[email protected]>
Author: mcheah <[email protected]>

Closes #19468 from foxish/spark-kubernetes-3.
asfgit pushed a commit to apache/spark that referenced this issue Dec 11, 2017
This PR contains implementation of the basic submission client for the cluster mode of Spark on Kubernetes. It's step 2 from the step-wise plan documented [here](apache-spark-on-k8s#441 (comment)).
This addition is covered by the [SPIP](http://apache-spark-developers-list.1001551.n3.nabble.com/SPIP-Spark-on-Kubernetes-td22147.html) vote which passed on Aug 31.

This PR and #19468 together form an MVP of Spark on Kubernetes that allows users to run Spark applications that use resources locally within the driver and executor containers on Kubernetes 1.6 and up. Some changes to the pom and build/test setup are copied over from #19468 to make this PR self-contained and testable.

The submission client is mainly responsible for creating the Kubernetes pod that runs the Spark driver. It follows a step-based approach to construct the driver pod, as the code under the `submit.steps` package shows. The steps are orchestrated by `DriverConfigurationStepsOrchestrator`. `Client` creates the driver pod and waits for the application to complete if it's configured to do so, which is the case by default.

This PR also contains Dockerfiles of the driver and executor images. They are included because some of the environment variables set in the code would not make sense without referring to the Dockerfiles.

* The patch contains unit tests which are passing.
* Manual testing: `./build/mvn -Pkubernetes clean package` succeeded.
* It is a subset of the entire changelist hosted at http://github.com/apache-spark-on-k8s/spark which is in active use in several organizations.
* There is integration testing enabled in the fork currently hosted by PepperData which is being moved over to RiseLAB CI.
* Detailed documentation on trying out the patch in its entirety is in: https://apache-spark-on-k8s.github.io/userdocs/running-on-kubernetes.html

cc rxin felixcheung mateiz (shepherd)
k8s-big-data SIG members & contributors: mccheah foxish ash211 ssuchter varunkatta kimoonkim erikerlandson tnachen ifilonenko liyinan926

Author: Yinan Li <[email protected]>

Closes #19717 from liyinan926/spark-kubernetes-4.