
spark-operator


{ConfigMap|CRD}-based approach for managing Spark clusters in Kubernetes and OpenShift.

This operator uses the abstract-operator library.

Watch the full asciicast

How does it work

UML diagram

Quick Start

Run the spark-operator deployment:

kubectl apply -f manifest/operator.yaml

Create a new cluster from the prepared example:

kubectl apply -f examples/cluster.yaml
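
The example manifest is a plain ConfigMap carrying the cluster configuration. As a rough sketch of its content, assuming the same format as the Very Quick Start example below (the file in the repository is the authoritative version):

apiVersion: v1
kind: ConfigMap
metadata:
  name: my-spark-cluster
  labels:
    radanalytics.io/kind: sparkcluster
data:
  config: |-
    worker:
      replicas: "2"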

After issuing the commands above, you should be able to see a new Spark cluster running in the current namespace.

kubectl get pods
NAME                               READY     STATUS    RESTARTS   AGE
my-spark-cluster-m-5kjtj           1/1       Running   0          10s
my-spark-cluster-w-m8knz           1/1       Running   0          10s
my-spark-cluster-w-vg9k2           1/1       Running   0          10s
spark-operator-510388731-852b2     1/1       Running   0          27s

Once you no longer need the cluster, you can delete it by deleting its ConfigMap:

kubectl delete cm my-spark-cluster

Very Quick Start

# create operator
kubectl apply -f http://bit.ly/sparkop

# create cluster
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-cluster
  labels:
    radanalytics.io/kind: sparkcluster
data:
  config: |-
    worker:
      replicas: "2"
EOF
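
Because the operator watches ConfigMaps labeled radanalytics.io/kind: sparkcluster, a running cluster can be reconfigured by updating its ConfigMap. A minimal sketch, assuming the operator reconciles on ConfigMap updates:

# scale the cluster from 2 to 3 workers
cat <<EOF | kubectl apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: my-cluster
  labels:
    radanalytics.io/kind: sparkcluster
data:
  config: |-
    worker:
      replicas: "3"
EOF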

OpenShift

For deployment on OpenShift use the same commands as above, but with oc instead of kubectl.
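
For example:

# same manifests, oc instead of kubectl
oc apply -f manifest/operator.yaml
oc apply -f examples/cluster.yaml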

Custom Resource Definitions (CRD)

This operator can also work with CRDs. Assuming the admin user is logged in, you can install the operator with:

kubectl apply -f manifest/operator-crd.yaml

and then create Spark clusters by creating custom resources (CRs):

kubectl apply -f examples/cluster-cr.yaml
kubectl get sparkclusters
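
For reference, a cluster CR might look like the following. This is a minimal sketch assuming the CRD registers the SparkCluster kind under the radanalytics.io API group; see examples/cluster-cr.yaml for the actual definition:

apiVersion: radanalytics.io/v1
kind: SparkCluster
metadata:
  name: my-spark-cluster
spec:
  worker:
    replicas: "2"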

Images

Image name        Description                             Registries
:latest-released  represents the latest released version  quay.io, docker.io
:latest           represents the master branch
:x.y.z            one particular released version

For each variant, an image with the -alpine suffix, based on Alpine Linux, is also available.
