Commit: Create/Update README.md files
xwang2713 committed Sep 18, 2019
1 parent 2dc3874 commit 98d681a
Showing 27 changed files with 587 additions and 448 deletions.
175 changes: 94 additions & 81 deletions Deployment/dp-1/README.md
@@ -1,97 +1,110 @@
# Deploy HPCC Systems Cluster with Deployment

This is a simple stateless deployment scenario. It can be used both locally and on a real cloud, such as AWS.

## Prerequisites
- Bootstrap
AWS:
```console
bin/bootstrap-aws.sh
```
Local:
```console
bin/bootstrap-local.sh
```
## Deploy HPCC Systems Cluster
```console
./start
```
To make sure they are up:
```console
kubectl get pods

NAME READY STATUS RESTARTS AGE
efs-provisioner-57965c4946-7w4b5 1/1 Running 0 2d16h
esp-esp1-69b59769bd-94gm4 1/1 Running 0 16s
hpcc-admin 1/1 Running 0 19s
roxie-roxie1-64d49d76cf-b28gh 1/1 Running 0 15s
support-778c8ffbb-p44t7 1/1 Running 0 17s
thor-thor1-75bb466cbf-skqj5 1/1 Running 0 13s
thormaster-thor1 1/1 Running 0 14s
```
The cluster should be automatically configured and started.
To verify the status:
```console
bin/cluster_run.sh status
Status of esp-esp1-69b59769bd-94gm4:
mydafilesrv ( pid 981 ) is running ...
esp1 ( pid 1175 ) is running ...

Status of roxie-roxie1-64d49d76cf-b28gh:
mydafilesrv ( pid 969 ) is running ...
roxie1 ( pid 1168 ) is running ...

Status of support-778c8ffbb-p44t7:
mydafilesrv ( pid 1006 ) is running ...
mydali ( pid 1200 ) is running ...
mydfuserver ( pid 1413 ) is running ...
myeclagent ( pid 1629 ) is running ...
myeclccserver ( pid 1832 ) is running ...
myeclscheduler ( pid 2049 ) is running ...
mysasha ( pid 2255 ) is running ...

Status of thor-thor1-75bb466cbf-skqj5:
mydafilesrv ( pid 962 ) is running ...

Status of thormaster-thor1:
mydafilesrv ( pid 969 ) is running ...
thor1 ( pid 1214 ) is running with 1 slave process(es) ...
```

## Scale up/down
The original roxie-roxie1 cluster has 1 instance. To increase it to 2 instances:
```console
kubectl scale --replicas 2 Deployment/roxie-roxie1

kubectl get pods
NAME READY STATUS RESTARTS AGE
efs-provisioner-57965c4946-7w4b5 1/1 Running 0 2d16h
esp-esp1-69b59769bd-94gm4 1/1 Running 0 3m15s
hpcc-admin 1/1 Running 0 3m18s
roxie-roxie1-64d49d76cf-b28gh 1/1 Running 0 3m14s
roxie-roxie1-64d49d76cf-mflqn 1/1 Running 0 11s
support-778c8ffbb-p44t7 1/1 Running 0 3m16s
thor-thor1-75bb466cbf-skqj5 1/1 Running 0 3m12s
thormaster-thor1 1/1 Running 0 3m13s
```
To scale it back:
```console
kubectl scale --replicas 1 Deployment/roxie-roxie1
```
## Auto-scaling
A sample autoscaling yaml file is provided. You can modify and apply it:
```console
kubectl apply -f esp-e1-hpa.yaml
```
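The contents of esp-e1-hpa.yaml are not shown in this commit; a minimal HorizontalPodAutoscaler targeting the esp-esp1 Deployment might look like the following sketch (names, replica bounds, and the CPU threshold are assumptions, not the file's actual contents):

```yaml
# Hypothetical sketch of esp-e1-hpa.yaml; adjust names and limits to your cluster
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: esp-esp1
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: esp-esp1
  minReplicas: 1
  maxReplicas: 4
  targetCPUUtilizationPercentage: 80
```

Note the esp containers must declare CPU resource requests for CPU-based autoscaling to work.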
Increase the esp Pod CPU load, for example by running a big loop, and monitor the auto-scaling.
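One quick way to generate load, assuming the esp container image provides a shell (the pod name is taken from the listing above; yours will differ):

```console
# Peg one CPU inside the esp pod with a busy loop
kubectl exec esp-esp1-69b59769bd-94gm4 -- sh -c 'yes > /dev/null &'

# Watch the autoscaler react (Ctrl-C to stop)
kubectl get hpa -w
```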

To disable auto-scaling:
```console
kubectl delete -f esp-e1-hpa.yaml
```

## Stop/Start Cluster
stop
```console
bin/cluster_run.sh stop
```
start
```console
bin/cluster_run.sh start
```

Get status
```console
bin/cluster_run.sh status
```

## Delete Cluster
```console
./stop
```
File renamed without changes.
1 change: 1 addition & 0 deletions Deployment/ebs/ebs-1/README.md
@@ -1,3 +1,4 @@
# Deployment with EBS
Only one PersistentVolumeClaim is created per Deployment yaml file.

Each PersistentVolumeClaim must be provided to a Pod explicitly. A volume cannot be created and attached to a scaled-up Pod dynamically. Use a StatefulSet instead, unless there is some method we are not aware of.
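For contrast, a StatefulSet can declare volumeClaimTemplates so each scaled-up replica gets its own PersistentVolumeClaim created automatically; a minimal sketch (names, image, and sizes are illustrative, not from this repository's yaml files):

```yaml
# Hypothetical sketch: per-replica EBS-backed claims via volumeClaimTemplates
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: roxie
spec:
  serviceName: roxie
  replicas: 2
  selector:
    matchLabels:
      app: roxie
  template:
    metadata:
      labels:
        app: roxie
    spec:
      containers:
      - name: roxie
        image: hpccsystems/platform-core
        volumeMounts:
        - name: data
          mountPath: /var/lib/HPCCSystems
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      resources:
        requests:
          storage: 10Gi
```

Scaling this StatefulSet up creates a new claim (data-roxie-2, data-roxie-3, ...) for each new replica, which is exactly what a Deployment with a single fixed claim cannot do.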
174 changes: 174 additions & 0 deletions Deployment/efs/efs-1/README.md
@@ -0,0 +1,174 @@
# Deploy Dali/Sasha/DropZone/Roxie/Thor Pods as Deployment/EFS

Generally this is the preferred way to deploy a cluster with EFS, since the typical share mode is ReadWriteMany.

The current deployment has Sasha/DropZone in the support Pod.

Even though EFS performance may not be as good as EBS, EFS is very convenient:
- No need to worry about crossing AZs
- Easy to share and re-use data
- No need to worry about deleting the volume after deleting a Pod

EFS is a little more expensive than EBS.

## Performance
To do: compare EFS and EBS.

## Prerequisites
- Bootstrap
```console
bin/bootstrap-aws.sh
```
- Start the NFS server, in efs/:
```console
./apply.sh
```
apply.sh applies rbac.yaml and manifest.yaml.
To display NFS pod:
```console
kubectl get pods
NAME READY STATUS RESTARTS AGE
efs-provisioner-57965c4946-7w4b5 1/1 Running 0 2d15h
```
To display PV and PVC:
```console
kubectl get pv
NAME CAPACITY ACCESS MODES RECLAIM POLICY STATUS CLAIM STORAGECLASS REASON AGE
pvc-bdf95dd2-d820-11e9-87ee-0e00576dcdfc 1Mi RWX Delete Bound default/efs aws-efs 2d15h

kubectl get pvc
NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE
efs Bound pvc-bdf95dd2-d820-11e9-87ee-0e00576dcdfc 1Mi RWX aws-efs 2d15h
```

The Volume Claim name is "efs". The storage class is "aws-efs".
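Pods can then share storage through a claim against this class. An illustrative claim matching the output above (the repository's actual yaml may differ; the beta annotation form is an assumption based on the efs-provisioner era):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: efs
  annotations:
    volume.beta.kubernetes.io/storage-class: "aws-efs"
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Mi
```

Because the access mode is ReadWriteMany, every Pod in the cluster can mount this one claim simultaneously.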

## Deploy HPCC Systems Cluster
```console
./start
```
To make sure they are up:
```console
kubectl get pods
NAME READY STATUS RESTARTS AGE
dali 1/1 Running 0 51s
efs-provisioner-57965c4946-7w4b5 1/1 Running 0 2d15h
esp-esp1-f5bc48677-znlsv 1/1 Running 0 49s
hpcc-admin 1/1 Running 0 52s
roxie-roxie1-84f9578895-fbpld 1/1 Running 0 47s
roxie-roxie1-84f9578895-gpf6x 1/1 Running 0 47s
roxie-roxie2-6cf55ffd45-nf2mp 1/1 Running 0 46s
support-db468c5c9-8kqzd 1/1 Running 0 50s
thor-thor1-59876665f5-67p4p 1/1 Running 0 44s
thor-thor1-59876665f5-nn7wc 1/1 Running 0 44s
thormaster-thor1 1/1 Running 0 45s
```

The cluster should be automatically configured and started.
To verify the status:
```console
bin/cluster_run.sh status
Status of dali:
mydafilesrv ( pid 972 ) is running ...
mydali ( pid 1166 ) is running ...

Status of esp-esp1-f5bc48677-znlsv:
mydafilesrv ( pid 991 ) is running ...
esp1 ( pid 1185 ) is running ...

Status of roxie-roxie1-84f9578895-fbpld:
mydafilesrv ( pid 978 ) is running ...
roxie1 ( pid 1177 ) is running ...

Status of roxie-roxie1-84f9578895-gpf6x:
mydafilesrv ( pid 978 ) is running ...
roxie1 ( pid 1177 ) is running ...

Status of roxie-roxie2-6cf55ffd45-nf2mp:
mydafilesrv ( pid 979 ) is running ...
roxie2 ( pid 1178 ) is running ...

Status of support-db468c5c9-8kqzd:
mydafilesrv ( pid 1010 ) is running ...
mydfuserver ( pid 1204 ) is running ...
myeclagent ( pid 1413 ) is running ...
myeclccserver ( pid 1633 ) is running ...
myeclscheduler ( pid 1851 ) is running ...
mysasha ( pid 2059 ) is running ...

Status of thor-thor1-59876665f5-67p4p:
mydafilesrv ( pid 972 ) is running ...

Status of thor-thor1-59876665f5-nn7wc:
mydafilesrv ( pid 972 ) is running ...

Status of thormaster-thor1:
mydafilesrv ( pid 978 ) is running ...
thor1 ( pid 1243 ) is running with 2 slave process(es) ...

```


## Access ECLWatch
Get the esp public IP:
```console
kubectl get service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ew-esp1 LoadBalancer 10.100.248.99 a0e9629f2da3811e9b1b40aa0a5b6276-2050531242.us-east-1.elb.amazonaws.com 8010:30108/TCP 3m52s

```
ECLWatch URL: http://a0e9629f2da3811e9b1b40aa0a5b6276-2050531242.us-east-1.elb.amazonaws.com:8010
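The URL is just the service's external hostname plus the first mapped port. For example, extracting it from the listing above with awk (the sample line is copied from that output):

```shell
# Build the ECLWatch URL from the sample `kubectl get service` line above
line='ew-esp1 LoadBalancer 10.100.248.99 a0e9629f2da3811e9b1b40aa0a5b6276-2050531242.us-east-1.elb.amazonaws.com 8010:30108/TCP 3m52s'
echo "$line" | awk '{split($5, p, ":"); print "http://" $4 ":" p[1]}'
# prints http://a0e9629f2da3811e9b1b40aa0a5b6276-2050531242.us-east-1.elb.amazonaws.com:8010
```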

## Scale up/down
The original roxie-roxie1 cluster has 2 instances. To increase it to 6 instances:
```console
kubectl scale --replicas 6 Deployment/roxie-roxie1

kubectl get pods
NAME READY STATUS RESTARTS AGE
dali 1/1 Running 0 6m10s
efs-provisioner-57965c4946-7w4b5 1/1 Running 0 2d15h
esp-esp1-f5bc48677-znlsv 1/1 Running 0 6m8s
hpcc-admin 1/1 Running 0 6m11s
roxie-roxie1-84f9578895-7p4qz 1/1 Running 0 29s
roxie-roxie1-84f9578895-8tpxp 1/1 Running 0 29s
roxie-roxie1-84f9578895-fbpld 1/1 Running 0 6m6s
roxie-roxie1-84f9578895-gpf6x 1/1 Running 0 6m6s
roxie-roxie1-84f9578895-tjlp9 1/1 Running 0 29s
roxie-roxie1-84f9578895-v62dj 1/1 Running 0 29s
roxie-roxie2-6cf55ffd45-nf2mp 1/1 Running 0 6m5s
support-db468c5c9-8kqzd 1/1 Running 0 6m9s
thor-thor1-59876665f5-67p4p 1/1 Running 0 6m3s
thor-thor1-59876665f5-nn7wc 1/1 Running 0 6m3s
thormaster-thor1 1/1 Running 0 6m4s

```
To scale it back:
```console
kubectl scale --replicas 2 Deployment/roxie-roxie1
```


## Stop/Start Cluster
stop
```console
bin/cluster_run.sh stop
```
start
```console
bin/cluster_run.sh start
```

Get status
```console
bin/cluster_run.sh status

```

## Delete Cluster
```console
./stop
```
This does not delete the volumes. Either use the AWS CLI or go to the EC2 console to delete them.
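A hedged sketch with the AWS CLI, assuming it is installed and configured: list EBS volumes left in the available (unattached) state so they can be reviewed, then delete one by id (the volume id shown is a placeholder).

```console
# List unattached volumes (review carefully before deleting!)
aws ec2 describe-volumes --filters Name=status,Values=available

# Delete a specific volume by id
aws ec2 delete-volume --volume-id vol-0123456789abcdef0
```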


