Create/Update README.md files

xwang2713 · Sep 18, 2019 · 98d681a
1 parent 2dc3874
commit 98d681a

Showing 27 changed files with 587 additions and 448 deletions.
diff --git a/Deployment/dp-1/README.md b/Deployment/dp-1/README.md
@@ -1,97 +1,110 @@
-# HPCC-Kubernetes
-
-## Deploy a HPCC Cluster with Kubernetes Deployment
-In Kubernetes a [Deployment](https://kubernetes.io/docs/concepts/workloads/controllers/deployment/) is responsible for replicating sets of identical pods.  Like a _Service_ it has a selector query which identifies the members of it's set.  Unlike a _Service_ it also has a desired number of replicas, and it will create or delete _Pods_ to ensure that the number of _Pods_ matches up with it's desired state.
-
-Make sure bin/bootstrap.[sh|bat] started first
-
-```sh
+# Deploy HPCC Systems Cluster with Deployment
+
+This is a simple stateless deployment scenario. It can be used both locally and on a real cloud such as AWS.
+
+## Prerequisites
+- Bootstrap
+  AWS:
+  ```console
+  bin/bootstrap-aws.sh
+  ```
+  Local:
+  ```console
+  bin/bootstrap-local.sh
+  ```
+## Deploy HPCC Systems Cluster
+```console
 ./start
 ```
-To verify the thor and roxie are ready:
-```sh
+To make sure the Pods are up:
+```console
 kubectl get pods
-
-NAME                     READY     STATUS    RESTARTS   AGE
-esp-controller-bbgqu      1/1       Running   0         3m
-esp-controller-wc8ae      1/1       Running   0         3m
-roxie-controller-hmvo5    1/1       Running   0         3m
-roxie-controller-x7ksh    1/1       Running   0         3m
-thor-controller-2sbe5     1/1       Running   0         3m
-thor-controller-p1q7f     1/1       Running   0         3m
+NAME                               READY   STATUS    RESTARTS   AGE
+efs-provisioner-57965c4946-7w4b5   1/1     Running   0          2d16h
+esp-esp1-69b59769bd-94gm4          1/1     Running   0          16s
+hpcc-admin                         1/1     Running   0          19s
+roxie-roxie1-64d49d76cf-b28gh      1/1     Running   0          15s
+support-778c8ffbb-p44t7            1/1     Running   0          17s
+thor-thor1-75bb466cbf-skqj5        1/1     Running   0          13s
+thormaster-thor1                   1/1     Running   0          14s
 ```
-To start master instance:
-```sh
-kubectl create -f master-controller.yaml
+The cluster should be configured and started automatically.
+To verify the status:
+```console
+bin/cluster_run.sh status
+Status of esp-esp1-69b59769bd-94gm4:
+mydafilesrv     ( pid      981 ) is running ...
+esp1            ( pid     1175 ) is running ...
+
+Status of roxie-roxie1-64d49d76cf-b28gh:
+mydafilesrv     ( pid      969 ) is running ...
+roxie1          ( pid     1168 ) is running ...
+
+Status of support-778c8ffbb-p44t7:
+mydafilesrv     ( pid     1006 ) is running ...
+mydali          ( pid     1200 ) is running ...
+mydfuserver     ( pid     1413 ) is running ...
+myeclagent      ( pid     1629 ) is running ...
+myeclccserver   ( pid     1832 ) is running ...
+myeclscheduler  ( pid     2049 ) is running ...
+mysasha         ( pid     2255 ) is running ...
+
+Status of thor-thor1-75bb466cbf-skqj5:
+mydafilesrv     ( pid      962 ) is running ...
+
+Status of thormaster-thor1:
+mydafilesrv     ( pid      969 ) is running ...
+thor1           ( pid     1214 ) is running with 1 slave process(es) ...
+```
+
+## Scale up/down
+The original roxie-roxie1 cluster has 1 instance. To increase it to 2 instances:
+```console
+kubectl scale --replicas 2 Deployment/roxie-roxie1
+kubectl get pods
+NAME                               READY   STATUS    RESTARTS   AGE
+efs-provisioner-57965c4946-7w4b5   1/1     Running   0          2d16h
+esp-esp1-69b59769bd-94gm4          1/1     Running   0          3m15s
+hpcc-admin                         1/1     Running   0          3m18s
+roxie-roxie1-64d49d76cf-b28gh      1/1     Running   0          3m14s
+roxie-roxie1-64d49d76cf-mflqn      1/1     Running   0          11s
+support-778c8ffbb-p44t7            1/1     Running   0          3m16s
+thor-thor1-75bb466cbf-skqj5        1/1     Running   0          3m12s
+thormaster-thor1                   1/1     Running   0          3m13s
 ```
-Make sure it is up and ready:
-```sh
-kubectl get rc master-controller
-NAME                DESIRED   CURRENT   AGE
-master-controller   1         1         12h
-
-kubectl get pods
-NAME                      READY     STATUS    RESTARTS   AGE
-esp-controller-bbgqu      1/1       Running   0          5m
-esp-controller-wc8ae      1/1       Running   0          5m
-master-controller-wa5z8   1/1       Running   0          5m
-roxie-controller-hmvo5    1/1       Running   0          5m
-roxie-controller-x7ksh    1/1       Running   0          5m
-thor-controller-2sbe5     1/1       Running   0          5m
-thor-controller-p1q7f     1/1       Running   0          5m
-
-
-### Access ECLWatch and Verify the cluster
-Get mastr ip:
-```sh
-kubectl get pod master-controller-ar6jn -o json | grep podIP
-        "podIP": "172.17.0.5",
+To scale it back:
+```console
+kubectl scale --replicas 1 Deployment/roxie-roxie1
 ```
-If everything run OK you should access ECLWatch to verify the configuration: ```http://172.17.0.5:8010```. Again if you can't access the private ip you can try to tunnel it above described in deploy single HPCC instance.
 
-If something go wrong you can access the master instance:
-```sh
-kubectl exec master-controller-ar6jn -i -t -- bash -il
+## Auto-scaling
+A sample autoscaling YAML file is provided. You can modify it and apply it:
+```console
+kubectl apply -f esp-e1-hpa.yaml
 ```
-configuration scripts, log ile and outputs are under /tmp/
+Increase the CPU load on the esp Pod, for example by running a big loop, and monitor the auto-scaling.
 
-
-### Start a load balancer on esp
-When deploy Kubernetes on a cloud such as AWS you can create load balancer for esp
-```sh
-kubectl create -f esp-service.yaml
-```
-Make sure the service is up
-```sh
-kubectl get service
-NAME         CLUSTER-IP    EXTERNAL-IP        PORT(S)    AGE
-esp          10.0.21.220   a2c49b2864c79...   8001/TCP   3h
-kubernetes   10.0.0.1      <none>             443/TCP    3d
+To disable auto-scaling:
+```console
+kubectl delete -f esp-e1-hpa.yaml
 ```
 
-The "EXTERNAL-IP" is too long.
-```sh
-kubectl get service -o json | grep a2c49b2864c79
-"hostname": "a2c49b2864c7911e6ab6506c30bb0563-401114081.eu-west-1.elb.amazonaws.com"
+## Stop/Start Cluster
+Stop:
+```console
+bin/cluster_run.sh stop
 ```
-2c49b2864c7911e6ab6506c30bb0563-401114081.eu-west-1.elb.amazonaws.com" and we define the port 8001. so 2c49b2864c7911e6ab6506c30bb0563-401114081.eu-west-1.elb.amazonaws.com:8001 should display eclwatch
-
-
-
-
-### Scale thor and roxie replicated pods
-For example, to add one more thor and make total 3 thor slaves:
-```sh
-kubectl scale rc thor --replicas=3
+Start:
+```console
+bin/cluster_run.sh start
 ```
 
-```Note```: we need more tests on this area, particularly need restart /tmp/run_master.sh to allow re-collect pod ips, generate new environment.xml and stop/start HPCC cluster.
+Get status:
+```console
+bin/cluster_run.sh status
 
-### Stop and delete HPCC cluster
-```sh
-kubectl delete -f esp-service.yaml
-kubectl delete -f thor-controller.yaml
-kubectl delete -f roxie-controller.yaml
-kubectl delete -f esp-controller.yaml
-kubectl delete -f master-controller.yaml
 ```
+
+## Delete Cluster
+```console
+./stop
+```
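Note on the scale up/down steps in the README above: they can be wrapped in a small helper that scales, waits for the rollout, and lists the Pods. This is only a minimal sketch, assuming the roxie Pods are managed by a Deployment named `roxie-roxie1` (as the scale-back command indicates). The function name `scale_and_wait` and the `KUBECTL` override variable are hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository.
# KUBECTL can be overridden (for example KUBECTL=echo) to get a dry run.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: scale_and_wait <replicas> [deployment-name]
scale_and_wait() {
    replicas="$1"
    deployment="${2:-roxie-roxie1}"   # default name taken from the README

    # Ask Kubernetes for the desired number of replicas.
    "$KUBECTL" scale --replicas "$replicas" "Deployment/$deployment" || return 1

    # Block until the rollout has converged or the timeout expires.
    "$KUBECTL" rollout status "Deployment/$deployment" --timeout=120s || return 1

    # List the Pods so the new instances can be inspected.
    "$KUBECTL" get pods
}
```

For example, `scale_and_wait 2` would scale up and `scale_and_wait 1` would scale back. Setting `KUBECTL=echo` first prints the commands instead of running them.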
diff --git a/Deployment/dp-1/esp-e1-autoscale.yaml b/Deployment/dp-1/esp-e1-hpa.yaml
diff --git a/Deployment/ebs/ebs-1/README.md b/Deployment/ebs/ebs-1/README.md
@@ -1,3 +1,4 @@
 # Deployment with EBS
 Only one PersistentVolumeClaim created per Deployment yaml file
 
+Each PersistentVolumeClaim must be provided explicitly in the Pod. A volume can't be dynamically created and attached to a scaled-up Pod automatically. Use a StatefulSet instead, unless there are methods we are not aware of.
diff --git a/Deployment/efs/efs-1/README.md b/Deployment/efs/efs-1/README.md
@@ -0,0 +1,174 @@
+# Deploy Dali/Sasha/DropZone/Roxie/Thor Pods as Deployment/EFS
+
+Generally this is the preferred way to deploy a cluster with EFS, since the typical access mode is ReadWriteMany.
+
+The current deployment runs Sasha/DropZone in the support Pod.
+
+Although EFS performance may not be as good as EBS, EFS is very convenient:
+- No need to worry about crossing availability zones (AZs)
+- Easy to share and re-use data
+- No need to worry about deleting the volume after deleting a Pod
+
+EFS is a little more expensive than EBS.
+
+## Performance
+To do: compare EFS and EBS performance.
+
+## Prerequisites
+- Bootstrap
+  ```console
+  bin/bootstrap-aws.sh
+  ```
+- Start the NFS server
+  In efs/:
+  ```console
+  ./apply.sh
+  ```
+  apply.sh applies rbac.yaml and manifest.yaml.
+  To display the NFS pod:
+  ```console
+  kubectl get pods
+  NAME                               READY   STATUS    RESTARTS   AGE
+  efs-provisioner-57965c4946-7w4b5   1/1     Running   0          2d15h
+  ```
+  To display PV and PVC:
+  ```console
+  kubectl get pv
+  NAME                                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS   CLAIM         STORAGECLASS   REASON   AGE
+  pvc-bdf95dd2-d820-11e9-87ee-0e00576dcdfc   1Mi        RWX            Delete           Bound    default/efs   aws-efs                 2d15h
+
+  kubectl get pvc
+  NAME   STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
+  efs    Bound    pvc-bdf95dd2-d820-11e9-87ee-0e00576dcdfc   1Mi        RWX            aws-efs        2d15h
+  ```
+  The Volume Claim name is "efs". The storage class is "aws-efs".
+
+## Deploy HPCC Systems Cluster
+```console
+./start
+```
+To make sure the Pods are up:
+```console
+kubectl get pods
+NAME                               READY   STATUS    RESTARTS   AGE
+dali                               1/1     Running   0          51s
+efs-provisioner-57965c4946-7w4b5   1/1     Running   0          2d15h
+esp-esp1-f5bc48677-znlsv           1/1     Running   0          49s
+hpcc-admin                         1/1     Running   0          52s
+roxie-roxie1-84f9578895-fbpld      1/1     Running   0          47s
+roxie-roxie1-84f9578895-gpf6x      1/1     Running   0          47s
+roxie-roxie2-6cf55ffd45-nf2mp      1/1     Running   0          46s
+support-db468c5c9-8kqzd            1/1     Running   0          50s
+thor-thor1-59876665f5-67p4p        1/1     Running   0          44s
+thor-thor1-59876665f5-nn7wc        1/1     Running   0          44s
+thormaster-thor1                   1/1     Running   0          45s
+```
+
+The cluster should be configured and started automatically.
+To verify the status:
+```console
+bin/cluster_run.sh status
+Status of dali:
+mydafilesrv     ( pid      972 ) is running ...
+mydali          ( pid     1166 ) is running ...
+
+Status of esp-esp1-f5bc48677-znlsv:
+mydafilesrv     ( pid      991 ) is running ...
+esp1            ( pid     1185 ) is running ...
+
+Status of roxie-roxie1-84f9578895-fbpld:
+mydafilesrv     ( pid      978 ) is running ...
+roxie1          ( pid     1177 ) is running ...
+
+Status of roxie-roxie1-84f9578895-gpf6x:
+mydafilesrv     ( pid      978 ) is running ...
+roxie1          ( pid     1177 ) is running ...
+
+Status of roxie-roxie2-6cf55ffd45-nf2mp:
+mydafilesrv     ( pid      979 ) is running ...
+roxie2          ( pid     1178 ) is running ...
+
+Status of support-db468c5c9-8kqzd:
+mydafilesrv     ( pid     1010 ) is running ...
+mydfuserver     ( pid     1204 ) is running ...
+myeclagent      ( pid     1413 ) is running ...
+myeclccserver   ( pid     1633 ) is running ...
+myeclscheduler  ( pid     1851 ) is running ...
+mysasha         ( pid     2059 ) is running ...
+
+Status of thor-thor1-59876665f5-67p4p:
+mydafilesrv     ( pid      972 ) is running ...
+
+Status of thor-thor1-59876665f5-nn7wc:
+mydafilesrv     ( pid      972 ) is running ...
+
+Status of thormaster-thor1:
+mydafilesrv     ( pid      978 ) is running ...
+thor1           ( pid     1243 ) is running with 2 slave process(es) ...
+
+```
+
+
+## Access ECLWatch
+Get the esp public hostname:
+```console
+kubectl get service
+NAME         TYPE           CLUSTER-IP      EXTERNAL-IP                                                               PORT(S)          AGE
+ew-esp1      LoadBalancer   10.100.248.99   a0e9629f2da3811e9b1b40aa0a5b6276-2050531242.us-east-1.elb.amazonaws.com   8010:30108/TCP   3m52s
+
+```
+ECLWatch URL: http://a0e9629f2da3811e9b1b40aa0a5b6276-2050531242.us-east-1.elb.amazonaws.com:8010
+
+## Scale up/down
+The original roxie-roxie1 cluster has 2 instances. To increase it to 6 instances:
+```console
+kubectl scale --replicas 6 Deployment/roxie-roxie1
+
+kubectl get pods
+NAME                               READY   STATUS    RESTARTS   AGE
+dali                               1/1     Running   0          6m10s
+efs-provisioner-57965c4946-7w4b5   1/1     Running   0          2d15h
+esp-esp1-f5bc48677-znlsv           1/1     Running   0          6m8s
+hpcc-admin                         1/1     Running   0          6m11s
+roxie-roxie1-84f9578895-7p4qz      1/1     Running   0          29s
+roxie-roxie1-84f9578895-8tpxp      1/1     Running   0          29s
+roxie-roxie1-84f9578895-fbpld      1/1     Running   0          6m6s
+roxie-roxie1-84f9578895-gpf6x      1/1     Running   0          6m6s
+roxie-roxie1-84f9578895-tjlp9      1/1     Running   0          29s
+roxie-roxie1-84f9578895-v62dj      1/1     Running   0          29s
+roxie-roxie2-6cf55ffd45-nf2mp      1/1     Running   0          6m5s
+support-db468c5c9-8kqzd            1/1     Running   0          6m9s
+thor-thor1-59876665f5-67p4p        1/1     Running   0          6m3s
+thor-thor1-59876665f5-nn7wc        1/1     Running   0          6m3s
+thormaster-thor1                   1/1     Running   0          6m4s
+
+```
+To scale it back:
+```console
+kubectl scale --replicas 2 Deployment/roxie-roxie1
+```
+
+
+## Stop/Start Cluster
+Stop:
+```console
+bin/cluster_run.sh stop
+```
+Start:
+```console
+bin/cluster_run.sh start
+```
+
+Get status:
+```console
+bin/cluster_run.sh status
+
+```
+
+## Delete Cluster
+```console
+./stop
+```
+This does not delete volumes. Either use the AWS CLI or go to the EC2 console to delete them.
+
+
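Note on the "Access ECLWatch" step in the EFS README above: the URL can also be derived without copying the hostname by hand. This is only a minimal sketch, assuming the LoadBalancer service is named `ew-esp1` and listens on port 8010 as in the sample output. The function name `eclwatch_url` and the `KUBECTL` override variable are hypothetical and not part of the repository.

```shell
#!/bin/sh
# Hypothetical helper, not part of the repository.
# KUBECTL can be overridden to point at a wrapper or a test stub.
KUBECTL="${KUBECTL:-kubectl}"

# Usage: eclwatch_url [service-name] [port]
eclwatch_url() {
    service="${1:-ew-esp1}"   # service name taken from the README output
    port="${2:-8010}"         # ECLWatch port taken from the README output

    # Read the external hostname assigned by the cloud load balancer.
    host="$("$KUBECTL" get service "$service" \
        -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')" || return 1

    # The hostname is empty until the load balancer has been provisioned.
    if [ -z "$host" ]; then
        echo "no external hostname yet for service $service" >&2
        return 1
    fi

    echo "http://$host:$port"
}
```

Calling `eclwatch_url` prints the ECLWatch URL once the load balancer has an external hostname. On providers that report an IP address instead of a hostname, the jsonpath field would need to be `ip` rather than `hostname`.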