
Running on Alicloud

Brian Wandell edited this page Nov 14, 2017 · 12 revisions

Once the Alicloud k8s cluster is initialized, several steps remain before we can run a job. Some can be done with Matlab commands that we will build, others require logging in to the master node and executing commands by hand, and still others require entering a password.

  1. Copy the local data to the cloud disk (use the password that was set when the k8s cluster was created)
scp -r /path/to/local/data root@<master-IP>:/tmp
  1. To make sure that the data persist even after we release the cluster, we use the pv (persistent volume) and pvc (persistent volume claim) YAML files. The files for our current configuration are stored in the YAML directory. The kubectl commands are listed here, but we will build a Matlab command like m2cPersistenStorage that executes them
kubectl create -f /path/to/pv.yaml
kubectl create -f /path/to/pvc.yaml
  1. Next we create a YAML file that controls the job. This file is pushed to the cloud using
kubectl create -f /path/to/job.yaml (the directory issue really gave me a hard time)

An example pbrt-job file is stored in the YAML directory (example-pbrt-job.yaml). We will make Matlab scripts, such as m2cYAMLjob(), that take a few input arguments and create a pbrt-job.yaml file.

apiVersion: batch/v1
kind: Job
metadata:
  name: job-pbrt #choose a Job name
spec:
  template:
    metadata:
      labels:
        name: pbrt
    spec:
      containers:
      - name: pbrt #choose a container name
        image: rendertoolbox/pbrt-v2-spectral     # Define your docker image
        workingDir: "/tmp/pbrt"                   # Where pbrt and related files are saved.
        command: ["pbrt"] 
        args: ["scene.pbrt"]                      # The PBRT scene file to render.
        volumeMounts: 
          - name: pv-storage
            mountPath: "/tmp"                     # Alicloud disk mounted directory
      volumes:
      - name: pv-storage
        persistentVolumeClaim:
          claimName: pv-claim
      restartPolicy: OnFailure
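
Until the Matlab script exists, the idea of generating the job file from a few input arguments can be sketched in Python. This is only an illustration, not the planned m2cYAMLjob(): the function name `make_pbrt_job_yaml` and its parameters are assumptions, and the defaults simply mirror the example file above.

```python
from string import Template

# Template that mirrors the example pbrt job file shown above.
JOB_TEMPLATE = Template("""\
apiVersion: batch/v1
kind: Job
metadata:
  name: $job_name
spec:
  template:
    metadata:
      labels:
        name: $container_name
    spec:
      containers:
      - name: $container_name
        image: $image
        workingDir: "$working_dir"
        command: ["pbrt"]
        args: ["$scene_file"]
        volumeMounts:
          - name: pv-storage
            mountPath: "$mount_path"
      volumes:
      - name: pv-storage
        persistentVolumeClaim:
          claimName: $claim_name
      restartPolicy: OnFailure
""")


def make_pbrt_job_yaml(scene_file, job_name="job-pbrt", container_name="pbrt",
                       image="rendertoolbox/pbrt-v2-spectral",
                       working_dir="/tmp/pbrt", mount_path="/tmp",
                       claim_name="pv-claim"):
    """Return the text of a pbrt job file (hypothetical helper).

    Only scene_file is required; every default is copied from the example.
    """
    return JOB_TEMPLATE.substitute(
        scene_file=scene_file, job_name=job_name,
        container_name=container_name, image=image,
        working_dir=working_dir, mount_path=mount_path,
        claim_name=claim_name)


if __name__ == "__main__":
    # Print the job file; redirect to pbrt-job.yaml for `kubectl create -f`.
    print(make_pbrt_job_yaml("scene.pbrt"), end="")
```

Saving the printed text as pbrt-job.yaml gives a file that can be passed to `kubectl create -f` as in the step above.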
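  1. Check the render process. List the pods first, then follow the log of the job's pod (replace pod-name with the name reported by the first command)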
kubectl get pod -a
kubectl logs -f pod-name

The output looks like this:

sr17-673b6f21e5:Downloads eugeneliu$ kubectl logs -f job-pbrt-nwt6w
pbrt version 2.0.0 of Sep  7 2017 at 22:05:36 [Detected 2 core(s)]
Copyright (c)1998-2010 Matt Pharr and Greg Humphreys.
The source code to pbrt (but *not* the book contents) is covered by the GNU GPL.
See the file COPYING.txt for the conditions of the license.
Warning: No metadata written out.
Rendering: [+++++++++++++++++++++++++++++++++++++++++++]  (14.6s)       
Error in ioctl() in TerminalWidth(): 25
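
If the log is captured in a script, the render time can be pulled out of the progress line. The sketch below assumes the final progress line has the form shown in the output above, ending in `(14.6s)`; the function name `render_seconds` is hypothetical.

```python
import re

# Matches the final progress line, e.g. "Rendering: [+++++]  (14.6s)".
_RENDER_TIME = re.compile(
    r"Rendering:\s*\[[^\]]*\]\s*\((\d+(?:\.\d+)?)s\)")


def render_seconds(log_text):
    """Return the last reported render time in seconds, or None if absent."""
    matches = _RENDER_TIME.findall(log_text)
    return float(matches[-1]) if matches else None
```

A `None` result means no completed progress line was found, which may indicate that the render is still running or that it failed.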
  1. Download the files from the Alicloud disk. After the jobs complete, for some reason Alicloud detaches the storage from our nodes. That's ridiculous, and it forces us to reattach the disk, so we have to redo step 6 of the Initialization instructions. We are complaining. As a reminder, that step is

Go to Alicloud ECS US-West-1 and manually mount the cloud disk to the master ECS instance. Then, on the master node of the k8s cluster, execute the mount command

 mount /dev/vdb /tmp

This enables you to copy files from the cloud to your local machine (password needed), using

scp -r root@<master-IP>:/tmp/remoteFile /local/directory
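
The scp and kubectl commands on this page can be assembled by a small script, which is roughly what the planned Matlab commands would do. The Python sketch below only builds the argument lists. All function names are hypothetical, and the IP address and file paths in the example are placeholders. Reattaching the disk and running mount remain manual steps.

```python
import shlex


def upload_cmd(local_dir, master_ip, remote_dir="/tmp"):
    """scp command that copies local data to the cloud disk."""
    return ["scp", "-r", local_dir, f"root@{master_ip}:{remote_dir}"]


def create_cmd(yaml_path):
    """kubectl command that creates the resource described by a YAML file."""
    return ["kubectl", "create", "-f", yaml_path]


def download_cmd(master_ip, remote_path, local_dir):
    """scp command that copies results from the cloud disk to a local directory."""
    return ["scp", "-r", f"root@{master_ip}:{remote_path}", local_dir]


if __name__ == "__main__":
    master_ip = "203.0.113.10"  # placeholder address, replace with <master-IP>
    steps = [
        upload_cmd("/path/to/local/data", master_ip),
        create_cmd("/path/to/pv.yaml"),
        create_cmd("/path/to/pvc.yaml"),
        create_cmd("/path/to/job.yaml"),
        download_cmd(master_ip, "/tmp/remoteFile", "/local/directory"),
    ]
    # Print the commands for review instead of running them.
    for cmd in steps:
        print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run` once the commands look right. The scp steps will still prompt for the password.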