SVM

SVM is used for classification and regression analysis.

1. Introduction

SVM solves the following optimization problem:

where is the regularization term; is the regularization coefficient; is the hinge loss as visualized below:

2. Distributed Implementation on Angel

Angel MLLib uses mini-batch gradient descent optimization method for solving SVM's objective; the algorithm is shown below:

3. Execution and Performance

Input Format

Data fromat is set in "ml.data.type", supporting "libsvm" and "dummy" types. For details, see Angel Data Format
Feature vector's dimension is set in "ml.feature.num"

Submitting script

Several steps must be done before editing the submitting script and running.

confirm Hadoop and Spark have ready in your environment
unzip sona--bin.zip to local directory (SONA_HOME)
upload sona--bin directory to HDFS (SONA_HDFS_HOME)
Edit $SONA_HOME/bin/spark-on-angel-env.sh, set SPARK_HOME, SONA_HOME, SONA_HDFS_HOME and ANGEL_VERSION

Here's an example of submitting scripts, remember to adjust the parameters and fill in the paths according to your own task.

#test description
actionType=train or predict
jsonFile=path-to-jsons/svm.json
modelPath=path-to-save-model
predictPath=path-to-save-predict-results
input=path-to-data
queue=your-queue

HADOOP_HOME=my-hadoop-home
source ./bin/spark-on-angel-env.sh
export HADOOP_HOME=$HADOOP_HOME

$SPARK_HOME/bin/spark-submit \
  --master yarn-cluster \
  --conf spark.ps.jars=$SONA_ANGEL_JARS \
  --conf spark.ps.instances=10 \
  --conf spark.ps.cores=2 \
  --conf spark.ps.memory=10g \
  --jars $SONA_SPARK_JARS \
  --files $jsonFile \
  --driver-memory 20g \
  --num-executors 20 \
  --executor-cores 5 \
  --executor-memory 30g \
  --queue $queue \
  --class org.apache.spark.angel.examples.JsonRunnerExamples \
  ./lib/angelml-$SONA_VERSION.jar \
  jsonFile:./svm.json \
  dataFormat:libsvm \
  data:$input \
  modelPath:$modelPath \
  predictPath:$predictPath \
  actionType:$actionType \
  numBatch:500 \
  maxIter:2 \
  lr:4.0 \
  numField:39

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

svm_sona_en.md

svm_sona_en.md

SVM

1. Introduction

2. Distributed Implementation on Angel

3. Execution and Performance

Input Format

Submitting script

Files

svm_sona_en.md

Latest commit

History

svm_sona_en.md

File metadata and controls

SVM

1. Introduction

2. Distributed Implementation on Angel

3. Execution and Performance

Input Format

Submitting script