PERMON project homepage: http://permon.vsb.cz
PermonSVM homepage: http://permon.vsb.cz/permonsvm.htm
Please use GitHub for issues and pull requests.
- Scalable (parallel) solution for the linear C-SVM
- Supported binary classifications:
  - standard classification (linear and bound constraints)
  - relaxed-bias classification (bound constraints)
- Misclassification error quantification:
  - l1 hinge-loss function
  - l2 hinge-loss function
- Standard classification solvers:
  - SMALXE + MPRGP (active-set method for bound constrained problems)
  - SMALXE + the Toolkit for Advanced Optimization (TAO) solvers for minimization with bound constraints
- Relaxed-bias classification solvers:
  - MPRGP
  - TAO solvers for minimization with bound constraints
- Warm start
- Grid search
- Cross validation types:
  - k-fold
  - stratified k-fold
- Model performance scores:
  - accuracy
  - sensitivity
  - specificity
  - F1
  - Matthews correlation coefficient
  - Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC)
  - Gini coefficient
- Parallel data loaders:
  - PETSc binary
  - HDF5 (AIJ and dense matrices)
  - SVMLight
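
For orientation, the two hinge losses and the listed scores can be written out as follows. These are the usual textbook definitions, given as a sketch; the symbols (f, TP, TN, FP, FN) are introduced here for illustration, and the exact scaling PermonSVM applies internally is not stated in this README.

```latex
% Hinge losses for a sample (x_i, y_i) with y_i \in \{-1, +1\}
% and decision value f(x_i) = w^T x_i + b.
\begin{align*}
\ell_1(x_i, y_i) &= \max\bigl(0,\ 1 - y_i f(x_i)\bigr), \\
\ell_2(x_i, y_i) &= \max\bigl(0,\ 1 - y_i f(x_i)\bigr)^2 .
\end{align*}

% Scores from the confusion-matrix counts TP, TN, FP, FN.
\begin{align*}
\text{accuracy}    &= \frac{TP + TN}{TP + TN + FP + FN}, \\
\text{sensitivity} &= \frac{TP}{TP + FN}, \qquad
\text{specificity}  = \frac{TN}{TN + FP}, \\
F_1                &= \frac{2\,TP}{2\,TP + FP + FN}, \\
\text{MCC}         &= \frac{TP \cdot TN - FP \cdot FN}
                           {\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}}, \\
\text{Gini}        &= 2\,\text{AUC} - 1 .
\end{align*}
```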
- install PermonQP (follow instructions in its own README.md)
- set the PERMON_SVM_DIR variable pointing to the PermonSVM directory (probably this file's parent directory)
- build PermonSVM simply using the makefile (makes use of the PETSc build system):
make
- if the build is successful, there is a new subdirectory named $PETSC_ARCH with the program library $PETSC_ARCH/lib/libpermonsvm.{dylib,so,a} and the executable $PETSC_ARCH/bin/permonsvmfile
- the shared library (.so) is built only if PETSc has been configured with the option --with-shared-libraries
- all compiler settings are inherited from PETSc and PermonQP
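
Before running make, it can help to confirm that the environment variables the build relies on are set. The helper below is a minimal sketch and not part of PermonSVM: the function name check_permon_env is hypothetical, and PETSC_DIR is assumed from the usual PETSc conventions rather than taken from this README.

```shell
# Hypothetical helper (not shipped with PermonSVM): list build-related
# environment variables that are unset or empty.
# Returns 0 when all are set, 1 otherwise.
check_permon_env() {
  missing=0
  for name in PETSC_DIR PETSC_ARCH PERMON_SVM_DIR; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage (placeholder path, adjust to your checkout):
#   export PERMON_SVM_DIR=/path/to/permonsvm
#   check_permon_env && make
```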
- Tutorials illustrating basic functionality of the package are located in src/tutorials.
- We also provide the bash script runsvmmpi in the root directory of PermonSVM to easily run the minimal working example src/bin/permonsvmfile.c.
- Several training and test datasets are located in DATA_DIR=src/tutorials/data.
- Please set the DATA_DIR variable before running the following examples.
- running PermonSVM on 2 MPI processes with default settings (relaxed-bias classification, l1 hinge loss, C = 1, B = 1)
./runsvmmpi 2 -f_training $DATA_DIR/heart_scale.bin -f_test $DATA_DIR/heart_scale.t.bin
- running PermonSVM on 2 MPI processes with penalty parameter C = 100
./runsvmmpi 2 -f_training $DATA_DIR/heart_scale.bin -f_test $DATA_DIR/heart_scale.t.bin \
-svm_C 100
- running PermonSVM on 2 MPI processes with C = 0.01 and l2 hinge loss
./runsvmmpi 2 -f_training $DATA_DIR/heart_scale.bin -f_test $DATA_DIR/heart_scale.t.bin \
-svm_loss_type L2 -svm_C 1e-2
- running PermonSVM on 2 MPI processes solving the standard classification problem (-svm_binary_mod 1), with misclassification error quantified by l2 hinge loss and C = 0.01
./runsvmmpi 2 -f_training $DATA_DIR/heart_scale.bin -f_test $DATA_DIR/heart_scale.t.bin \
-svm_loss_type L2 -svm_C 1e-2 -svm_binary_mod 1
- running PermonSVM on 2 MPI processes with hyperparameter optimization with default settings (l1 hinge loss function, relaxed-bias classification, grid search log2C = [-2:1:2], k-fold cross validation on 5 folds)
./runsvmmpi 2 -f_training $DATA_DIR/heart_scale.bin -f_test $DATA_DIR/heart_scale.t.bin \
-svm_hyperopt 1
- running PermonSVM on 2 MPI processes with grid search on C = {0.1, 1, 10, 100} combined with cross validation on 3 folds that reuses a previous solution (warm start)
./runsvmmpi 2 -f_training $DATA_DIR/heart_scale.bin -f_test $DATA_DIR/heart_scale.t.bin \
-svm_hyperopt 1 -svm_gs_logC_base 10 -svm_gs_logC_stride 1,2,1 -svm_nfolds 3 -cross_svm_warm_start 1
- running PermonSVM on 2 MPI processes with grid search on C = {0.1, 1, 10, 100} and stratified k-fold cross validation on 3 folds with warm start
./runsvmmpi 2 -f_training $DATA_DIR/heart_scale.bin -f_test $DATA_DIR/heart_scale.t.bin \
-svm_hyperopt 1 -svm_gs_logC_base 10 -svm_gs_logC_stride 1,2,1 -svm_nfolds 3 -cross_svm_warm_start 1 \
-svm_cv_type stratified_kfold
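
The grid-search examples describe candidate values of C as powers of a base over a range of exponents. The sketch below only enumerates such a grid so the sets quoted above can be reproduced by hand. The helper grid_values and its argument order (base, first exponent, last exponent, step) are hypothetical and do not claim to mirror how PermonSVM parses its own options.

```shell
# Hypothetical helper: print base^e for e = start, start+step, ..., stop
# on a single space-separated line.
grid_values() {
  awk -v base="$1" -v start="$2" -v stop="$3" -v step="$4" 'BEGIN {
    sep = ""
    for (e = start; e <= stop; e += step) {
      printf "%s%g", sep, base ^ e
      sep = " "
    }
    print ""
  }'
}
```

For example, `grid_values 10 -1 2 1` prints `0.1 1 10 100`, and `grid_values 2 -2 2 1` prints `0.25 0.5 1 2 4`, which corresponds to the default log2C = [-2:1:2].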
PermonSVM uses an implicit representation of the Gramian matrix by default. Sometimes it is reasonable to compute the inner products that form the Gramian explicitly, typically when the number of features is disproportionately larger than the number of samples. For such cases, PermonSVM can load a precomputed Gramian matrix:
./runsvmmpi 2 -f_training $DATA_DIR/heart_scale.bin -f_test $DATA_DIR/heart_scale.t.bin \
-f_kernel $DATA_DIR/heart_scale.kernel.bin
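
To make the trade-off concrete, here is the textbook form of the Gramian and of the dual problem it enters, as a sketch. The notation (X, Y, K, alpha, e, n, m) is introduced here for illustration; the formulation actually assembled by PermonSVM, including the relaxed-bias variant, may differ in details.

```latex
% X \in R^{n \times m}: one training sample per row (n samples, m features),
% Y = diag(y_1, ..., y_n) with labels y_i \in \{-1, +1\}.
\begin{align*}
K &= X X^T, \qquad K_{ij} = x_i^T x_j, \\
\min_{\alpha}\ & \tfrac{1}{2}\, \alpha^T Y K Y \alpha - e^T \alpha
\quad \text{s.t.} \quad 0 \le \alpha \le C e
\quad \text{(l1 hinge loss, bound constraints only)}, \\
& \text{plus the linear constraint } y^T \alpha = 0
\text{ in the standard classification.}
\end{align*}

% Implicit representation: K v is applied as X (X^T v), two products with X,
% roughly O(nm) work each for dense data.
% Explicit representation: K is stored as an n x n matrix, roughly O(n^2) work
% per product, which is cheaper when m is much larger than n.
```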
The training dataset src/tutorials/data/heart_scale and the testing dataset src/tutorials/data/heart_scale.t have been obtained by splitting the heart_scale dataset from the LIBSVM dataset page.
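
Datasets from the LIBSVM page are commonly distributed as text with one sample per line in the form `<label> <index>:<value> ...`. Assuming the text files above follow that layout, the sketch below counts samples per class, which is a quick way to check class balance before choosing between k-fold and stratified k-fold cross validation. The helper count_labels is hypothetical and not part of PermonSVM.

```shell
# Hypothetical helper: count samples per label in an SVMLight/LIBSVM text file.
# Blank lines and lines starting with '#' are skipped; "+1" and "1" are
# treated as the same label. Output: one "<label> <count>" line per class.
count_labels() {
  awk 'NF > 0 && $1 !~ /^#/ { n[$1 + 0]++ }
       END { for (l in n) print l, n[l] }' "$1" | sort -n
}

# Possible usage:
#   count_labels "$DATA_DIR/heart_scale"
```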
PERMON tries to support the newest versions of PETSc as soon as possible. The releases are tagged with major.minor.sub-minor numbers. The major.minor numbers correspond to the major.minor release numbers of the supported PERMON/PETSc version.
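
The tagging rule above amounts to comparing the major.minor prefix of two version strings. A minimal sketch of that comparison follows; same_major_minor is a hypothetical helper, and the version numbers in the usage lines are placeholders rather than actual release tags.

```shell
# Hypothetical helper: succeed (exit status 0) when two version strings
# share the same major.minor prefix; the sub-minor part is ignored.
same_major_minor() {
  a=$(printf '%s' "$1" | cut -d. -f1,2)
  b=$(printf '%s' "$2" | cut -d. -f1,2)
  [ "$a" = "$b" ]
}

# Possible usage (placeholder versions):
#   same_major_minor 3.17.0 3.17.4 && echo "compatible"
#   same_major_minor 3.17.0 3.18.0 || echo "mismatch"
```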