Matteo Santoro edited this page Jun 24, 2013 · 20 revisions

User Manual

A comprehensive user manual is here. This page contains a few examples for basic learning pipelines and is a great place to start.

Design

GURLS (GURLS++) consists of a set of tasks, each belonging to a predefined category, and of a method (a class in the C++ implementation) called GURLS Core, which processes an ordered sequence of tasks called a pipeline. An additional "options structure", often referred to as OPT, stores all the configuration parameters needed to customize the tasks' behaviour. Tasks receive configuration parameters from the options structure in read-only mode; once a task terminates, the GURLS Core appends its results to the structure so that they are available to subsequent tasks. This lets the user skip the execution of some tasks in a pipeline by inserting the desired results directly into the options structure. All tasks belonging to the same category are interchangeable, so the user can easily choose how each step is carried out.
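
To make the design concrete, here is a minimal C++ sketch of a core that runs tasks in order, hands each task a read-only view of the options, and appends the result itself. It is only an illustration: Options, Task, runPipeline and toyPipeline are hypothetical names, not GURLS or GURLS++ classes, and results are reduced to single numbers.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the options structure: named numeric results.
using Options = std::map<std::string, double>;

// A task belongs to a category and reads the options in read-only mode.
struct Task {
    std::string category;                       // e.g. "paramsel"
    std::function<double(const Options&)> run;  // task body, returns its result
};

// Toy core: runs the tasks in order and appends each result under the
// task's category. A task is skipped when the user has already inserted
// a result for its category into the options.
Options runPipeline(const std::vector<Task>& pipeline, Options opt) {
    for (const Task& t : pipeline) {
        assert(static_cast<bool>(t.run));      // every task needs a body
        if (opt.count(t.category)) continue;   // result supplied by the user
        opt[t.category] = t.run(opt);          // only the core writes to opt
    }
    return opt;
}

// Two toy tasks: the first picks a parameter, the second consumes it.
std::vector<Task> toyPipeline() {
    return {
        {"paramsel", [](const Options&) { return 0.5; }},
        {"rls", [](const Options& o) { return 2.0 * o.at("paramsel"); }},
    };
}
```

Calling runPipeline(toyPipeline(), {}) executes both tasks, whereas passing an Options that already contains "paramsel" skips the first task and feeds the supplied value to the second.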

[Figure: GURLS design]

GURLS Usage

The gurls command accepts exactly four arguments:

  • The NxD input data matrix (N is the number of samples, D is the number of variables).
  • The NxT output labels matrix (T is the number of outputs; for (multi-class) classification, the labels +1 and -1 must be in One-Vs-All format).
  • An options' structure.
  • A job-id number.
Each time the data need to be changed (e.g. going from training phase to testing phase) gurls needs to be called again.
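
As an illustration of the One-Vs-All label format, the following C++ sketch builds the NxT matrix of +1 and -1 entries from a list of class indices. The helper oneVsAll is hypothetical, and the use of zero-based class indices is an assumption made for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build an N x T label matrix in One-Vs-All format from class indices
// in [0, T). Row i holds +1 in column classes[i] and -1 elsewhere.
std::vector<std::vector<double>> oneVsAll(const std::vector<int>& classes, int T) {
    std::vector<std::vector<double>> Y(classes.size(),
                                       std::vector<double>(T, -1.0));
    for (std::size_t i = 0; i < classes.size(); ++i) {
        assert(classes[i] >= 0 && classes[i] < T);  // index must be valid
        Y[i][classes[i]] = 1.0;
    }
    return Y;
}
```

For two samples of classes 2 and 0 with T = 3, the rows are (-1, -1, +1) and (+1, -1, -1).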

The three main fields in the options' structure are:

  • opt.name: defines a name for a given experiment.
  • opt.seq: specifies the sequence of tasks to be executed.
  • opt.process: specifies what to do with each task. In particular here are the codes:
    • 0 = Ignore
    • 1 = Compute
    • 2 = Compute and save
    • 3 = Load from file
    • 4 = Explicitly delete
The gurls command executes an ordered sequence of tasks, the 'pipeline', specified in the field seq of the options' structure as
  {'<CATEGORY1>:<TASK1>';'<CATEGORY2>:<TASK2>';...}
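
As a rough illustration of how a seq entry and a process vector fit together, the C++ sketch below splits "<CATEGORY>:<TASK>" strings and lists the tasks that a given job would actually compute (codes 1 and 2). Action, splitTask and computedTasks are hypothetical names used only for this example; they are not part of GURLS.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Process codes as listed above.
enum Action { Ignore = 0, Compute = 1, ComputeAndSave = 2, Load = 3, Delete = 4 };

// Split "<CATEGORY>:<TASK>" into (category, task).
std::pair<std::string, std::string> splitTask(const std::string& entry) {
    const std::string::size_type pos = entry.find(':');
    assert(pos != std::string::npos);  // entry must contain a colon
    return {entry.substr(0, pos), entry.substr(pos + 1)};
}

// Return the entries of seq whose process code asks for a computation.
std::vector<std::string> computedTasks(const std::vector<std::string>& seq,
                                       const std::vector<int>& process) {
    assert(seq.size() == process.size());  // one code per task
    std::vector<std::string> out;
    for (std::size_t i = 0; i < seq.size(); ++i)
        if (process[i] == Compute || process[i] == ComputeAndSave)
            out.push_back(seq[i]);
    return out;
}
```

With seq = {"paramsel:loocvprimal", "rls:primal", "pred:primal"}, the codes {2, 2, 0} select the first two tasks, while {3, 3, 2} select only the prediction step.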

These tasks can be combined in order to build different train-test pipelines. The most popular learning pipelines are outlined in the following.

Examples in GURLS

Linear classifier, primal case, leave one out cv

We want to run the training on a dataset {Xtr,ytr} and the test on a different dataset {Xte,yte}. We are interested in the precision-recall performance measure as well as the average classification accuracy. In order to train a linear classifier using a leave one out cross-validation approach, we just need the following lines of code:

 name = 'ExampleExperiment';
 opt = defopt(name);
 opt.seq = {'paramsel:loocvprimal','rls:primal','pred:primal','perf:precrec','perf:macroavg'};
 opt.process{1} = [2,2,0,0,0];
 opt.process{2} = [3,3,2,2,2];
 gurls (Xtr, ytr, opt,1)
 gurls (Xte, yte, opt,2)

The meaning of the above code fragment is the following:

  • For the training data: calculate the regularization parameter lambda that maximizes classification accuracy via Leave-One-Out cross-validation and save the result; solve RLS for a linear classifier in the primal space and save the solution. Ignore the rest.
  • For the test data set: load the selected lambda (this is important if you want to keep this value for further reference) and load the classifier. Predict the output on the test set and save it. Evaluate the two aforementioned performance measures and save them.
Note that the field opt.name is implicitly specified by the defopt function which assigns to it its only input argument. Fields opt.seq and opt.process have to be explicitly assigned.
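
For readers who want to see what solving RLS in the primal space amounts to, here is a self-contained C++ sketch for a single output. It assumes the common formulation w = (X'X + n·lambda·I)^(-1) X'y; the scaling of lambda by the number of samples n and the helper names solve and rlsPrimal are assumptions of this example, not a description of the rls:primal implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // row-major

// Solve A w = b by Gaussian elimination with partial pivoting (A is d x d).
std::vector<double> solve(Matrix A, std::vector<double> b) {
    const std::size_t d = b.size();
    for (std::size_t k = 0; k < d; ++k) {
        std::size_t p = k;  // row with the largest pivot candidate
        for (std::size_t i = k + 1; i < d; ++i)
            if (std::fabs(A[i][k]) > std::fabs(A[p][k])) p = i;
        assert(A[p][k] != 0.0);  // the system must not be singular
        std::swap(A[k], A[p]);
        std::swap(b[k], b[p]);
        for (std::size_t i = k + 1; i < d; ++i) {
            const double f = A[i][k] / A[k][k];
            for (std::size_t j = k; j < d; ++j) A[i][j] -= f * A[k][j];
            b[i] -= f * b[k];
        }
    }
    std::vector<double> w(d);
    for (std::size_t k = d; k-- > 0;) {  // back substitution
        double s = b[k];
        for (std::size_t j = k + 1; j < d; ++j) s -= A[k][j] * w[j];
        w[k] = s / A[k][k];
    }
    return w;
}

// Primal RLS for one output: w = (X'X + n*lambda*I)^(-1) X'y.
std::vector<double> rlsPrimal(const Matrix& X, const std::vector<double>& y,
                              double lambda) {
    assert(!X.empty() && y.size() == X.size());
    const std::size_t n = X.size(), d = X[0].size();
    Matrix A(d, std::vector<double>(d, 0.0));  // accumulates X'X
    std::vector<double> b(d, 0.0);             // accumulates X'y
    for (std::size_t r = 0; r < n; ++r)
        for (std::size_t i = 0; i < d; ++i) {
            b[i] += X[r][i] * y[r];
            for (std::size_t j = 0; j < d; ++j) A[i][j] += X[r][i] * X[r][j];
        }
    for (std::size_t i = 0; i < d; ++i) A[i][i] += n * lambda;
    return solve(A, b);
}
```

Predicting on new data then reduces to the product of the test matrix with w, which is why the test job only needs to load the saved solution.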

Normalized data, linear classifier, primal case, hold-out cv

 name = 'ExampleExperiment';
 opt = defopt(name);
 [Xtr] = norm_zscore(Xtr, ytr, opt); 
 [Xte] = norm_testzscore(Xte, yte, opt); 
 opt.seq = {'split:ho','paramsel:hoprimal','rls:primal','pred:primal','perf:macroavg'};
 opt.process{1} = [2,2,2,0,0];
 opt.process{2} = [3,3,3,2,2];
 gurls (Xtr, ytr, opt,1)
 gurls (Xte, yte, opt,2)

Here the training set is first normalized and the column-wise means and standard deviations are saved to file. Then the test data are normalized according to the statistics computed on the training set.
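
The key point, that the test set is rescaled with the training statistics rather than its own, can be sketched in C++ as follows. ZStats, fitZscore and applyZscore are hypothetical helpers; the use of the sample standard deviation (division by N-1) is an assumption, and constant columns are not handled.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // row-major, N x D

struct ZStats {
    std::vector<double> mean;   // column-wise means
    std::vector<double> stdev;  // column-wise standard deviations
};

// Compute column-wise statistics on the training set.
ZStats fitZscore(const Matrix& X) {
    assert(X.size() > 1);  // at least two samples are needed
    const std::size_t n = X.size(), d = X[0].size();
    ZStats s{std::vector<double>(d, 0.0), std::vector<double>(d, 0.0)};
    for (const auto& row : X)
        for (std::size_t j = 0; j < d; ++j) s.mean[j] += row[j];
    for (std::size_t j = 0; j < d; ++j) s.mean[j] /= n;
    for (const auto& row : X)
        for (std::size_t j = 0; j < d; ++j) {
            const double c = row[j] - s.mean[j];
            s.stdev[j] += c * c;
        }
    for (std::size_t j = 0; j < d; ++j) s.stdev[j] = std::sqrt(s.stdev[j] / (n - 1));
    return s;
}

// Normalize any data set (training or test) with previously computed statistics.
// Assumes no column has zero standard deviation.
Matrix applyZscore(Matrix X, const ZStats& s) {
    for (auto& row : X)
        for (std::size_t j = 0; j < row.size(); ++j)
            row[j] = (row[j] - s.mean[j]) / s.stdev[j];
    return X;
}
```

In use, fitZscore is called once on the training matrix, and applyZscore is then called on both the training and the test matrices with the same ZStats.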

Linear classifier, dual case, leave one out cv

 name = 'ExampleExperiment';
 opt = defopt(name);
 opt.seq = {'kernel:linear', 'paramsel:loocvdual', 'rls:dual', 'pred:dual', 'perf:macroavg'};
 opt.process{1} = [2,2,2,0,0];
 opt.process{2} = [3,3,3,2,2];
 gurls (Xtr, ytr, opt,1)
 gurls (Xte, yte, opt,2)

Linear regression, primal case, hold-out cv

 name = 'ExampleExperiment';
 opt = defopt(name);
 opt.seq = {'paramsel:hoprimal','rls:primal','pred:primal','perf:rmse'};
 opt.process{1} = [2,2,0,0];
 opt.process{2} = [3,3,2,2];
 opt.hoperf = @perf_rmse;
 gurls(Xtr, ytr, opt,1)
 gurls(Xte, yte, opt,2)

Here GURLS is used for regression. Note that the objective function is explicitly set to @perf_rmse, i.e. root mean square error, whereas in the first example opt.hoperf is set to its default @perf_macroavg which evaluates the average classification accuracy per class. The same code can be used for multiple output regression.
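
The difference between the two objective functions can be sketched in C++ as follows. The helpers rmse and macroAvgAccuracy are hypothetical and only mirror the definitions given in the text (root mean square error, and classification accuracy averaged per class); they are not the code behind perf_rmse or perf_macroavg.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Root mean square error between true and predicted outputs.
double rmse(const std::vector<double>& y, const std::vector<double>& pred) {
    assert(y.size() == pred.size() && !y.empty());
    double sse = 0.0;
    for (std::size_t i = 0; i < y.size(); ++i) {
        const double e = y[i] - pred[i];
        sse += e * e;
    }
    return std::sqrt(sse / y.size());
}

// Accuracy computed separately for each class in [0, T) and then averaged,
// so that small classes weigh as much as large ones.
double macroAvgAccuracy(const std::vector<int>& truth,
                        const std::vector<int>& pred, int T) {
    assert(truth.size() == pred.size() && T > 0);
    std::vector<double> total(T, 0.0), correct(T, 0.0);
    for (std::size_t i = 0; i < truth.size(); ++i) {
        assert(truth[i] >= 0 && truth[i] < T);
        total[truth[i]] += 1.0;
        if (pred[i] == truth[i]) correct[truth[i]] += 1.0;
    }
    double sum = 0.0;
    int seen = 0;  // classes that actually occur in truth
    for (int c = 0; c < T; ++c)
        if (total[c] > 0.0) {
            sum += correct[c] / total[c];
            ++seen;
        }
    return seen > 0 ? sum / seen : 0.0;
}
```

With true classes {0, 0, 0, 1} and predictions {0, 0, 1, 1}, the per-class accuracies are 2/3 and 1, so the macro average is 5/6, whereas the plain accuracy would be 3/4.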

Gaussian Kernel classifier, leave one out cv

 name = 'ExampleExperiment';
 opt = defopt(name);
 opt.seq = {'paramsel:siglam', 'kernel:rbf', 'rls:dual', 'predkernel:traintest', 'pred:dual', 'perf:macroavg'};
 opt.process{1} = [2,2,2,0,0,0];
 opt.process{2} = [3,3,3,2,2,2];
 gurls (Xtr, ytr, opt,1)
 gurls (Xte, yte, opt,2)

Here parameter selection for the Gaussian kernel requires selecting both the regularization parameter λ and the kernel parameter σ, and is performed by selecting the task siglam for the category paramsel. Once the value of the kernel parameter σ has been chosen, the Gaussian kernel is built through the kernel task with option rbf.
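
For reference, building a Gaussian kernel matrix for a fixed σ can be sketched in C++ as below. The sketch assumes the common parameterization K(x, x') = exp(-||x - x'||² / (2σ²)); the exact scaling used by the kernel:rbf task is not stated on this page and may differ. The function name gaussianKernel is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // row-major

// N x N Gaussian kernel matrix for the rows of X, with bandwidth sigma.
Matrix gaussianKernel(const Matrix& X, double sigma) {
    assert(sigma > 0.0);
    const std::size_t n = X.size();
    Matrix K(n, std::vector<double>(n, 1.0));  // diagonal entries are exp(0) = 1
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = i + 1; j < n; ++j) {
            double d2 = 0.0;  // squared Euclidean distance
            for (std::size_t k = 0; k < X[i].size(); ++k) {
                const double diff = X[i][k] - X[j][k];
                d2 += diff * diff;
            }
            K[i][j] = K[j][i] = std::exp(-d2 / (2.0 * sigma * sigma));
        }
    return K;
}
```

Because the kernel matrix depends on σ, it can only be built after the parameter selection step has fixed σ, which matches the order of the tasks in the pipeline above.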

Gaussian kernel classifier, hold-out cv

 name = 'ExampleExperiment';
 opt = defopt(name);
 opt.seq = {'split:ho','paramsel:siglamho', 'kernel:rbf', 'rls:dual', 'predkernel:traintest', 'pred:dual', 'perf:macroavg'};
 opt.process{1} = [2,2,2,2,0,0,0];
 opt.process{2} = [3,3,3,3,2,2,2];
 gurls (Xtr, ytr, opt,1)
 gurls (Xte, yte, opt,2)

Linear classifier via stochastic gradient descent

 name = 'ExampleExperiment';
 opt = defopt(name);
 opt.seq = {'paramsel:calibratesgd','rls:pegasos','pred:primal','perf:macroavg'};
 opt.process{1} = [2,2,0,0];
 opt.process{2} = [3,3,2,2];
 gurls(Xtr, ytr, opt,1)
 gurls(Xte, yte, opt,2)

Here the optimization is carried out using a stochastic gradient descent algorithm, namely Pegasos (Shalev-Shwartz, Singer and Srebro, 2007).

Random features RLS classifier, hold-out cv

 name = 'ExampleExperiment';
 opt = defopt(name);
 opt.seq = {'split:ho','paramsel:horandfeats','kernel:randfeats','rls:randfeats','pred:randfeats','perf:macroavg'};
 opt.process{1} = [2,2,2,2,0,0];
 opt.process{2} = [3,3,3,3,2,2];
 gurls(Xtr, ytr, opt,1)
 gurls(Xte, yte, opt,2)

This pipeline computes a classifier for the primal formulation of RLS using the Random Features approach proposed by Rahimi and Recht (2007). In this approach the primal formulation is used in a new space built through random projections of the input data.

GURLS++ Usage

GURLS++ Main Classes

The GURLS class

The GURLS class implements the GURLS Core. Its only method, run, runs the learning pipeline and is the main method the user calls directly. It accepts exactly four arguments:

  • The NxD input data matrix (N is the number of samples, D is the number of variables).
  • The NxT labels matrix (T is the number of outputs; for (multi-class) classification, the labels +1 and -1 must be in One-Vs-All format).
  • An options' structure.
  • A job-id number.
Each time the data need to be changed (e.g. going from training process to testing process) GURLS.run needs to be invoked again.

The GurlsOptionsList class

The options’ structure is built through the GurlsOptionsList class with default fields and values. The three main fields in the options’ structure are:

  • name: identifies the file where results shall be saved.
  • seq: specifies the (ordered) sequence of tasks, i.e. the pipeline, to be executed. Each task is defined by providing a task category and a choice amongst those available for that category, e.g. with "optimizer:rlsprimal" one sets the optimizer to be Regularized Least Squares in the primal space (see Section GURLS++ Available Methods for the available categories and the choices for each category).
  • process: specifies what to do with each task. Possible instructions are:
    • ignore
    • compute
    • computeNsave
    • load
    • delete

Examples in GURLS++

In the ’demo’ directory you will find GURLSloocvprimal.cpp. The meaning of the demo is the following:

  • For the training data: calculate the regularization parameter that maximizes classification accuracy via Leave-One-Out cross-validation and save the result; solve RLS for a linear classifier in the primal space and save the solution. Ignore the rest.
  • For the test data set: load the selected regularization parameter (this is important if you want to keep this value for further reference) and load the classifier. Predict the output on the test set and save it. Evaluate the average classification accuracy as well as the precision-recall and save them.
In the following we report and comment on the salient parts of the demo.
First load the data from file. The training data is assumed to be stored in two .csv files, xtr_file and ytr_file, and the test data in two other .csv files, xte_file and yte_file:
 gMat2D<T> Xtr, Xte, ytr, yte;
 Xtr.readCSV(xtr_file);
 Xte.readCSV(xte_file);
 ytr.readCSV(ytr_file);
 yte.readCSV(yte_file);

then initialize an object of class GURLS and build an options’ list, assigning it a name, in this case "Gurlslooprimal"

 GURLS G;
 GurlsOptionsList* opt = new GurlsOptionsList("Gurlslooprimal", true);

specify the task sequence

 OptTaskSequence *seq = new OptTaskSequence();
 *seq << "paramsel:loocvprimal" << "optimizer:rlsprimal" << "pred:primal" << "perf:macroavg" << "perf:precrec";
 opt->addOpt("seq", seq);

initialize the process option

 GurlsOptionsList * process = new GurlsOptionsList("processes", false);

and define instructions for the training process

 OptProcess* process1 = new OptProcess();
 *process1 << GURLS::computeNsave << GURLS::computeNsave << GURLS::ignore << GURLS::ignore << GURLS::ignore;
 process->addOpt("one", process1);
 opt->addOpt("processes", process);

and testing process

 OptProcess* process2 = new OptProcess();
 *process2 << GURLS::load << GURLS::load << GURLS::computeNsave << GURLS::computeNsave << GURLS::computeNsave;
 process->addOpt("two", process2);

run gurls for training

 string jobId0("one");
 G.run(Xtr, ytr, *opt, jobId0);

run gurls for testing

 string jobId1("two");
 G.run(Xte, yte, *opt, jobId1);

Further Examples

The method run of class GURLS executes an ordered sequence of tasks, the pipeline, specified in the field seq of the options’ structure as

 {"<CATEGORY1>:<TASK1>";"<CATEGORY2>:<TASK2>";...}

These tasks can be combined in order to build different train-test pipelines. A list of the currently implemented GURLS tasks organized by category, is summarized in Table 1. In order to run the other examples you just have to substitute the code fragment for the task pipeline

 *seq << ...

and for the sequence of instructions for the training process

 *process1 << ...

and testing process

 *process2 << ...

with the desired task pipeline and instruction sequences. In the following we report the fragments of code defining the task sequence and the training and testing instructions for some popular learning pipelines.

Linear classifier, primal case, hold-out cv

tasks pipeline

 *seq << "split:ho" << "paramsel:hoprimal" << "optimizer:rlsprimal";
 *seq << "pred:primal" << "perf:macroavg";

instructions sequences

 *process1 << GURLS::computeNsave << GURLS::computeNsave << GURLS::computeNsave;
 *process1 << GURLS::ignore << GURLS::ignore;
 *process2 << GURLS::load << GURLS::load << GURLS::load;
 *process2 << GURLS::computeNsave << GURLS::computeNsave;

Gaussian kernel classifier, hold-out cv

tasks pipeline

 *seq <<"split:ho"<<"paramsel:siglamho"<<"kernel:rbf"<<"optimizer:rlsdual";
 *seq <<"predkernel:traintest"<<"pred:dual"<<"perf:macroavg";

instructions sequences

 *process1 <<GURLS::computeNsave<<GURLS::computeNsave<<GURLS::computeNsave;
 *process1 <<GURLS::computeNsave<<GURLS::ignore<<GURLS::ignore<<GURLS::ignore;
 *process2 <<GURLS::load<<GURLS::load<<GURLS::load<<GURLS::load;
 *process2 <<GURLS::computeNsave<<GURLS::computeNsave<<GURLS::computeNsave;

Here parameter selection for the Gaussian kernel requires selecting both the RLS regularization parameter and the kernel parameter, and is performed by selecting the task siglamho for the category paramsel. Once the value of the kernel parameter is chosen, the Gaussian kernel is built through the kernel task category with choice rbf.

Customizing the Option Structure

The options structure passed as third input to GURLS.run has a set of default fields and values. Some of these fields can be manually changed as in the following line of code

 opt.addOpt("<FIELD>", <VALUE>);

where <VALUE> belongs to the correct class of options amongst:

  • OptNumber
  • OptString
  • OptFunction

Below we list the most important fields that can be customized

  • nlambda (OptNumber 20): number of values for the regularization parameter.
  • nsigma (OptNumber 25): number of values for the kernel parameter.
  • nholdouts (OptNumber 1): number of data splits to be used for hold-out CV.
  • hoproportion (OptNumber 0.2): proportion between training and validation set in parameter selection.
  • hoperf (OptFunction "macroavg"): objective function to be used for parameter selection.
  • epochs (OptNumber 4): number of passes over the training set for stochastic gradient descent.
  • subsize (OptNumber 50): training set size used for parameter selection when using stochastic gradient descent.
  • singlelambda (OptFunction "mean"): function for obtaining one value for the regularization parameter, given the parameter choice for each class in multiclass classification (for each output in multiple output regression).

bGURLS Usage

The bGURLS package includes all the design patterns described for GURLS, and has been complemented with additional big data and distributed computation capabilities. Big data support is obtained using a data structure called bigarray, which makes it possible to handle data matrices as large as the machine's available hard-drive space rather than its RAM: the entire dataset is stored on disk and only small chunks are loaded into memory when required.
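
The idea of working on small chunks can be illustrated with the following C++ sketch, which accumulates the matrix X'X while holding at most blocksize rows in memory at a time. It is not the bigarray implementation: BlockLoader, gramByBlocks and inMemoryLoader are hypothetical names, and the in-memory loader merely stands in for a reader that would fetch rows from disk.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // row-major

// Hypothetical block reader: returns rows [start, start + count) of the data.
using BlockLoader = std::function<Matrix(std::size_t, std::size_t)>;

// Accumulate X'X (d x d) for an n x d data set, one block of rows at a time.
Matrix gramByBlocks(const BlockLoader& load, std::size_t n, std::size_t d,
                    std::size_t blocksize) {
    assert(blocksize > 0);
    Matrix XtX(d, std::vector<double>(d, 0.0));
    for (std::size_t start = 0; start < n; start += blocksize) {
        // Only this block is resident in memory during the update.
        const Matrix block = load(start, std::min(blocksize, n - start));
        for (const auto& row : block)
            for (std::size_t i = 0; i < d; ++i)
                for (std::size_t j = 0; j < d; ++j)
                    XtX[i][j] += row[i] * row[j];
    }
    return XtX;
}

// In-memory stand-in for a disk-backed reader, for demonstration only.
BlockLoader inMemoryLoader(const Matrix& X) {
    return [X](std::size_t start, std::size_t count) {
        return Matrix(X.begin() + start, X.begin() + start + count);
    };
}
```

The result does not depend on the block size, so the block size can be chosen purely according to the memory available.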

bGURLS relies on a simple interface -- developed ad-hoc and called Gurls Distributed Manager (GDM) -- to distribute matrix-matrix multiplications, thus allowing users to perform the important task of kernel matrix computation on a distributed network of computing nodes. After this step, the subsequent tasks behave as in GURLS.

[Figure: bGURLS design]

The bGURLS Core is identified with the bgurls command, which behaves as gurls. As gurls it accepts exactly four arguments:

  • the bigarray of the input data.
  • the bigarray of the labels vector.
  • An options' structure.
  • A job-id number.
The options' structure is built through the bigdefopt function with default fields and values. Most of the main fields in the options' structure are the same as in GURLS; however, bgurls requires the options' structure to have the additional field files, which must be a structure with fields:
  • Xva_filename: the prefix of the files that constitute the bigarray of the input data used for validation
  • yva_filename: the prefix of the files that constitute the bigarray of the labels vector used for validation
  • pred_filename: the prefix of the files that constitute the bigarray of the predicted labels for the test set
  • XtX_filename: the name of the files where pre-computed matrix X'X is stored
  • Xty_filename: the name of the files where pre-computed matrix X'y is stored
  • XvatXva_filename: the name of the files where pre-computed matrix Xva'Xva is stored
  • Xvatyva_filename: the name of the files where pre-computed matrix Xva'yva is stored

bGURLS example

Let us consider the demo bigdemoA.m in the demo directory to better understand the usage of bGURLS. The demo computes a linear classifier with the regularization parameter chosen via hold-out validation, and then evaluates the prediction accuracy on a test set. The data set used in the demo is the bio data set used in Lauer and Guermeur 2011, which is saved in the demo directory as a .zip file, 'bio_unique.zip', containing two files:

  • 'X.csv': containing the input nxd data matrix, where n is the number of samples (24,942) and d is the number of variables (68)
  • 'Y.csv': containing the input nx1 label vector
Note that the bio data is not properly a big data set, as it could reside in memory; however, it is large enough to make it reasonable to use bGURLS.

In the following we examine the salient part of the demo in detail. First unzip the data file

 unzip('bio_unique.zip','bio_unique')

and set the name of the data files

 filenameX = 'bio_unique/X.csv'; %nxd input data matrix
 filenameY = 'bio_unique/y.csv'; %nx1 or 1xn labels vector

Now set the size of the blocks for the bigarrays (matrices of size blocksizexd must fit into memory):

 blocksize = 1000; 

the fraction of total samples to be used for testing:

 test_hoproportion = .2;

the fraction of training samples to be used for validation:

 va_hoproportion = .2;  

and the directory where all processed data is going to be stored:

 dpath = 'bio_data_processed'; 

Now set the prefix of the files that will constitute the bigarrays

 mkdir(dpath)
 files.Xtrain_filename = fullfile(dpath, 'bigarrays/Xtrain');
 files.ytrain_filename = fullfile(dpath, 'bigarrays/ytrain');
 files.Xtest_filename = fullfile(dpath, 'bigarrays/Xtest');
 files.ytest_filename = fullfile(dpath, 'bigarrays/ytest');
 files.Xva_filename = fullfile(dpath, 'bigarrays/Xva');
 files.yva_filename = fullfile(dpath, 'bigarrays/yva');

and the name of the files where pre-computed matrices will be stored

 files.XtX_filename = fullfile(dpath, 'XtX.mat');
 files.Xty_filename = fullfile(dpath, 'Xty.mat');
 files.XvatXva_filename = fullfile(dpath,'XvatXva.mat');
 files.Xvatyva_filename = fullfile(dpath, 'Xvatyva.mat');

We are now ready to prepare the data for bGURLS. The following command reads the files filenameX and filenameY blockwise, thus avoiding loading a whole file at once, and stores them in the bigarray format after splitting the data into train, validation and test sets

 bigTrainTestPrepare(filenameX, filenameY,files,blocksize,va_hoproportion,test_hoproportion)

Bigarrays are now stored under the file names specified in the structure files. We can now precompute the matrices that will be used repeatedly in the training phase, and store them under the file names specified in the structure files

 bigMatricesBuild(files)

The data set is now prepared for running the learning pipeline with the bgurls command. This phase behaves almost completely as in GURLS. The only differences are that:

  • we do not need to load the data into memory, but simply 'load' the bigarray, that is, load the information necessary to access the data blockwise.
  • we have to specify in the options' structure the path where the already computed matrix multiplications, and bigarrays for validation data are stored.
Let us first define the option structure as in GURLS
 name = fullfile(wpath,'gurls');
 opt = bigdefopt(name);
 opt.seq = {'paramsel:dhoprimal','rls:dprimal','pred:primal','perf:macroavg'};
 opt.process{1} = [2,2,0,0];
 opt.process{2} = [3,3,2,2];

Note that no task is defined for the split category, as data has already been split in the preprocessing phase and bigarrays for validation were built. In the following fragment of code we add to the options' structure the information relative to the already computed matrix multiplications and to the validation bigarrays

 opt.files = files;
 opt.files = rmfield(opt.files,{'Xtrain_filename';'ytrain_filename';'Xtest_filename';'ytest_filename'}); %not used by bgurls
 opt.files.pred_filename = fullfile(dpath, 'bigarrays/pred');

Note that we have also defined where the predicted labels shall be stored as bigarray.

Now we have to 'load' bigarrays for training

 X = bigarray.Obj(files.Xtrain_filename);
 y = bigarray.Obj(files.ytrain_filename);	
 X.Transpose(true);
 y.Transpose(true);

and run bgurls on the training set

 bgurls(X,y,opt,1)

In order to run the testing process, we first have to 'load' bigarrays variables for test data

 X = bigarray.Obj(files.Xtest_filename);
 y = bigarray.Obj(files.ytest_filename);	
 X.Transpose(true);
 y.Transpose(true);

and then we can finally run bgurls on the test set

 bgurls(X,y,opt,2);

Now you should have a mat file named 'gurls.mat' in your path. This file contains all the information about your experiment. If you want to see the mean accuracy, for example, load the file in your workspace and type

 >> mean(opt.perf.acc)

If you are interested in visualizing or printing stats and facts about your experiment, check the documentation about the summarizing functions in the gurls package.

Dealing with other data formats

Two other demos can be found in the 'demo' directory. The three demos differ in the format of the input data, as we tried to provide examples for the most common data formats. The data set used in bigdemoB is again the bio data set, though in a slightly different format, as it is already split into train and test data. The functions bigTrainPrepare and bigTestPrepare take care of preparing the train and test sets separately.

The data set used in bigdemoC is the ImageNet data set, which is automatically downloaded from http://bratwurst.mit.edu/sbow.tar when running the demo. This data set is stored in 1000 .mat files, where the i-th file contains the variable x, a dxn_i input data matrix for the n_i samples of class i. The function bigTrainTestPrepare_manyfiles takes care of preparing the bigarrays for the ImageNet data format. Note that, while the bio data is not properly a big data set, the ImageNet data occupies about 1 GB of RAM and can thus be called a big data set.

In order to run bGURLS on other data formats, one can simply use bigdemoA after having substituted the line

 bigTrainTestPrepare(filenameX, filenameY,files,blocksize,va_hoproportion,test_hoproportion)

with a suitable fragment of code. The remainder of the data preparation, that is the computation and storage of the relevant matrices, can be left unchanged.

bGURLS++

The usage of bGURLS++ is very similar to that of GURLS++, with the following exceptions:

  • The GURLS Core is implemented via the BGURLS class instead of the GURLS one;
  • The first two inputs of BGURLS must be bigarrays (a data structure that makes it possible to handle data matrices as large as the machine’s available hard-drive space rather than its RAM) rather than matrices;
  • The options structure must be of class BGurlsOptionsList rather than GurlsOptionsList;
  • The only allowed "big" task categories for bGURLS++ are bigsplit, bigparamsel, bigoptimizer, bigpred and bigperf.

bGURLS++ Example

Let us consider the demo bigmedmo.cpp in the demo subdirectory to better understand the usage of the bGURLS++ module. The data set used in the demo is the bio data set used in [Lauer], which is saved in the demo directory as a .zip file, ’bio_traintest_csv.zip’, containing four files:

  • ’Xtr.csv’: containing the NtrxD training data matrix, where Ntr is the number of training samples and D is the number of variables;
  • ’Ytr.csv’: containing the Ntrx1 training label vector;
  • ’Xte.csv’: containing the NtexD test data matrix, where Nte is the number of test samples;
  • ’Yte.csv’: containing the Ntex1 test label vector.

Unlike in GURLS++, we chose the HDF5 data format to store matrices, as it makes it easy to read the content of the files by blocks. Let us now examine the salient and distinctive parts of the demo.

The data is loaded as bigarrays (actually only the information relative to the data, not the data itself) with the following fragment of code:

 BigArray<T> Xtr(path(shared_directory / "Xtr.h5").native(), 0, 0);
 Xtr.readCSV(path(input_directory / "Xtr.csv").native());
 BigArray<T> Xte(path(shared_directory / "Xte.h5").native(), 0, 0);
 Xte.readCSV(path(input_directory / "Xte.csv").native());
 BigArray<T> ytr(path(shared_directory / "ytr.h5").native(), 0, 0);
 ytr.readCSV(path(input_directory / "ytr.csv").native());
 BigArray<T> yte(path(shared_directory / "yte.h5").native(), 0, 0);
 yte.readCSV(path(input_directory / "yte.csv").native());

The options’ structure is built with default values via the following line of code:

 BGurlsOptionsList opt("bio_demoB", shared_directory.native(), true);

The pipeline is built as in GURLS++, though with the bGURLS++ task categories

 OptTaskSequence *seq = new OptTaskSequence();
 *seq<<"bigsplit:ho"<<"bigparamsel:hoprimal"<<"bigoptimizer:rlsprimal" << "bigpred:primal" << "bigperf:macroavg";
 opt.addOpt("seq",seq);

The two sequences of actions identifying the training and test processes are defined exactly as in GURLS++, whereas the processes are run through the BGURLS method as in the following:

 BGURLS G;
 G.run(Xtr,ytr,opt,jobid1);
 G.run(Xte,yte,opt,jobid2);

Results visualization

You can visualize the results of one or more experiments (i.e. GURLS pipelines) using the summary_* functions. Below we show the usage of this set of functions for two sets of experiments, each run 5 times. First we have to run the experiments. nRuns contains the number of runs for each experiment, and filestr contains the names of the experiments.

 nRuns = {5,5};
 filestr = {'hoprimal'; 'hodual'};
 for i = 1:nRuns{1}
   opt = defopt([filestr{1} '_' num2str(i)]);
   opt.seq = {'paramsel:loocvprimal','rls:primal','pred:primal','perf:macroavg','perf:precrec'};
   opt.process{1} = [2,2,0,0,0];
   opt.process{2} = [3,3,2,2,2];
   gurls(Xtr, ytr, opt,1)
   gurls(Xte, yte, opt,2)
 end
 for i = 1:nRuns{2}
   opt = defopt([filestr{2} '_' num2str(i)]);
   opt.seq = {'kernel:linear','paramsel:loocvdual','rls:dual','pred:dual','perf:macroavg','perf:precrec'};
   opt.process{1} = [2,2,2,0,0,0];
   opt.process{2} = [3,3,3,2,2,2];
   gurls(Xtr, ytr, opt,1)
   gurls(Xte, yte, opt,2)
 end

In order to visualize the results we have to specify in fields which fields of opt are to be displayed (as many plots as the elements of fields will be generated)

 >> fields = {'perf.ap','perf.acc'};

we can generate "per-class" plots with the following command:

 >> summary_plot(filestr,fields,nRuns) 

and “global” plots with:

 >> summary_overall_plot(filestr,fields,nRuns) 

The following command generates a “global” table:

 >> summary_table(filestr, fields, nRuns) 

The following command plots the time taken by each step of the pipeline, for performance reference:

 >> plot_times(filestr,nRuns)