2 Getting Started
Download the source from the Git repository http://github.com/CBCL/GURLS
GURLS is a pure MATLAB library with no dependencies on external libraries other than the Statistics Toolbox. Once you have downloaded the compressed archive, unpack it into the desired `PACKAGE_ROOT`. Then open MATLAB and execute:
>> run('PACKAGE_ROOT/gurls/utils/gurls_install.m');
This will add all the important directories to your path. Run savepath if you want the installation to be permanent.
bGURLS is a pure MATLAB library with no dependencies on external libraries other than the GURLS library and the Statistics Toolbox. Once you have downloaded the compressed archive, unpack it into the desired `PACKAGE_ROOT`. Then open MATLAB and execute:
>> run('PACKAGE_ROOT/bgurls/utils/bgurls_install.m');
This will add all the important directories to your path. Run savepath if you want the installation to be permanent.
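For unattended setups, the two install scripts above can also be run from a shell. The sketch below only prints the MATLAB command by default. It assumes that `matlab` is on your PATH and that `PACKAGE_ROOT` is an environment variable pointing at the unpacked archive (the default shown is a placeholder). `DRY_RUN` and `matlab_install_cmd` are hypothetical names introduced for this example; `-nodisplay`, `-nosplash` and `-r` are standard MATLAB startup options.

```shell
#!/bin/sh
# Sketch: run the GURLS and bGURLS install scripts non-interactively.
# The PACKAGE_ROOT default is a placeholder, not a path from the GURLS docs.
PACKAGE_ROOT="${PACKAGE_ROOT:-$HOME/GURLS}"

# Build the MATLAB statement list: install both packages, save the path, quit.
matlab_install_cmd() {
    printf "run('%s/gurls/utils/gurls_install.m'); " "$1"
    printf "run('%s/bgurls/utils/bgurls_install.m'); " "$1"
    printf "savepath; exit\n"
}

# By default only print the command line; set DRY_RUN=0 to launch MATLAB.
if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "matlab -nodisplay -nosplash -r \"$(matlab_install_cmd "$PACKAGE_ROOT")\""
else
    matlab -nodisplay -nosplash -r "$(matlab_install_cmd "$PACKAGE_ROOT")"
fi
```

If you only use GURLS, drop the second `printf` line so that the bGURLS installer is not invoked.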
You can download the precompiled binaries here and skip to the next section; however, such a version may not be optimized for your machine.
Conversely, if you want a version of GURLS++ and bGURLS++ optimized for your machine, follow the steps described in the rest of this section.
GURLS++ and bGURLS++ are part of the same project, called gurls. Users may choose which libraries are built during project configuration (see the section Configuring GURLS++/bGURLS++ for details). In the following we assume that the directory containing the "gurls++" and "bgurls++" directories is named GURLSROOT.
GURLS++ depends on several external libraries:
- A Blas/Lapack implementation. Currently we support:
- AMD’s ACML (only for 64 bits);
- ATLAS;
- Intel’s MKL;
- Netlib’s reference implementation: http://www.netlib.org/blas and http://www.netlib.org/lapack/ ;
- OpenBLAS (Currently only under Linux);
- The following Boost libraries (v1.46.0 or higher): serialization, date_time, filesystem, unit_test_framework, system, signals.
In addition to the GURLS++ dependencies, bGURLS++ also depends on:
- An MPI implementation. bGURLS++ has been successfully tested with MPICH http://www.mpich.org/;
- Zlib http://www.zlib.net/;
- LibHDF5 v1.8.9 http://www.hdfgroup.org/HDF5/ compiled with parallel support and zlib enabled (LibHDF5 v1.8.10 or higher is known not to work properly).
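The version constraints listed above (Boost v1.46.0 or higher, LibHDF5 below v1.8.10) can be checked mechanically. The following is a minimal sketch in POSIX shell. `version_ge`, `check_boost` and `check_hdf5` are hypothetical helpers, not part of GURLS, and obtaining the installed version strings is left to you.

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted version A >= B (numeric fields only).
version_ge() {
    a=$1 b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading field of each version; missing fields count as 0.
        fa=${a%%.*}; fb=${b%%.*}
        fa=${fa:-0}; fb=${fb:-0}
        [ "$fa" -gt "$fb" ] && return 0
        [ "$fa" -lt "$fb" ] && return 1
        # Drop the field just compared (and its dot, if any).
        case $a in *.*) a=${a#*.} ;; *) a= ;; esac
        case $b in *.*) b=${b#*.} ;; *) b= ;; esac
    done
    return 0
}

check_boost() { version_ge "$1" 1.46.0; }      # Boost: v1.46.0 or higher
check_hdf5()  { ! version_ge "$1" 1.8.10; }    # LibHDF5: must be below v1.8.10

check_boost 1.48.0 && echo "Boost OK"
check_hdf5 1.8.11 || echo "LibHDF5 too new"
```

The comparison is done field by field on integers, so `1.8.10` correctly ranks above `1.8.9`, which a plain string comparison would get wrong.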
The GURLS++/bGURLS++ CMake configuration supports automatic downloading and building of all dependencies when the GURLS_USE_EXTERNALS variable is set to ON (see the section Configuring GURLS++/bGURLS++ below for details). Due to licence restrictions, on Windows this "Superbuild" system cannot automatically install a Blas/Lapack implementation, so users must install the Blas/Lapack libraries manually.
Below we describe how to build and install GURLS++ on Ubuntu (tested on Ubuntu 12.04). For other distributions, the same packages must be installed with the distribution-specific method.
1. Install the cmake build system (www.cmake.org/)
$ sudo apt-get install cmake cmake-curses-gui
2. To link against some Blas and Lapack implementations you may need a Fortran compiler, e.g. gfortran:
$ sudo apt-get install gfortran
3. Create a build directory (e.g. "build") for GURLS++
$ cd $GURLSROOT
$ mkdir build
4. Run cmake in the build directory
$ cd build
$ ccmake ..
The last command will show the CMake interface, which must be used to set the values of some variables used for building and installing GURLS++. See the section Configuring GURLS++/bGURLS++ below for more information on these variables and how to set them to appropriate values.
5. Start building
$ make
6. Install the library(ies) to the path defined at configuration time
$ make install
The command will also install, to the same path, all the dependencies that the user chose to build automatically.
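The Ubuntu steps above can be collected into one script. The following is a sketch under stated assumptions: `GURLSROOT` is taken from the environment (the default is a placeholder), and `RUN` and `build_gurlspp` are hypothetical names introduced here. By default each command is only printed, so the script is safe to try; `ccmake` remains interactive when the commands are actually executed.

```shell
#!/bin/sh
# Sketch of the Ubuntu build steps as one script.
# Commands are printed by default; run with RUN= (empty) to execute them.
GURLSROOT="${GURLSROOT:-$HOME/gurls}"   # placeholder default, adjust to your checkout
RUN="${RUN-echo}"

build_gurlspp() {
    # Each step runs only if the previous one succeeded.
    $RUN sudo apt-get install cmake cmake-curses-gui gfortran &&  # steps 1-2
    $RUN mkdir -p "$GURLSROOT/build" &&                           # step 3
    $RUN cd "$GURLSROOT/build" &&                                 # step 4
    $RUN ccmake .. &&                                             # interactive configuration
    $RUN make &&                                                  # step 5
    $RUN make install                                             # step 6
}

build_gurlspp
```

Depending on the install path chosen at configuration time, the final `make install` may need elevated privileges.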
Below we describe how to build and install GURLS++ on Windows with Visual Studio (tested with VS Express 2010 and VS Express 2008).
- Install the CMake build system downloading the installer from http://cmake.org/cmake/resources/software.html.
- Install your favourite Blas/Lapack implementation. Under Windows, AMD’s ACML is probably the easiest choice, since the library binaries are provided for free; however, it supports only 64-bit compilers.
- Create a build directory (e.g. $GURLSROOT/build).
- Run the CMake GUI. Set the source directory to $GURLSROOT and the build directory to the directory created in the previous step. After pressing the configure button, choose the generator for the project (e.g. Visual Studio 10). On Windows 7 you may encounter the error message "error in configuration process, project files may be invalid"; if so, check that you have write permission for the path specified in the variable `CMAKE_INSTALL_PREFIX`. If you do not, change the variable to a folder you can write to and press 'configure' again. Next, set the values of the variables used for building and installing the libraries according to your preferences. See the section Configuring GURLS++/bGURLS++ below for more information on these variables and how to set them to appropriate values. After configuring the build options, press the generate button to create the solution file.
- Open the generated solution in Visual Studio and build it.
- Install GURLS++ by explicitly building the install project included in the solution (it is not built automatically when building the solution).
The configuration step is carried out using CMake. In the following we describe the configuration process using the GUI of CMake, e.g. under Windows or Mac. A similar process shall be followed when using the command-line interface.
- Press 'configure', and CMake will try to determine the correct values for all variables. After the first configuration a list of variables is displayed. The following variables should be checked:
  - `CMAKE_INSTALL_PREFIX`: The path where the library will be installed;
  - `GURLS_BUILD_GURLSPP` (ON): Build GURLS++. If set to ON, CMake also evaluates the variables:
    - `GURLSPP_BUILD_DEMO` (ON): Enable the building of the GURLS++ demo programs;
    - `GURLSPP_BUILD_DOC` (OFF): Enable the building of the GURLS++ documentation using doxygen;
  - `GURLS_BUILD_BGURLSPP` (ON): Build bGURLS++. If set to ON, CMake also evaluates the variables:
    - `BGURLSPP_BUILD_DEMO` (OFF): Enable the building of the bGURLS++ demo programs;
    - `BGURLSPP_BUILD_DOC` (OFF): Enable the building of the bGURLS++ documentation using doxygen;
  - `GURLS_USE_BINARY_ARCHIVES` (ON): If set to ON, all data structures are stored in binary (rather than text) files, saving storage space and time;
  - `GURLS_USE_EXTERNALS` (ON): Enable automatic building of external dependencies.
    - If set to ON, CMake also evaluates the variables:
      - `GURLS_USE_EXTERNAL_BLAS_LAPACK` (ON): Enable automatic building of Blas and Lapack, using OpenBLAS (Linux only);
      - `GURLS_USE_EXTERNAL_BOOST` (ON): Enable automatic building of Boost. If set to OFF, you typically need to press 'advanced', and a set of variables related to the Boost library will appear. You have to specify only the variable `BOOST_INCLUDE_DIR`;
      - `GURLS_USE_EXTERNAL_HDF5` (OFF): Enable automatic building of libHDF5 and its dependencies (MPICH and zlib). Used only if `GURLS_BUILD_BGURLSPP` is set to ON.
      - For each variable which is set to OFF, you must specify the path to the corresponding library.
    - If `GURLS_USE_EXTERNALS` is set to OFF, you have to manually specify the path to all of the above libraries.
- In the main screen you may change a number of variables. Most of them can be left unchanged, but some must be set to appropriate values. The following variables should be checked:
  - `BLAS_LAPACK_IMPLEMENTATION`: Allows the user to specify an implementation of the Blas/Lapack routines. Available choices are: `ACML`, `ATLAS`, `MKL`, `NETLIB`, `OPENBLAS` (under Linux). Depending on your choice, CMake will try to find the libraries in standard locations on the system. Normally this works; however, if the libraries have been installed in a non-standard directory, you may have to specify their location manually.
- When the settings are correct, the option 'generate' will appear. Press 'generate'. CMake will generate the files and exit.
After the build files (e.g. the Makefile under Linux) have been generated, you can proceed as explained above.
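When using the command-line interface instead of the GUI, the same variables can be passed to `cmake` with its standard `-D<NAME>=<VALUE>` option. The sketch below only prints the resulting command. The variable names come from the list above, while the values, the `PREFIX` default and the helper `cmake_args` are assumptions chosen for illustration (here: build GURLS++ only, with automatically built dependencies).

```shell
#!/bin/sh
# Sketch: non-interactive equivalent of the GUI configuration.
# The install prefix is a placeholder; pick a path you can write to.
PREFIX="${PREFIX:-$HOME/gurls-install}"

# Print one -D<NAME>=<VALUE> argument per line for the options to set.
cmake_args() {
    cat <<EOF
-DCMAKE_INSTALL_PREFIX=$1
-DGURLS_BUILD_GURLSPP=ON
-DGURLS_BUILD_BGURLSPP=OFF
-DGURLS_USE_BINARY_ARCHIVES=ON
-DGURLS_USE_EXTERNALS=ON
-DGURLS_USE_EXTERNAL_BLAS_LAPACK=ON
-DGURLS_USE_EXTERNAL_BOOST=ON
EOF
}

# Show the command; from inside the build directory, drop the echo to run it.
# The unquoted $(...) is split into words on purpose (breaks if PREFIX has spaces).
echo cmake $(cmake_args "$PREFIX") ..
```

Variables that are not passed keep the defaults listed above, so only the options you want to change need to appear in `cmake_args`.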
Have a look at gurls_helloworld.m in the 'demo' subdirectory and run it. Below we describe the demo in detail.
We first have to load the training data
>> load('data/quickanddirty_traindata');
and train the classifier
>> [opt] = gurls_train(Xtr,ytr);
now we load the test data
>> load('data/quickanddirty_testdata');
then we predict the labels for the test set and assess prediction accuracy
>> [yhat,acc] = gurls_test(Xte,yte,opt);
Have a look at helloworld.cpp in the 'demo' subdirectory and run it. Below we describe the salient parts of the demo in detail. First we have to load the training data
Xtr = readFile<T>("../data/Xtr.txt");
ytr = readFile<T>("../data/ytr_onecolumn.txt");
and the test data
Xte = readFile<T>("../data/Xte.txt");
yte = readFile<T>("../data/yte_onecolumn.txt");
then we train the classifier
GurlsOptionsList* opt = gurls_train(*Xtr, *ytr);
finally we predict the labels for the test set and assess prediction accuracy
gurls_test(*Xte, *yte, *opt);
GURLS is distributed under the BSD license. This means that it is free for both academic and commercial use. If you are going to use GURLS in your scientific work, please cite the toolbox, the main website and the paper:
- Tacchetti, A., P. Mallapragada, M. Santoro, and L. Rosasco; GURLS: a Toolbox for Large Scale Multiclass Learning, presented at Workshop: "Big Learning: Algorithms, Systems, and Tools for Learning at Scale" at NIPS 2011, December 16-17 2011, Sierra Nevada, Spain.