Husky is a distributed computing system designed to handle mixed jobs of coarse-grained transformations, graph computing and machine learning. The core of Husky is written in C++ to leverage the performance of a native runtime. For machine learning, Husky supports relaxed consistency levels and asynchronous computing in order to achieve higher network and CPU throughput.
For more details about Husky, please check our Wiki.
Husky has the following minimal dependencies:
- CMake (Version >= 3.0.2)
- ZeroMQ (including both libzmq and cppzmq)
- Boost (Version >= 1.58)
- A working C++ compiler (clang/gcc Version >= 4.9/icc/MSVC)
- TCMalloc (In gperftools)
Some optional dependencies:
- libhdfs3 C/C++ HDFS Client
- MongoDB C++ Driver (Version legacy 1.1.2)
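Before building, it can help to confirm that the installed tools meet the minimum versions listed above. The following is a minimal sketch, not part of Husky: the helper names version_ge and check_cmake are hypothetical, and the comparison assumes a `sort` that supports the `-V` (version sort) option, as GNU coreutils does.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not shipped with Husky) for checking tool versions.

# Succeed (exit status 0) if version $1 is greater than or equal to version $2.
# Assumes `sort -V` (version sort) is available, e.g. from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Report whether cmake is installed and at least version 3.0.2,
# the minimum listed in the dependency list above.
check_cmake() {
    if ! command -v cmake >/dev/null 2>&1; then
        echo "cmake: not found"
        return 1
    fi
    local found
    # The first line of `cmake --version` looks like "cmake version X.Y.Z".
    found="$(cmake --version | head -n 1 | awk '{print $3}')"
    if version_ge "$found" "3.0.2"; then
        echo "cmake $found: OK"
    else
        echo "cmake $found: too old (need >= 3.0.2)"
        return 1
    fi
}
```

After sourcing the snippet, running check_cmake prints a one-line status. The same version_ge helper could be reused for the Boost or compiler versions, provided you extract the version string yourself.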
Download the Husky source code:
git clone https://github.com/husky-team/husky.git
We assume the root directory of Husky is $HUSKY_ROOT. Go to $HUSKY_ROOT and do an out-of-source build using CMake:
cd $HUSKY_ROOT
mkdir release
cd release
cmake -DCMAKE_BUILD_TYPE=Release ..
make help # List all build targets
make -j8 Master # Build the Husky master
make $ApplicationName # Build the Husky application
Husky is designed to run on any platform. Configuration can be stored in a configuration file (INI format) or passed as command-line arguments when running Husky. An example configuration file looks like the following:
# Required
master_host=xxx.xxx.xxx.xxx
master_port=yyyyy
comm_port=yyyyy
# Optional
hdfs_namenode=xxx.xxx.xxx.xxx
hdfs_namenode_port=yyyyy
# For Master
serve=1
# Section for worker information
[worker]
info=master:3
For a single-machine environment, use the hostname of the machine as both the master and the (only) worker.
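The single-machine setup described above can be sketched as a small shell helper that writes such a configuration file. This is only an illustration under stated assumptions: the function name write_single_machine_conf is hypothetical, the port numbers are arbitrary placeholders rather than Husky defaults, and the value after the colon in the worker entry is simply carried over from the example above (3 unless overridden).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with Husky): write a single-machine
# configuration to the file named by $1.
# $2 is the value placed after the colon in the worker entry; it defaults
# to 3, the value used in the example configuration above.
write_single_machine_conf() {
    local conf_path="$1"
    local worker_suffix="${2:-3}"
    local host
    # Use this machine's hostname for both the master and the only worker.
    host="$(uname -n)"
    # The port numbers below are placeholders chosen for illustration only.
    cat > "$conf_path" <<EOF
# Required
master_host=${host}
master_port=10086
comm_port=12306
# For Master
serve=1
# Section for worker information
[worker]
info=${host}:${worker_suffix}
EOF
}
```

For example, `write_single_machine_conf ./single.conf` produces a file that can then be passed with `--conf ./single.conf`. Pick ports that are actually free on your machine before using it.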
For a distributed environment, first copy $HUSKY_ROOT/exec.sh and modify it according to your actual configuration. exec.sh depends on pssh.
Run ./Master --help for help. Check the examples in the examples directory.
First make sure that the master is running. Use the following to start the master:
./Master --conf /path/to/your/conf
In a distributed environment, use the following to execute workers on all machines:
./exec.sh <executable> /path/to/your/conf
In a single-machine environment, use the following:
./<executable> --conf /path/to/your/conf
Husky includes unit tests (based on gtest 1.7.0) for the core components. Build and run them with:
make HuskyUnitTest
./HuskyUnitTest
Do the following to generate the API documentation:
doxygen doxygen.config
Then go to html/ for the HTML documentation and latex/ for the LaTeX documentation.
Copyright 2016 Husky Team
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.