-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathREADME
91 lines (82 loc) · 5.35 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
=========================== Project info ============================
This is a proof-of-concept project to use it by ZeroVM messaging;
This project is showing how can be implemented Distributed Sort using specific client-server technology;
To do nodes networking client using special communication files, which can be read only and write only.
Applications:
Server : Fuse server + Zeromq I/O engine. It's create filesystem in memory, currently it has static filesystem
structure that loading at startup from sqlite database file 'server_dsort.db';
Client : User application, required own running server, one server for single user app.
It can communicate with another clients by using files in same manner as communication sockets with read or write ability;
Filesystem : It is a layer between user application and server provides networking between client apps;
It should be mounted by server before client can use it and access networking;
How it works :
Every user application uses filesystem in memory hosted by server app; Files can be opened in write or read mode,
and every file is associated with networking socket; If client attempt 'write' data to file hosted by server FS,
then handled data would be send by ZeroMQ networking socket using transport and endpoint configured in 'server_dsort.db';
To get more information about sorting job see "Distributed-Sort" on github. For example it's an actual diagram:
https://github.com/YaroslavLitvinov/Distributed-Sort/blob/master/zeromq-diagram.pdf
=========================== Required Packages =========================
Ubuntu Linux 11.x 64bit
Required packages: ZeroMQ 2.1.x, Fuse 2.9.0;
Recommended package : sqlite-autoconf-3071100, Executive file sqlite3 can be useful to create and insert data into databases;
sqlite-amalgamation already is part of project and located in sqlite dir;
in terminal >
echo Install sqlite-autoconf
wget http://www.sqlite.org/sqlite-autoconf-3071200.tar.gz
tar xvfz sqlite-autoconf-3071200.tar.gz
cd sqlite-autoconf-3071200/
make
sudo make install
echo to be sure about sqlite3
sudo ldconfig
echo install packages required by zeromq
sudo apt-get install uuid-dev
echo Install zeromq
wget http://download.zeromq.org/zeromq-2.2.0.tar.gz
tar xvfz zeromq-2.2.0.tar.gz
cd zeromq-2.2.0
./configure
make
sudo make install
echo Install fuse
sudo apt-get install libfuse-dev
============================= Run Distributed Sort =========================
Currently by default you can build and run sorting with 5 source and 5 destination nodes;
Explicitly, you are need to set path to sqlite-autoconf in db_creator.sh
You can run Distributed sort folowing this, in terminal >
make
echo create db files
./db_creator.sh
./client-server-distributed-sort-5.sh /tmp/1/5
echo To see results:
cat log/clientmanager1.log
It is recommended to set as parameter an different path at every exec of sort script, because sometimes FUSE
for some reason does not unmount the folder, and then the sort can not properly run;
============================ Project files ============================
zf-server - Server that hosts filesystem in memory, and contains ZeroMQ networking engine;
Single instance of server is required by every client app;
test_node - Client that produces simple communication tests: test1PUSH-PULL.sh, test2REQ-REP.sh test scripts uses this node;
To perform communication test at least two test_nodes should be launched;
main_node - Client of distributed sort that acts as single manager node, and one instance is required;
src_node - Client of distributed sort that acts as source node, Number of src nodes can be configured
by SRC_NODES_COUNT from defines.h, at least 3 nodes should be set;
dst_nodes - Same as src_node, but act as destination node with predefined instances number;
client-server-distributed-sort-5.sh - Launch distributed sort; It creates required server and client instances,
and launch every in dedicated terminal, that displays some progress;
sql/db_client_dsort.sql Client db contains filelist on server FS that can be used by client-server for communacations;
ql/db_server_dsort.sql Server db contains filelist on server FS and related transport,endpoint data for networking;
db_creator.sh it use sqlite3 app, and .sql files as source from target db;
============================= Logging ==================================
Logs are located in log/ dir.
Every app servers and clients are logging into own log file and log filename contains text 'server' or 'client' and node id number;
In logfile.h you can set log detailing by changing LOG_LEVEL defined value;
============================ Data Flow =============================
1.Every of 5 source nodes at the launch simply generates 1000000 4bytes integers to use it as source for distributed sorting.
2.Every source node locally sorting this data;
3.At the next source node is creating Histogram of locally sorted data, where starting from 0 item every 1000 item is added into histogram;
4.All histograms from 5 source nodes are sending to manager node to analize it;
5.Manager node analizing received histograms and during of analize requests detailed histograms from every source nodes.
In fact detailed histograms is simple raw sorting data near to future bounds of destination sorted arrays;
6.As result of processed histograms we have start and end bounds of future resulted source arrays;
Every source array is divided into N parts, where N - is source nodes count;
...