Huahin Manager
Huahin Manager is a simple management system for Hadoop MapReduce jobs.
It can list MapReduce jobs, report job status, kill jobs, and manage a job queue.
Huahin Manager is distributed under Apache License 2.0.
-----------------------------------------------------------------------------
Documentation
http://huahinframework.org/huahin-manager/
-----------------------------------------------------------------------------
Requirements
* Java 6+
-----------------------------------------------------------------------------
Install Huahin Manager
~ $ tar xzf huahin-manager-x.x.x.tar.gz
-----------------------------------------------------------------------------
Configure Huahin Manager
Edit the huahin-manager-x.x.x/conf/huahinManager.properties file: set the mapred.job.tracker property to the JobTracker URI,
the fs.default.name property to the NameNode URI, and the job.queue.limit property to the job queue limit.
If job.queue.limit is 0, the queue is not managed.
Example for 0.1.X:
mapred.job.tracker=jobtracker:9001
fs.default.name=hdfs://namenode:9000
# optional
hiveserver=hiveserver:10000
job.queue.limit=2
Example for 0.2.X:
yarn.resourcemanager.address=resourcemanager:8032
mapreduce.jobhistory.address=jobhistory:10020
fs.defaultFS=hdfs://namenode:8020
yarn.resourcemanager.webapp.address=resourcemanager:8088
yarn.nodemanager.webapp.address=nodemanager:8042
yarn.web-proxy.address=web-proxy:8100
mapreduce.jobhistory.webapp.address=jobhistory:19888
# optional: if not set, the default is used.
yarn.application.classpath=$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/*,...
# optional
hiveserver=hiveserver:10000
# optional
hiveserver.version=2
job.queue.limit=2
Example for 0.2.X-mr1:
mapreduce.jobtracker.address=jobtracker:8021
fs.default.name=hdfs://namenode:9000
# optional
hiveserver=hiveserver:10000
# optional
hiveserver.version=2
job.queue.limit=2
To change the port Huahin Manager listens on, edit the huahin-manager-x.x.x/conf/port file.
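The queue rule above (a job.queue.limit of 0 means the queue is not managed) can be checked from a script before the manager is started. The following is a minimal sketch, not part of Huahin Manager: the helper names get_prop and queue_is_managed are hypothetical, it assumes plain key=value lines with comments on their own lines, and it treats a missing job.queue.limit as "not managed", which the documentation does not specify.

```shell
# Hypothetical helpers, not shipped with Huahin Manager.

# get_prop FILE KEY: print the value of the last "KEY=value" line in FILE.
get_prop() {
    # Skip comment lines, match the exact key, keep everything after the first "=".
    awk -F= -v key="$2" '
        /^[ \t]*#/ { next }
        $1 == key  { sub(/^[^=]*=/, ""); value = $0 }
        END        { print value }
    ' "$1"
}

# queue_is_managed FILE: succeed when job.queue.limit is set to something other than 0.
# Assumption: a missing property is treated as "not managed".
queue_is_managed() {
    limit=$(get_prop "$1" job.queue.limit)
    [ -n "$limit" ] && [ "$limit" != "0" ]
}
```

For example, queue_is_managed huahin-manager-x.x.x/conf/huahinManager.properties && echo "queue enabled".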
-----------------------------------------------------------------------------
Start/Stop Huahin Manager
To start or stop Huahin Manager, use its bin/manager script. For example:
$ bin/manager start
-----------------------------------------------------------------------------
Test that Huahin Manager is working
~ $ curl -X GET "http://<HOSTNAME>:9010/job/list"
[
{
"jobid": "job_201205111223_0001",
"mapComplete": "100.0%",
"name": "JOB_EBCF7626A41F34B4C7276DB2B152336F",
"priority": "NORMAL",
"reduceComplete": "100.0%",
"schedulingInfo": "NA",
"startTime": "Fri May 11 12:25:18 JST 2012",
"state": "SUCCEEDED",
"user": "huahin"
}
]
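The same check can be wrapped in a small script for monitoring. This is only a sketch under assumptions: the function names and the HUAHIN_HOST and HUAHIN_PORT environment variables are hypothetical, and 9010 is simply the port used in the examples of this README.

```shell
# Hypothetical helpers; HUAHIN_HOST and HUAHIN_PORT are assumed variable names.

# huahin_url PATH: build a request URL, defaulting to localhost and port 9010.
huahin_url() {
    printf 'http://%s:%s%s\n' "${HUAHIN_HOST:-localhost}" "${HUAHIN_PORT:-9010}" "$1"
}

# huahin_is_up: succeed only if /job/list answers with a successful HTTP
# status within 5 seconds. The response body is discarded.
huahin_is_up() {
    curl --silent --fail --max-time 5 --output /dev/null "$(huahin_url /job/list)"
}
```

For example, huahin_is_up || echo "Huahin Manager is not reachable".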
-----------------------------------------------------------------------------
Huahin Manager REST Job APIs
Get the list of all jobs.
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/job/list"
Get the list of failed jobs.
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/job/list/failed"
Get the list of killed jobs.
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/job/list/killed"
Get the list of prep jobs.
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/job/list/prep"
Get the list of running jobs.
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/job/list/running"
Get the list of succeeded jobs.
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/job/list/succeeded"
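The listing endpoints above differ only in their last path segment, so a script can select one by state. A minimal sketch follows; job_list_path is a hypothetical helper name, and the accepted states are only the ones shown in the examples above.

```shell
# Hypothetical helper: map a job state to the listing path used in the examples above.
# With no argument (or "all") it returns the path that lists every job.
job_list_path() {
    case "$1" in
        ""|all)
            printf '/job/list\n' ;;
        failed|killed|prep|running|succeeded)
            printf '/job/list/%s\n' "$1" ;;
        *)
            # Reject anything this README does not document.
            printf 'unknown job state: %s\n' "$1" >&2
            return 1 ;;
    esac
}
```

For example, curl -X GET "http://<HOSTNAME>:9010$(job_list_path running)".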
Get job status.
<JOBID> specifies the job ID.
~ $ curl -X GET "http://<HOSTNAME>:9010/job/status/<JOBID>"
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/job/status/job_201205111223_0001"
Get job detail.
<JOBID> specifies the job ID.
~ $ curl -X GET "http://<HOSTNAME>:9010/job/detail/<JOBID>"
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/job/detail/job_201205111223_0001"
Register a job.
JAR=@<JAR_FILE> specifies the JAR file to run.
ARGUMENTS is a JSON object: <CLASS> specifies the class to run, and the arguments array (<ARGS>) specifies the run arguments.
~ $ curl -X POST "http://<HOSTNAME>:9010/job/register" -F JAR=@<JAR_FILE> -F ARGUMENTS='{"class":"<CLASS>","arguments":["<ARGS>","<ARGS>"]}'
For example:
~ $ curl -X POST "http://<HOSTNAME>:9010/job/register" -F JAR=@<JAR_FILE> -F ARGUMENTS='{"class":"examples.WordCount","arguments":["/user/huahin/input","/user/huahin/output"]}'
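Writing the ARGUMENTS JSON by hand is error-prone once class names or paths come from variables. The sketch below builds the JSON from positional arguments. The function names are hypothetical, and the escaping covers only backslashes and double quotes (not newlines or other control characters), so treat it as an illustration rather than a complete JSON encoder.

```shell
# Hypothetical helpers for building the ARGUMENTS value of /job/register.

# json_escape STRING: escape backslashes first, then double quotes.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_job_arguments CLASS [ARG...]: print {"class":"CLASS","arguments":["ARG",...]}
build_job_arguments() {
    class=$1
    shift
    out="{\"class\":\"$(json_escape "$class")\",\"arguments\":["
    sep=""
    for a in "$@"; do
        out="$out$sep\"$(json_escape "$a")\""
        sep=","
    done
    printf '%s]}\n' "$out"
}
```

For example, -F "ARGUMENTS=$(build_job_arguments examples.WordCount /user/huahin/input /user/huahin/output)" produces the same JSON as the example above.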
Register a Hive job.
Because registered jobs are executed from the queue, the result must be written to a table or to HDFS.
ARGUMENTS is a JSON object. <script> specifies the Hive query.
~ $ curl -X POST "http://<HOSTNAME>:9010/job/hive/register" -F ARGUMENTS='{"script":"<script>"}'
For example:
~ $ curl -X POST "http://<HOSTNAME>:9010/job/hive/register" \
-F ARGUMENTS='{"script":"insert overwrite directory '\''/tmp/out'\'' select word, count(word) as cnt from words group by word"}'
Register a Pig job.
Because registered jobs are executed from the queue, the result must be written to a table or to HDFS.
ARGUMENTS is a JSON object. <script> specifies the Pig Latin script.
~ $ curl -X POST "http://<HOSTNAME>:9010/job/pig/register" -F ARGUMENTS='{"script":"<script>"}'
For example:
~ $ curl -X POST "http://<HOSTNAME>:9010/job/pig/register" \
-F ARGUMENTS='{"script":"a = load '\''/user/huahin/input'\'' as (text:chararray);b = foreach a generate flatten(TOKENIZE(text)) as word;c = group b by word;d = foreach c generate group as word, COUNT(b) as count;store d into '\''/tmp/out'\'';"}'
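The '\'' sequences in the Hive and Pig examples are needed only because the JSON sits inside a single-quoted shell string. One way to avoid them, sketched here under assumptions, is to keep the payload in a quoted heredoc. The function names are hypothetical. The sketch also uses curl's --form-string option instead of -F: it sends the value literally, whereas some curl versions may read a ";" inside a -F value as the start of a field option such as ;type=.

```shell
# Hypothetical sketch: keep the Pig Latin readable inside a quoted heredoc.
# <<'EOF' disables shell expansion, so the single quotes need no '\'' escaping.
pig_payload() {
    cat <<'EOF'
{"script":"a = load '/user/huahin/input' as (text:chararray);b = foreach a generate flatten(TOKENIZE(text)) as word;c = group b by word;d = foreach c generate group as word, COUNT(b) as count;store d into '/tmp/out';"}
EOF
}

# register_pig_job HOST: post the payload to the register endpoint shown above.
# --form-string passes the value through without interpreting ";", "@" or "<".
register_pig_job() {
    curl -X POST "http://$1:9010/job/pig/register" \
        --form-string "ARGUMENTS=$(pig_payload)"
}
```

The same pattern applies to the Hive register call by changing the payload and the path.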
Kill a job by ID.
<JOBID> specifies the job ID.
~ $ curl -X DELETE "http://<HOSTNAME>:9010/job/kill/id/<JOBID>"
For example:
~ $ curl -X DELETE "http://<HOSTNAME>:9010/job/kill/id/job_201205111223_0001"
Kill a job by name.
<JOBNAME> specifies the job name.
~ $ curl -X DELETE "http://<HOSTNAME>:9010/job/kill/name/<JOBNAME>"
For example:
~ $ curl -X DELETE "http://<HOSTNAME>:9010/job/kill/name/WORD_COUNT_JOB"
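Because a kill cannot be undone, a script may want to check the ID before sending the DELETE request. The sketch below does that. The helper names are hypothetical, and the pattern job_<digits>_<digits> is only inferred from the example ID job_201205111223_0001, so it may not match every ID a cluster produces.

```shell
# Hypothetical guard around the kill-by-ID call shown above.

# is_job_id ID: succeed when ID looks like job_<digits>_<digits>.
# The pattern is inferred from the README example, not from a specification.
is_job_id() {
    printf '%s\n' "$1" | grep -Eq '^job_[0-9]+_[0-9]+$'
}

# kill_job HOST JOBID: refuse malformed IDs, otherwise send the DELETE request.
kill_job() {
    is_job_id "$2" || { printf 'not a job ID: %s\n' "$2" >&2; return 1; }
    curl -X DELETE "http://$1:9010/job/kill/id/$2"
}
```

For example, kill_job "<HOSTNAME>" job_201205111223_0001.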
-----------------------------------------------------------------------------
Huahin Manager REST queue APIs
Get the list of all queues.
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/queue/list"
Get the statuses of all queues.
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/queue/statuses"
Kill a queue by ID.
<QUEUEID> specifies the queue ID.
~ $ curl -X DELETE "http://<HOSTNAME>:9010/queue/kill/<QUEUEID>"
For example:
~ $ curl -X DELETE "http://<HOSTNAME>:9010/queue/kill/Q_20120608180129594"
-----------------------------------------------------------------------------
Huahin Manager REST Hive APIs
Execute a query. If the query has a return value, it is returned together with the number of the executed query.
ARGUMENTS is a JSON object. <query> specifies the Hive query.
~ $ curl -X POST "http://<HOSTNAME>:9010/hive/execute" -F ARGUMENTS='{"query":"<query>"}'
For example:
~ $ curl -X POST "http://<HOSTNAME>:9010/hive/execute" \
-F ARGUMENTS='{"query":"create table foo(bar string)"}'
~ $ curl -X POST "http://<HOSTNAME>:9010/hive/execute" \
-F ARGUMENTS='{"query":"create table foo(bar string); insert overwrite table foo select * from words limit 100;"}'
** Notice **
This method is deprecated.
Execute a query with a return value.
The return value is returned as a stream.
ARGUMENTS is a JSON object. <query> specifies the Hive query.
~ $ curl -X POST "http://<HOSTNAME>:9010/hive/executeQuery" -F ARGUMENTS='{"query":"<query>"}'
For example:
~ $ curl -X POST "http://<HOSTNAME>:9010/hive/executeQuery" \
-F ARGUMENTS='{"query":"select word, count(word) as cnt from words group by word"}'
-----------------------------------------------------------------------------
Huahin Manager REST Pig APIs
Execute a dump.
ARGUMENTS is a JSON object. <variable> specifies the alias to dump. <query> specifies the Pig Latin script.
~ $ curl -X POST "http://<HOSTNAME>:9010/pig/dump" -F ARGUMENTS='{"dump":"<variable>","query":"<query>"}'
For example:
~ $ curl -X POST "http://<HOSTNAME>:9010/pig/dump" \
-F ARGUMENTS='{"dump":"d","query":"a = load '\''/user/huahin/input'\'' as (text:chararray);b = foreach a generate flatten(TOKENIZE(text)) as word;c = group b by word;d = foreach c generate group as word, COUNT(b) as count;"}'
Execute a store.
The return value is returned as a stream.
ARGUMENTS is a JSON object. <query> specifies the Pig Latin script.
~ $ curl -X POST "http://<HOSTNAME>:9010/pig/store" -F ARGUMENTS='{"query":"<query>"}'
For example:
~ $ curl -X POST "http://<HOSTNAME>:9010/pig/store" \
-F ARGUMENTS='{"query":"a = load '\''/user/huahin/input'\'' as (text:chararray);b = foreach a generate flatten(TOKENIZE(text)) as word;c = group b by word;d = foreach c generate group as word, COUNT(b) as count;store d into '\''/tmp/out'\'';"}'
-----------------------------------------------------------------------------
For 0.2.X
-----------------------------------------------------------------------------
Huahin Manager REST YARN APIs
http://hadoop.apache.org/docs/r2.0.2-alpha/hadoop-yarn/hadoop-yarn-site/WebServicesIntro.html
ResourceManager REST APIs
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/api/rm/ws/v1/cluster/info"
NodeManager REST APIs
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/api/nm/ws/v1/node/info"
MapReduce Application Master REST APIs
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/api/proxy/{appid}/ws/v1/mapreduce/info"
History Server REST APIs
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/api/history/ws/v1/history/info"
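The four examples above follow one pattern: an /api/<component> prefix (plus the application ID for the proxy), followed by the path of the underlying YARN web service. The sketch below builds such URLs. The helper name yarn_api_url is hypothetical, and the port 9010 is taken from the examples in this README.

```shell
# Hypothetical helper: build a Huahin Manager URL for one of the YARN REST APIs above.
# usage: yarn_api_url HOST COMPONENT PATH [APPID]
#   COMPONENT is rm, nm, history or proxy; proxy also needs APPID.
yarn_api_url() {
    case "$2" in
        rm|nm|history)
            prefix="/api/$2" ;;
        proxy)
            [ -n "$4" ] || { printf 'proxy needs an application ID\n' >&2; return 1; }
            prefix="/api/proxy/$4" ;;
        *)
            printf 'unknown component: %s\n' "$2" >&2
            return 1 ;;
    esac
    printf 'http://%s:9010%s%s\n' "$1" "$prefix" "$3"
}
```

For example, curl -X GET "$(yarn_api_url "<HOSTNAME>" rm /ws/v1/cluster/info)".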
-----------------------------------------------------------------------------
Huahin Manager REST Application APIs
Get the list of all applications.
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/application/list"
Get cluster info.
For example:
~ $ curl -X GET "http://<HOSTNAME>:9010/application/cluster"
Kill an application by ID.
<appid> specifies the application ID.
~ $ curl -X DELETE "http://<HOSTNAME>:9010/application/kill/<appid>"
For example:
~ $ curl -X DELETE "http://<HOSTNAME>:9010/application/kill/application_1326232085508_0003"