<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta name="generator" content="HTML Tidy for Linux (vers 25 March 2009), see www.w3.org">
<title></title>
</head>
<body>
<div class="sright">
<div class="sub-content-head">
Maui Scheduler
</div>
<div id="sub-content-rpt" class="sub-content-rpt">
<div class="tab-container docs" id="tab-container">
<div class="topNav">
<div class="docsSearch"></div>
<div class="navIcons topIcons">
<a href="index.html"><img src="home.png" title="Home" alt="Home" border="0"></a> <a href="13.0rmandinterfaces.html"><img src="upArrow.png" title="Up" alt="Up" border="0"></a> <a href="13.1rmoverview.html"><img src="prevArrow.png" title="Previous" alt="Previous" border="0"></a> <a href="13.3rmextensions.html"><img src="nextArrow.png" title="Next" alt="Next" border="0"></a>
</div>
<h1>13.2 Resource Manager Configuration</h1>
<h2>13.2.1 Configurable Resource Manager Attributes</h2>
<p>The scheduler's resource manager interface(s) are defined using the <a href="a.fparameters.html#rmcfg">RMCFG</a> parameter. This parameter allows specification of key aspects of the interface as shown in the table below.</p>
<table border="2">
<tr>
<td><b>Attribute</b></td>
<td><b>Format</b></td>
<td><b>Default</b></td>
<td><b>Description</b></td>
<td><b>Example</b></td>
</tr>
<tr>
<td><b>ASYNCJOBSTART</b></td>
<td>&lt;BOOLEAN&gt;</td>
<td><b>FALSE</b></td>
<td>Enables Maui to start jobs asynchronously. <b>NOTE</b>: Only enabled for PBS.</td>
<td><tt>RMCFG[pbs] ASYNCJOBSTART=TRUE<br>
<br></tt></td>
</tr>
<tr>
<td><b>AUTHTYPE</b></td>
<td>one of <b>CHECKSUM</b>, <b>PKI</b>, or <b>SECUREPORT</b></td>
<td><b>CHECKSUM</b></td>
<td>specifies the security protocol to be used in scheduler-resource manager communication. <b>NOTE</b>: Only valid with WIKI based interfaces.</td>
<td>
<tt>RMCFG[base] AUTHTYPE=CHECKSUM</tt>
<p>(The scheduler will require a secret key based checksum associated with each resource manager message)</p>
</td>
</tr>
<tr>
<td><b>CONFIGFILE</b></td>
<td>&lt;STRING&gt;</td>
<td><b>N/A</b></td>
<td>specifies the resource manager specific configuration file which must be used to enable correct API communication. <b>NOTE</b>: Only valid with LL based interfaces.</td>
<td>
<tt>RMCFG[base] TYPE=LL CONFIGFILE=/home/loadl/loadl_config</tt>
<p>(The scheduler will utilize the specified file when establishing the resource manager/scheduler interface connection)</p>
</td>
</tr>
<tr>
<td><b>EPORT</b></td>
<td>&lt;INTEGER&gt;</td>
<td><b>N/A</b></td>
<td>specifies the event port to use to receive resource manager based scheduling events.</td>
<td>
<tt>RMCFG[base] EPORT=15017</tt>
<p>(The scheduler will look for scheduling events from the resource manager host at port 15017)</p>
</td>
</tr>
<tr>
<td><b>MINETIME</b></td>
<td>&lt;INTEGER&gt;</td>
<td><b>1</b></td>
<td>specifies the minimum time in seconds between processing subsequent scheduling events.</td>
<td>
<tt>RMCFG[base] MINETIME=5</tt>
<p>(The scheduler will batch-process scheduling events which occur less than 5 seconds apart.)</p>
</td>
</tr>
<tr>
<td><a name="nmport" id="nmport"></a> <b>NMPORT</b></td>
<td>&lt;INTEGER&gt;</td>
<td>(any valid port number)</td>
<td>specifies a non-default RM node manager through which extended node attribute information may be obtained</td>
<td>
<tt>RMCFG[base] NMPORT=13001</tt>
<p>(The scheduler will contact the node manager located on each compute node at port 13001)</p>
</td>
</tr>
<tr>
<td><a name="port" id="port"></a><b>PORT</b></td>
<td>&lt;INTEGER&gt;</td>
<td>0</td>
<td>specifies the port on which the scheduler should contact the associated resource manager. The value '0' specifies that the resource manager default port should be used.</td>
<td><tt>RMCFG[base] TYPE=PBS HOST=cws PORT=20001</tt><br>
(The scheduler will attempt to contact the PBS server daemon on host cws, port 20001)</td>
</tr>
<tr>
<td><a name="server" id="server"></a><b>SERVER</b></td>
<td>&lt;URL&gt;</td>
<td>N/A</td>
<td>specifies the resource management service to use. If not specified, the scheduler will locate the resource manager via built-in defaults or, if available, with an information service. <b>NOTE</b>: only available in Maui 3.2.7 and higher.</td>
<td>
<tt>RMCFG[base] server=ll://supercluster.org:9705</tt><br>
<p>(The scheduler will attempt to utilize the Loadleveler scheduling API at the specified location.)</p>
</td>
</tr>
<tr>
<td><b>SUBMITCMD</b></td>
<td>&lt;STRING&gt;</td>
<td><b>N/A</b></td>
<td>specifies the full pathname to the resource manager job submission client.</td>
<td>
<tt>RMCFG[base] SUBMITCMD=/usr/local/bin/qsub</tt>
<p>(The scheduler will use the specified submit command when launching jobs.)</p>
</td>
</tr>
<tr>
<td><a name="timeout" id="timeout"></a><b>TIMEOUT</b></td>
<td>&lt;INTEGER&gt;</td>
<td>15</td>
<td>time (in seconds) the scheduler will wait for a response from the resource manager.</td>
<td>
<tt>RMCFG[base] TIMEOUT=30</tt>
<p>(The scheduler will wait 30 seconds to receive a response from the resource manager before timing out and giving up. The scheduler will try again on the next iteration.)</p>
</td>
</tr>
<tr>
<td><a name="type" id="type"></a><b>TYPE</b></td>
<td>&lt;RMTYPE&gt;[:&lt;RMSUBTYPE&gt;] where &lt;RMTYPE&gt; is one of the following: <b>LL</b>, <b>LSF</b>, <b>PBS</b>, <b>RMS</b>, <b>SGE</b>, <b>SSS</b>, or <b>WIKI</b>, and the optional &lt;RMSUBTYPE&gt; value, if given, is <b>RMS</b></td>
<td>PBS</td>
<td>specifies the type of resource manager to be contacted by the scheduler. <b>NOTE</b>: for <b>TYPE</b> <b>WIKI</b>, <b>AUTHTYPE</b> must be set to <b>CHECKSUM</b>. The &lt;RMSUBTYPE&gt; option is currently only used to support Compaq's RMS resource manager in conjunction with PBS. In this case, the value <tt>PBS:RMS</tt> should be specified. <b>NOTE</b>: deprecated in Maui 3.2.7 and higher; use the <b>SERVER</b> attribute instead.</td>
<td>
<tt>RMCFG[clusterA] TYPE=PBS HOST=clusterA PORT=15003</tt><br>
<tt>RMCFG[clusterB] TYPE=PBS HOST=clusterB PORT=15005</tt><br>
<p>(The scheduler will interface to two different PBS resource managers, one located on server clusterA at port 15003 and one located on server clusterB at port 15005)</p>
</td>
</tr>
</table>
<h2>13.2.2 Resource Manager Configuration Details</h2>
<p>As with all scheduler parameters, <b>RMCFG</b> follows the syntax described within the <a href="3.4configure.html">Parameters Overview</a>.</p>
<h3>Resource Manager Types</h3>
<p>The parameter <b>RMCFG</b> allows the scheduler to interface to multiple types of resource managers using the <b>TYPE</b> or <b>SERVER</b> attributes. By specifying these attributes, the scheduler can support any of the resource managers listed below. To further assist in configuration, <i>Integration Guides</i> are provided for <a href="pbsintegration.html">PBS</a>, <a href="sgeintegration.html">SGE</a>, and <a href="llintegration.html">Loadleveler</a> systems.</p>
<table border="2">
<tr>
<td><b>TYPE</b></td>
<td><b>Resource Managers</b></td>
<td><b>Details</b></td>
</tr>
<tr>
<td>LL</td>
<td>Loadleveler version 2.x and 3.x</td>
<td>N/A</td>
</tr>
<tr>
<td>LSF</td>
<td>Platform's Load Sharing Facility, version 5.1 and higher</td>
<td>N/A</td>
</tr>
<tr>
<td>PBS</td>
<td>OpenPBS (all versions), TORQUE (all versions), PBSPro (all versions)</td>
<td>N/A</td>
</tr>
<tr>
<td>RMS</td>
<td>RMS (for Quadrics based systems)</td>
<td>for development use only - not production quality</td>
</tr>
<tr>
<td>SGE</td>
<td>Sun Grid Engine version 5.3 and higher</td>
<td>N/A</td>
</tr>
<tr>
<td>SSS</td>
<td>Scalable Systems Software Project version 0.5 and 2.0 and higher</td>
<td>N/A</td>
</tr>
<tr>
<td>WIKI</td>
<td><a href="wiki">Wiki</a> interface specification version 1.0 and higher</td>
<td>used for LRM, YRM, ClubMASK, BProc, and others</td>
</tr>
</table>
<h3>Resource Manager Name</h3>
<p>Maui can support more than one resource manager simultaneously. Consequently, the <b>RMCFG</b> parameter takes an index value, e.g., <tt>RMCFG[clusterA] TYPE=PBS</tt>. This index value essentially <i>names</i> the resource manager (as done by the deprecated parameter <b>RMNAME</b>). The resource manager name is used by the scheduler in diagnostic displays, logging, and in reporting resource consumption to the allocation manager. For most environments, the selection of the resource manager <i>name</i> can be arbitrary.</p>
<h3>Resource Manager Location</h3>
<p>The <b>HOST</b>, <b>PORT</b>, and <b>SERVER</b> attributes can be used to specify how the resource manager should be contacted. For many resource managers (e.g., OpenPBS, PBSPro, Loadleveler, SGE, and LSF), the interface correctly establishes contact using default values. These attributes need only be specified for resource managers such as the WIKI interface (which do not include defaults) or for resource managers which can be configured to run at non-standard locations (such as PBS). In all other cases, the resource manager is automatically located.</p>
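<p>As a minimal sketch of the cases described above, the following <tt>maui.cfg</tt> lines show a PBS server left at its default location, a PBS server running at a non-standard location, and a WIKI interface, which must be located explicitly. The resource manager names, host names, and port numbers are hypothetical placeholders, not recommended values, and the three lines are alternatives for illustration rather than one combined configuration.</p>
<pre>
# PBS at its default location: HOST and PORT may be omitted
RMCFG[base]    TYPE=PBS

# PBS configured to run at a non-standard location (hypothetical host and port)
RMCFG[altpbs]  TYPE=PBS HOST=pbsserver PORT=20001

# WIKI interface: no defaults, so the location is given explicitly;
# AUTHTYPE must be CHECKSUM for TYPE WIKI (hypothetical host and port)
RMCFG[wikirm]  TYPE=WIKI HOST=wikihost PORT=20002 AUTHTYPE=CHECKSUM
</pre>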
<h3>Other Attributes</h3>
<p>The maximum amount of time Maui will wait on a resource manager call can be controlled by the <b><a href="a.fparameters.html#rmcfg">TIMEOUT</a></b> attribute, which defaults to 15 seconds. Only rarely will this attribute need to be changed. The <b>AUTHTYPE</b> attribute allows specification of how security over the scheduler/resource manager interface is to be handled. Currently, only the WIKI interface is affected by this attribute.</p>
<p>Another <b>RMCFG</b> attribute is <b><a href="a.fparameters.html#rmcfg">CONFIGFILE</a></b>, which specifies the location of the resource manager's primary config file. It is used when the scheduler requires detailed resource manager information that is not available via the scheduling interface. It is currently only used with the Loadleveler interface and needs to be specified only when using Maui grid-scheduling capabilities.</p>
<p>Finally, the <a href="a.fparameters.html#rmcfg">NMPORT</a> attribute allows specification of the resource manager's node manager port and is only required when this port has been set to a non-default value. It is currently only used within PBS to allow MOM specific information to be gathered and utilized by Maui.</p>
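<p>The attributes in this section can be combined on a single <b>RMCFG</b> line. The sketch below reuses the example values from the attribute table above; the resource manager names are hypothetical, and the port and path should be replaced with site-specific values.</p>
<pre>
# PBS: wait up to 30 seconds per resource manager call and
# contact each node's MOM on non-default port 13001
RMCFG[base]  TYPE=PBS TIMEOUT=30 NMPORT=13001

# Loadleveler: point the scheduler at the resource manager's config file
RMCFG[ll]    TYPE=LL CONFIGFILE=/home/loadl/loadl_config
</pre>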
<h2>13.2.3 Scheduler/Resource Manager Interactions</h2>
<p>In the simplest configuration, Maui interacts with the resource manager using the four primary functions listed below:</p>
<p><b>GETJOBINFO</b></p>
<p>Collect detailed state and requirement information about idle, running, and recently completed jobs.</p>
<p><b>GETNODEINFO</b></p>
<p>Collect detailed state information about idle, busy, and defined nodes.</p>
<p><b>STARTJOB</b></p>
<p>Immediately start a specific job on a particular set of nodes.</p>
<p><b>CANCELJOB</b></p>
<p>Immediately cancel a specific job regardless of job state.</p>
<p>Using these four simple commands, Maui enables nearly its entire suite of scheduling functions. More detailed information about resource manager specific requirements and semantics for each of these commands can be found in the specific resource manager overviews (LL, PBS, or <a href="wiki">WIKI</a>).</p>
<p>In addition to these base commands, other commands are required to support advanced features such as dynamic job support, suspend/resume, gang scheduling, and scheduler-initiated checkpoint restart. More information about these commands will be forthcoming.</p>
<p>Information on creating a new scheduler resource manager interface can be found in the <a href="13.4addingrminterfaces.html">Adding New Resource Manager Interfaces</a> section.</p>
<div class="navIcons bottomIcons">
<a href="index.html"><img src="home.png" title="Home" alt="Home" border="0"></a> <a href="13.0rmandinterfaces.html"><img src="upArrow.png" title="Up" alt="Up" border="0"></a> <a href="13.1rmoverview.html"><img src="prevArrow.png" title="Previous" alt="Previous" border="0"></a> <a href="13.3rmextensions.html"><img src="nextArrow.png" title="Next" alt="Next" border="0"></a>
</div>
</div>
</div>
</div>
<div class="sub-content-btm"></div>
</div>
</body>
</html>