forked from rhc54/pmix-standard
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Chap_Terms.tex
160 lines (111 loc) · 14.3 KB
/
Chap_Terms.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Chapter: Terms and Conventions
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\chapter{PMIx Terms and Conventions}
\label{chap:terms}
In this chapter we describe some common terms and conventions used throughout
this document. The \ac{PMIx} Standard has adopted the widespread use of
key-value \textit{attributes} to add flexibility to the functionality expressed
in the existing \acp{API}. Accordingly, the \ac{ASC} has chosen to require that
the definition of each standard \ac{API} include the passing of an array of
attributes. These provide a means of customizing the behavior of the \ac{API}
as future needs emerge without having to alter or create new variants of it. In
addition, attributes provide a mechanism by which researchers can easily
explore new approaches to a given operation without having to modify the
\ac{API} itself.
In an effort to maintain long-term backward compatibility, \ac{PMIx} does not include large numbers of \acp{API} that each focus on a narrow scope of functionality, but instead relies on the definition of fewer generic \acp{API} that include arrays of key-value attributes for ``tuning'' the function's behavior. Thus, modifications to the \ac{PMIx} standard primarily consist of the definition of new attributes along with a description of the \acp{API} to which they relate and the expected behavior when used with those \acp{API}.
The following terminology is used throughout this document:
\begin{itemize}
\item \declareterm{session} refers to a pool of resources with a unique identifier (a.k.a., the \emph{session ID}) assigned by the \ac{WLM} that has been reserved for one or more users. Historically, \ac{HPC} sessions have consisted of a static allocation of resources - e.g., a block of nodes assigned to a user in response to a specific request and managed as a unified collection. However, this is changing in response to the growing use of dynamic programming models that require on-the-fly allocation and release of system resources. Accordingly, the term \emph{session} in this document refers to a potentially dynamic entity, perhaps comprised of resources accumulated as a result of multiple allocation requests that are managed as a single unit by the \ac{WLM}.
\item \declareterm{job} refers to a set of one or more \emph{applications} executed as a single invocation by the user within a session with a unique identifier (a.k.a, the \emph{job ID}) assigned by the \ac{RM} or launcher. For example, the command line ``\textit{mpiexec -n 1 app1 : -n 2 app2}'' generates a single \ac{MPMD} job containing two applications. A user may execute multiple \emph{jobs} within a given session, either sequentially or in parallel.
\item \declareterm{namespace} refers to a character string value assigned by the \ac{RM} or launcher (e.g., \code{mpiexec}) to a \textit{job}. All \textit{applications} executed as part of that \textit{job} share the same \emph{namespace}. The \emph{namespace} assigned to each \emph{job} must be unique within the scope of the governing \ac{RM} and often is implemented as a string representation of a numerical job ID. The \emph{namespace} and \emph{job} terms will be used interchangeably throughout the document.
\item \declareterm{application} refers to a single executable (binary, script, etc.) member of a \emph{job}.
\item \declareterm{process} refers to an operating system process, also commonly referred to as a \emph{heavyweight} process. A process is often comprised of multiple \emph{lightweight threads}, commonly known as simply \declaretermAlt{threads}{thread}.
\item \declaretermAlt{client}{clients} refers to a process that was registered with the \ac{PMIx} server prior to being started, and connects to that \ac{PMIx} server via \refapi{PMIx_Init} using its assigned namespace and rank with the information required to connect to that server being provided to the process at time of start of execution.
\item \declaretermAlt{clone}{clones} refers to a process that was directly started by a \ac{PMIx} client (e.g., using \emph{fork/exec}) and calls \refapi{PMIx_Init}, thus connecting to its local \ac{PMIx} server using the same namespace and rank as its parent process.
\item \declaretermAlt{slot}{slots} refers to an allocated entry for a process. \acp{WLM} frequently allocate entire nodes to a \emph{session}, but can also be configured to define the maximum number of processes that can simultaneously be executed on each node. This often corresponds to the number of hardware \acp{PU} (typically cores, but can also be defined as hardware threads) on the node. However, the correlation between hardware \acp{PU} and slot allocations strictly depends upon system configuration.
\item \declareterm{rank} refers to the numerical location (starting from zero) of a process within the defined scope. Thus, \emph{job rank} is the rank of a process within its \emph{job} and is synonymous with its unqualified \emph{rank}, while \emph{application rank} is the rank of that process within its \emph{application}.
\item \declaretermAlt{peer}{peers} refers to another process within the same \refterm{job}.
\item \declaretermAlt{workflow}{workflows} refers to an orchestrated execution plan frequently involving multiple \emph{jobs} carried out under the control of a \emph{workflow manager} process. An example workflow might first execute a computational job to generate the flow of liquid through a complex cavity, followed by a visualization job that takes the output of the first job as its input to produce an image output.
\item \declareterm{scheduler} refers to the component of the \ac{SMS} responsible for scheduling of resource allocations. This is also generally referred to as the \emph{system workflow manager} - for the purposes of this document, the \emph{WLM} acronym will be used interchangeably to refer to the scheduler.
\item \declaretermAlt{resource manager}{RM} is used in a generic sense to represent the subsystem that will host the \ac{PMIx} server library. This could be a vendor-supplied resource manager or a third-party agent such as a programming model's runtime library.
\item \declareterm{host environment} is used interchangeably with \emph{resource manager} to refer to the process hosting the \ac{PMIx} server library.
\item \declareterm{node} refers to a single operating system instance. Note that this may encompass one or more physical objects.
\item \declareterm{package} refers to a single object that is either soldered or connected to a printed circuit board via a mechanical socket. Packages may contain multiple chips that include (but are not limited to) processing units, memory, and peripheral interfaces.
\item \declareterm{processing unit}, or \emph{PU}, is the electronic circuitry within a computer that executes instructions. Depending upon architecture and configuration settings, it may consist of a single hardware thread or multiple hardware threads collectively organized as a \emph{core}.
\item \declaretermAlt{fabric}{fabrics} is used in a generic sense to refer to the networks within the system regardless of speed or protocol. Any use of the term \emph{network} in the document should be considered interchangeable with \emph{fabric}.
\item \declaretermAlt{fabric plane}{fabric planes} refers to a collection of devices (\acp{NIC}) and switches in a common logical or physical configuration. Fabric planes are often implemented in \ac{HPC} clusters as separate overlay or physical networks controlled by a dedicated fabric manager.
\item \declareterm{attribute} refers to a key-value pair comprised of a string key (represented by a \refstruct{pmix_key_t} structure) and an associated value containing a \ac{PMIx} data type (e.g., boolean, integer, or a more complex \ac{PMIx} structure). Attributes are used both as directives when passed as qualifiers to \acp{API} (e.g., in a \refstruct{pmix_info_t} array), and to identify the contents of information (e.g., to specify that the contents of the corresponding \refstruct{pmix_value_t} in a \refstruct{pmix_info_t} represent the \refattr{PMIX_UNIV_SIZE}).
\item \declareterm{key} refers to the string component of a defined \emph{attribute}. The \ac{PMIx} Standard will often refer to passing of a \emph{key} to an \ac{API} (e.g., to the \refapi{PMIx_Query_info} or \refapi{PMIx_Get} \acp{API}) as a means of identifying requested information. In this context, the \emph{data type} specified in the \emph{attribute's} definition indicates the data type the caller should expect to receive in return. Note that not all \emph{attributes} can be used as \emph{keys} as some have specific uses solely as \ac{API} qualifiers.
\item \declareterm{instant on} refers to a \ac{PMIx} concept defined as: "All information required for setup and communication (including the address vector of endpoints for every process) is available to each process at start of execution"
\end{itemize}
The following sections provide an overview of the conventions used throughout the \ac{PMIx} Standard document.
%%%%%%%%%%%
\section{Notational Conventions}
Some sections of this document describe programming language specific examples or \acp{API}.
Text that applies only to programs for which the base language is C is shown as follows:
\cspecificstart
C specific text...
\begin{codepar}
int foo = 42;
\end{codepar}
\cspecificend
Some text is for information only, and is not part of the normative specification.
These take several forms, described in their examples below:
\notestart
\noteheader
General text...
\noteend
\rationalestart
Throughout this document, the rationale for the design choices made in the interface specification is set off in this section.
Some readers may wish to skip these sections, while readers interested in interface design may want to read them carefully.
\rationaleend
\adviceuserstart
Throughout this document, material aimed at users and that illustrates usage is set off in this section.
Some readers may wish to skip these sections, while readers interested in programming with the \ac{PMIx} \ac{API} may want to read them carefully.
\adviceuserend
\adviceimplstart
Throughout this document, material that is primarily commentary to \ac{PMIx} library implementers is set off in this section.
Some readers may wish to skip these sections, while readers interested in \ac{PMIx} implementations may want to read them carefully.
\adviceimplend
\advicermstart
Throughout this document, material that is primarily commentary aimed at host environments (e.g., \acp{RM} and \acp{RTE}) providing support for the \ac{PMIx} server library is set off in this section.
Some readers may wish to skip these sections, while readers interested in integrating \ac{PMIx} servers into their environment may want to read them carefully.
\advicermend
Attributes added in this version of the standard are shown in \textit{\textbf{\color{magenta}magenta}} to distinguish them from those defined in prior versions, which are shown in \textit{\textbf{black}}. Deprecated attributes are shown in \textit{\textbf{\color{green!80!black}green}} and may be removed in a future version of the standard.
%%%%%%%%%%%
\section{Semantics}
The following terms will be taken to mean:
\begin{itemize}
\item \emph{shall}, \emph{must} and \emph{will} indicate that the specified behavior is \emph{required} of all conforming implementations
\item \emph{should} and \emph{may} indicate behaviors that a complete implementation would include, but are not required of all conforming implementations
\end{itemize}
%%%%%%%%%%%
\section{Naming Conventions}
The \ac{PMIx} standard has adopted the following conventions:
\begin{itemize}
\item \ac{PMIx} constants and attributes are prefixed with \textbf{\code{PMIX}}.
\item Structures and type definitions are prefixed with \code{pmix}.
\item Underscores are used to separate words in a function or variable name.
\item Lowercase letters are used in \ac{PMIx} client \acp{API} except for the \ac{PMIx} prefix (noted below) and the first letter of the word following it.
For example, \refapi{PMIx_Get_version}.
\item \ac{PMIx} server and tool \acp{API} are all lower case letters following the prefix - e.g., \refapi{PMIx_server_register_nspace}.
\item The \code{PMIx_} prefix is used to denote functions.
\item The \code{pmix_} prefix is used to denote function pointer and type definitions.
\end{itemize}
Users should not use the \textbf{\code{"PMIX"}}, \textbf{\code{"PMIx"}}, or \textbf{\code{"pmix"}} prefixes in their applications or libraries so as to avoid symbol conflicts with current and later versions of the \ac{PMIx} Standard.
%%%%%%%%%%%
\section{Procedure Conventions}
While the current \acp{API} are based on the C programming language, it is not the intent of the \ac{PMIx} Standard to preclude the use of other languages.
Accordingly, the procedure specifications in the \ac{PMIx} Standard are written in a language-independent syntax with the arguments marked as IN, OUT, or INOUT.
The meanings of these are:
\begin{itemize}
\item IN:
The call may use the input value but does not update the argument from the perspective of the caller at any time during the calls execution,
\item OUT:
The call may update the argument but does not use its input value
\item INOUT:
The call may both use and update the argument.
\end{itemize}
Many \ac{PMIx} interfaces, particularly nonblocking interfaces, use a \code{(void*)} callback data object passed to the function that is then passed to the associated callback. On the client side, the callback data object is an opaque, client-provided context that the client can pass to a non-blocking call. When the nonblocking call completes, the callback data object is passed back to the client without modification by the \ac{PMIx} library, thus allowing the client to associate a context with that callback. This is useful if there are many outstanding nonblocking calls.
A similar model is used for the server module functions (see \ref{server:module_fns}). In this case, the \ac{PMIx} library is making an upcall into its host via the \ac{PMIx} server module callback function and passing a specific callback function pointer and callback data object. The \ac{PMIx} library expects the host to call the cbfunc with the necessary arguments and pass back the original callback data obect upon completing the operation. This gives the server-side \ac{PMIx} library the ability to associate a context with the call back (since multiple operations may be outstanding). The host has no visibility into the contents of the callback data object object, nor is permitted to alter it in any way.