Compute cross-node load and utilization #24
An interesting question here (cf. #6 (comment)) is what "utilization" means in a multi-node system. Say the program requests 4 GPUs, runs flat-out on one of them, and ignores the other three. Utilization seen as a whole is 25%. But this is not the only view. If we take that number at face value we may start to look for, e.g., communication bottlenecks or stalls for I/O, while the real problem is that there is zero scalability. The user may believe that doubling the number of cores would speed the program up somewhat, but this is not true: the program will never run faster, except on a faster CPU. In other words, analyzing utilization in a multi-node system may be more complicated than aggregating data across the nodes.
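To make the arithmetic concrete, here is a tiny sketch (made-up numbers, not sonalyze code) of how the whole-job view hides the per-node picture:

```rust
fn main() {
    // One of four GPUs runs flat-out; the others are idle.
    let per_node = [100.0_f64, 0.0, 0.0, 0.0];

    // The aggregated view reports 25% utilization...
    let whole_job: f64 = per_node.iter().sum::<f64>() / per_node.len() as f64;
    println!("whole-job: {whole_job}%");

    // ...while the per-node view shows the scalability problem directly.
    for (i, u) in per_node.iter().enumerate() {
        println!("node {i}: {u}%");
    }
}
```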
There are probably several overlapping use cases here. For the aggregated use cases we want utilization and load computed across the job as a whole, merging the data from all of its nodes.

For other use cases, such as looking at the load on each individual node, the data should remain broken down by node.
The key to computing cross-node values for the aggregated use case in the previous comment is to synthesize an event stream that takes the activity on all nodes into account, and then process that stream in the normal manner. See also #43.
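One plausible shape for that synthesis, sketched with made-up record and field names (sonalyze's real types and log format differ): merge the per-node sample streams keyed by timestamp, sum the activity within each bucket, and hand the result to the existing single-stream pipeline.

```rust
use std::collections::BTreeMap;

// Hypothetical sample record; not sonalyze's actual representation.
struct Sample {
    timestamp: u64, // assumed aligned across nodes for simplicity
    cpu_pct: f64,
}

// Fold per-node streams into one synthetic cross-node stream that the
// normal single-node processing can consume unchanged.
fn synthesize(streams: &[Vec<Sample>]) -> Vec<Sample> {
    let mut by_time: BTreeMap<u64, f64> = BTreeMap::new();
    for stream in streams {
        for s in stream {
            *by_time.entry(s.timestamp).or_insert(0.0) += s.cpu_pct;
        }
    }
    by_time
        .into_iter()
        .map(|(timestamp, cpu_pct)| Sample { timestamp, cpu_pct })
        .collect()
}

fn main() {
    let merged = synthesize(&[
        vec![Sample { timestamp: 0, cpu_pct: 100.0 }],
        vec![Sample { timestamp: 0, cpu_pct: 0.0 }],
    ]);
    println!("t=0: {}% summed across nodes", merged[0].cpu_pct);
}
```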
Cross-node utilization (first paragraph in #24 (comment)) was implemented by fa366cc, which adds a switch --batch that merges job numbers across nodes and computes appropriate aggregates.
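In effect, the sketch below is what --batch changes, expressed with hypothetical types (the real sonalyze code differs): the aggregation key drops the host, so records for the same job number on different nodes fall into one group.

```rust
use std::collections::HashMap;

// Hypothetical record; sonalyze's actual fields differ.
struct Record { job: u32, host: &'static str, cpu_pct: f64 }

// Without --batch, aggregate per (job, host); with it, per job only.
fn aggregate(records: &[Record], batch: bool) -> HashMap<(u32, Option<&'static str>), f64> {
    let mut acc = HashMap::new();
    for r in records {
        let key = (r.job, if batch { None } else { Some(r.host) });
        *acc.entry(key).or_insert(0.0) += r.cpu_pct;
    }
    acc
}

fn main() {
    let recs = [
        Record { job: 42, host: "c1-5", cpu_pct: 90.0 },
        Record { job: 42, host: "c1-6", cpu_pct: 10.0 },
    ];
    println!("groups without --batch: {}", aggregate(&recs, false).len()); // 2
    println!("groups with --batch:    {}", aggregate(&recs, true).len());  // 1
}
```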
It's not clear to me that the by-node load data (second paragraph in #24 (comment)) are not already adequately presented by the --job= switch. Consider the data from a Fox job retrieved that way (command line abbreviated for clarity), combined with summary data from the same command line along with -b: together they give a pretty good idea of what's going on. As we add monitoring capabilities for, e.g., communication, we can see system utilization and perform a simple scalability analysis pretty broadly. For example, there is an interesting question about this job, which takes about 3 minutes: it should be 100% CPU bound with a quick communication phase at the beginning and end, so is the 85% average CPU utilization due to system effects or to something else? Let's explore the per-node data (output is abbreviated, but all nodes look the same).
That exploration yields a clue: things are slow during startup, and the job takes a minute to get going. More investigation is needed, but it's not clear that sonalyze needs much more specialized functionality. (Arguably the first "0" record is really a sentinel and pollutes the averages; I'll file a bug.)
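If that leading "0" record really is a sentinel, the fix is simple in principle; a hedged sketch (hypothetical sample representation, not sonalyze's actual record type):

```rust
// Hypothetical samples: a leading zero sentinel followed by real readings.
fn main() {
    let samples = [0.0_f64, 80.0, 90.0, 85.0];

    // Averaging over everything lets the sentinel drag the number down...
    let naive: f64 = samples.iter().sum::<f64>() / samples.len() as f64;

    // ...so skip a leading zero record before averaging.
    let real = if samples.first() == Some(&0.0) { &samples[1..] } else { &samples[..] };
    let fixed: f64 = real.iter().sum::<f64>() / real.len() as f64;

    println!("naive {naive:.1}%, without sentinel {fixed:.1}%"); // 63.8% vs 85.0%
}
```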
Currently the load and utilization computations relate only to a single host, but for multi-node jobs we must take into account the capabilities of all the nodes involved.
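A minimal sketch of what that might look like, with hypothetical types (the real node capability data would come from sonar's records): utilization of a multi-node job is the work done divided by the summed capacity of all participating nodes, which matters when the nodes are heterogeneous.

```rust
// Hypothetical node description; sonalyze's real configuration data differs.
struct Node { cores: u32, used_core_seconds: f64 }

// Cross-node CPU utilization over an interval of `secs` seconds:
// total core-seconds used divided by total core-seconds available
// across every node in the job, not just one host.
fn job_utilization(nodes: &[Node], secs: f64) -> f64 {
    let used: f64 = nodes.iter().map(|n| n.used_core_seconds).sum();
    let avail: f64 = nodes.iter().map(|n| n.cores as f64 * secs).sum();
    100.0 * used / avail
}

fn main() {
    // Two 64-core nodes for 60s; one fully busy, one idle => 50%.
    let nodes = [
        Node { cores: 64, used_core_seconds: 64.0 * 60.0 },
        Node { cores: 64, used_core_seconds: 0.0 },
    ];
    println!("{}%", job_utilization(&nodes, 60.0));
}
```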
Depends on: