Figure out how to do multi-node jobs and to compute the load of multi-node systems #6

Closed
6 tasks done
lars-t-hansen opened this issue Jul 28, 2023 · 6 comments

@lars-t-hansen
Collaborator

lars-t-hansen commented Jul 28, 2023

On the ML and light-HPC systems a job runs on at most one node, but on the bigger systems jobs can span multiple nodes. These are SLURM jobs, so the sonar records from every node will carry the same job ID and we'll collect records properly into jobs. What remains is filtering and printing the node names sensibly, and computing and presenting the cross-node load. For system-relative load data we must also (in some way) combine the capabilities of the multiple systems involved to compute proper values; it's not enough to sum things and hope for the best.
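A minimal sketch of what "system-relative" could mean for a multi-node job, assuming we know each node's core count and the job's CPU use per node measured in cores' worth of CPU. The names `NodeCapacity` and `relative_cpu_load` are hypothetical and not part of sonar or sonalyze. The point is that the usage is divided by the combined capacity of the nodes involved rather than averaged per node.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeCapacity:
    """Hypothetical per-node capability record (assumed, not a sonalyze type)."""
    cores: int


def relative_cpu_load(cores_used_by_host, capacities):
    """Return the job's CPU load as a percentage of the combined capacity
    of the nodes it ran on.

    cores_used_by_host: dict mapping host name to cores' worth of CPU in use
                        (e.g. 3.5 means 350% of one core).
    capacities:         dict mapping host name to NodeCapacity.
    """
    total_capacity = sum(capacities[host].cores for host in cores_used_by_host)
    if total_capacity == 0:
        return 0.0
    total_used = sum(cores_used_by_host.values())
    return 100.0 * total_used / total_capacity


# One fully busy 64-core node plus one idle 16-core node.
caps = {"n1": NodeCapacity(64), "n2": NodeCapacity(16)}
load = relative_cpu_load({"n1": 64.0, "n2": 0.0}, caps)
```

Here `load` is 80% of the combined capacity, whereas averaging the two per-node percentages (100% and 0%) would give 50%, which is why the node capabilities have to enter the calculation.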

Evolving task list:

@lars-t-hansen lars-t-hansen self-assigned this Jul 28, 2023
@Sabryr
Contributor

Sabryr commented Jul 30, 2023

  • This is a good observation. For multi-node jobs, one more important thing is whether the allocated nodes were used optimally.

    • E.g. 1: a user submits a job asking for 4 nodes, but the job runs on only one node and the other 3 sit idle throughout.
    • E.g. 2: a user submits a job asking for 4 nodes, but only two nodes were active at any given time.
  • There are two network connections between nodes, one of which is InfiniBand. If there were a lot of data transfers between nodes, InfiniBand should have been used. For cases where only variable state is updated, this is not important.
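The two node-usage examples above can be sketched as a check over per-node samples. This is only an illustration under stated assumptions: the sample tuples `(timestamp, host, cpu_percent)`, the function name `node_usage`, and the 5% activity threshold are all hypothetical, and samples from different hosts are assumed to share timestamps (for instance after bucketing).

```python
from collections import defaultdict


def node_usage(allocated_hosts, samples, active_threshold=5.0):
    """Summarize how many of a job's allocated nodes actually did work.

    allocated_hosts:  hosts the job asked for.
    samples:          iterable of (timestamp, host, cpu_percent) for one job.
    active_threshold: assumed cutoff (in percent) below which a node counts as idle.
    """
    active_by_time = defaultdict(set)
    for timestamp, host, cpu_percent in samples:
        if cpu_percent >= active_threshold:
            active_by_time[timestamp].add(host)

    # Hosts that were active in at least one sample.
    ever_active = set()
    for hosts in active_by_time.values():
        ever_active |= hosts

    return {
        # Allocated nodes that never showed activity (example 1).
        "idle_hosts": sorted(set(allocated_hosts) - ever_active),
        # Largest number of nodes active at the same time (example 2).
        "peak_active": max((len(h) for h in active_by_time.values()), default=0),
    }


allocated = ["n1", "n2", "n3", "n4"]
example1 = node_usage(allocated, [(0, "n1", 90.0), (0, "n2", 0.0),
                                  (1, "n1", 95.0), (1, "n3", 1.0)])
example2 = node_usage(allocated, [(0, "n1", 90.0), (0, "n2", 80.0),
                                  (1, "n3", 85.0), (1, "n4", 70.0)])
```

In the first case three hosts end up in `idle_hosts`; in the second none are idle overall, but `peak_active` is 2, meaning half the allocation was unused at any given time.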

@lars-t-hansen
Collaborator Author

I think we already have what we need to compute cross-node utilization (your first point) but sonar does not currently capture any data about communication (the second point), be it volume or topology. It is a sampling profiler and its only means of sampling is to probe system tables (via /proc and other pseudo filesystems, either directly or via ps; or via programs that have access to other statistics by probing hardware directly, such as nvidia-smi). If those types of sampling interfaces can already deliver information about communication then that's great. Otherwise we'll have to add system components that can collect this information, somehow. I'll file the necessary bugs on sonar to track this.

(I'll add communication volume to the set of use cases.)

@lars-t-hansen
Collaborator Author

lars-t-hansen commented Aug 3, 2023

Technical quirk: with the synthesized job IDs (as on the ML nodes) there's a risk that the same PID is used as the job ID on two different machines in overlapping timeframes, even though these are two different jobs. It's important for sonalyze not to be confused by this. I think that in the case where we're interacting with a batch queue, there will be a command line argument to sonalyze to identify the system as such, e.g. to point to a data directory. The default, in the absence of such a switch, should be to treat hosts as independent.

In presenting a query that runs against the logs of multiple hosts, the same job ID may thus be shown multiple times in a listing, but it is always relative to the host. The consumer of the data must be aware of this.
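A minimal sketch of the host-relative job identity described above, assuming each record is a dict with `host` and `job_id` fields. The function name `group_into_jobs` and the `batch_system` flag are hypothetical stand-ins for whatever switch sonalyze ends up using.

```python
from collections import defaultdict


def group_into_jobs(records, batch_system=False):
    """Group sample records into jobs.

    records:      iterable of dicts with at least 'host' and 'job_id' keys
                  (an assumed record shape, not the actual sonar log format).
    batch_system: if True, job IDs come from the batch queue and are shared
                  across hosts; if False (the default), hosts are treated as
                  independent and the job key is (host, job_id).
    """
    jobs = defaultdict(list)
    for record in records:
        if batch_system:
            key = record["job_id"]
        else:
            key = (record["host"], record["job_id"])
        jobs[key].append(record)
    return dict(jobs)


records = [
    {"host": "ml1", "job_id": 4711},
    {"host": "ml2", "job_id": 4711},
]
independent = group_into_jobs(records)
merged = group_into_jobs(records, batch_system=True)
```

With the default, the two records stay separate as `("ml1", 4711)` and `("ml2", 4711)`; with `batch_system=True` they are collected into a single job 4711 spanning both hosts.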

(Edit: turns out that the current sonarlog code erroneously merges records from jobs on different hosts.)

@lars-t-hansen
Collaborator Author

> I think we already have what we need to compute cross-node utilization (your first point)

This turns out not to be true, due to a sonar bug; see #27 and #28.

@lars-t-hansen
Collaborator Author

This is pretty much done now; I'm just doing some final testing and will then merge. I'll cut NordicHPC/sonar#67 loose: it doesn't need to block this bug and can come later. There are other mop-up issues too, like #54, but again, they're not really blocking us here.

@lars-t-hansen
Collaborator Author

Fixed, for now. We'll file additional things as followup bugs.
