Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GASNet's max threads per process silently limiting Chapel's maxTaskPar #26379

Open
bradcray opened this issue Dec 9, 2024 · 3 comments
Open

Comments

@bradcray
Copy link
Member

bradcray commented Dec 9, 2024

A user running on a high core-count node was surprised to find that they only got a value of here.maxTaskPar of 255 despite having more cores than that. Digging into the source, they found that it was due to the runtime deferring to the comm layer (in this case, GASNet) in terms of determining how many threads to use. After rebuilding with CHPL_GASNET_CFG_OPTIONS="--with-max-pthreads-per-node=2000" (a generous upper-bound), they were able to use all the cores on the node.

This issue is filed to ask what we could do about this.

  • at a minimum, it seems that we should warn when this happens, with notes or pointers to documentation about how to increase the limit
  • there was also a question from the user about whether this default should be raised in GASNet (or Chapel's configuration of GASNet) given modern processor design and core count increases
  • maybe there's even something more significant that could be done to reduce the need for a configuration-time limit?
@bonachea
Copy link
Contributor

bonachea commented Dec 9, 2024

Hi @bradcray - this is a great question!

GASNet effectively defaults to a configure-time hard limit of --with-max-pthreads-per-node=256 (which the Chapel runtime converts into a max 255 tasks). This is based mostly on historically lower core-per-node architectures and multi-PPN clients that seldom ran more than 256 communicating OS-level threads per process (Chapel locale). Internally GASNet special-cases smaller values to statically allocate a thread data table (removing one level of indirection) and reduce the storage of internal thread identifiers. However my guess is these tweaks probably have a vanishingly small impact on actual runtime performance of most programs.

As such it probably makes the most sense for Chapel to simply default to configuring GASNet using --with-max-pthreads-per-node=65535. With that setting for the hard limit, GASNet will use 16-bit thread identifiers and a runtime default thread soft limit of 1024 (used to size GASNet's internal thread table allocated at startup). Then at runtime the user can (for example) export GASNET_MAX_THREADS=2048 to get a higher limit for a particular run.

Should we pursue that approach?

For completeness, users of past releases can get the effect described above by setting export CHPL_GASNET_MORE_CFG_OPTIONS=--with-max-pthreads-per-node=65535 when building Chapel.

CC: @PHHargrove

@bradcray
Copy link
Member Author

bradcray commented Dec 21, 2024

Hi @bonachea — Very belatedly, thanks for the information and ideas here. I definitely like the idea of raising the cap on the number of threads one way or another, and if you don't have any misgivings about the approach you've proposed here (which sounds like it's the case), that sounds like a good way to go. I also like coupling your proposal with a warning to our runtime code if/when a user bumps into the 1024 limit to note it and point to GASNET_MAX_THREADS as a way to exceed it.

[edit: I should add that I don't think this is a massive priority, though it does sound like very low-hanging fruit...]

@bradcray
Copy link
Member Author

though it does sound like very low-hanging fruit...

Low enough that I decided to close out the day by throwing a draft PR together sketching out what I think Dan was proposing above with my warning added as well: #26448

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants