A user running on a high core-count node was surprised to find that they only got a value of `here.maxTaskPar` of 255 despite having more cores than that. Digging into the source, they found that it was due to the runtime deferring to the comm layer (in this case, GASNet) to determine how many threads to use. After rebuilding with `CHPL_GASNET_CFG_OPTIONS="--with-max-pthreads-per-node=2000"` (a generous upper bound), they were able to use all the cores on the node.
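For reference, the workaround described above can be sketched as a shell session (this is a sketch assuming a source build of Chapel with `CHPL_HOME` set; the exact rebuild steps may differ per configuration):

```shell
# Rebuild Chapel's bundled GASNet with a higher pthread cap so that
# here.maxTaskPar can exceed 255 on high core-count nodes.
export CHPL_GASNET_CFG_OPTIONS="--with-max-pthreads-per-node=2000"
cd $CHPL_HOME
make    # rebuilds the runtime and the bundled GASNet with the new limit
```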
This issue is filed to ask what we could do about this.
- at a minimum, it seems that we should warn when this happens, with notes or pointers to documentation about how to increase the limit
- there was also a question from the user about whether this default should be raised in GASNet (or Chapel's configuration of GASNet) given modern processor design and core count increases
- maybe there's even something more significant that could be done to reduce the need for a configuration-time limit?
GASNet effectively defaults to a configure-time hard limit of `--with-max-pthreads-per-node=256` (which the Chapel runtime converts into a maximum of 255 tasks). This is based mostly on historically lower core-per-node architectures and multi-PPN clients that seldom ran more than 256 communicating OS-level threads per process (Chapel locale). Internally, GASNet special-cases smaller values to statically allocate a thread data table (removing one level of indirection) and to reduce the storage of internal thread identifiers. However, my guess is these tweaks probably have a vanishingly small impact on the actual runtime performance of most programs.

As such, it probably makes the most sense for Chapel to simply default to configuring GASNet using `--with-max-pthreads-per-node=65535`. With that setting for the hard limit, GASNet will use 16-bit thread identifiers and a runtime default thread soft limit of 1024 (used to size GASNet's internal thread table allocated at startup). Then at runtime the user can (for example) `export GASNET_MAX_THREADS=2048` to get a higher limit for a particular run.
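Under that scheme, a run needing more than the 1024-thread soft limit might look like the following sketch (the program name and locale count are placeholders, not from this thread):

```shell
# With GASNet configured using --with-max-pthreads-per-node=65535,
# the hard cap is 65535 threads and the runtime soft limit defaults
# to 1024; raise the soft limit for a particular run like so:
export GASNET_MAX_THREADS=2048
./myProgram -nl 4    # hypothetical multilocale Chapel program launch
```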
Should we pursue that approach?
For completeness, users of past releases can get the effect described above by setting `export CHPL_GASNET_MORE_CFG_OPTIONS=--with-max-pthreads-per-node=65535` when building Chapel.
Hi @bonachea — Very belatedly, thanks for the information and ideas here. I definitely like the idea of raising the cap on the number of threads one way or another, and if you don't have any misgivings about the approach you've proposed here (which sounds like it's the case), that sounds like a good way to go. I also like coupling your proposal with a warning in our runtime code if/when a user bumps into the 1024 limit, to note it and point to `GASNET_MAX_THREADS` as a way to exceed it.
[edit: I should add that I don't think this is a massive priority, though it does sound like very low-hanging fruit...]
> though it does sound like very low-hanging fruit...
Low enough that I decided to close out the day by throwing a draft PR together sketching out what I think Dan was proposing above with my warning added as well: #26448