Preliminary performance analysis of the coarray version of ICAR indicates that the algorithm exhibits significant load imbalance. We believe this is because a few physics kernels are much more expensive to evaluate over mountain ranges than over parts of the grid that contain no mountains. (Ethan, can you comment on this to verify that I stated the problem correctly?)
Achieving good code performance and scalability requires that we address this load imbalance. One approach would be to partition the grid asymmetrically, so that the regions requiring expensive kernels are distributed evenly among images rather than concentrated on a few images. Other approaches may also be possible.
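To make the asymmetric-partition idea concrete, here is a minimal sketch in Python (not ICAR's Fortran, and not how ICAR decomposes its domain). It assumes we already have a per-column cost estimate along one grid dimension, for example from profiling, and it places cut points so that each image receives a contiguous block of roughly equal total cost. The function names `balanced_cuts` and `chunk_loads` and the cost numbers are hypothetical; a real decomposition would be two-dimensional and would also have to account for halo exchange.

```python
from bisect import bisect_left
from itertools import accumulate


def balanced_cuts(costs, num_images):
    """Split `costs` into `num_images` contiguous blocks of similar total cost.

    Returns num_images + 1 cut indices; block k covers costs[cuts[k]:cuts[k+1]].
    Hypothetical helper for illustration only.
    """
    if not 1 <= num_images <= len(costs):
        raise ValueError("need between 1 and len(costs) images")
    # prefix[i] is the total cost of columns [0, i)
    prefix = [0] + list(accumulate(costs))
    total = prefix[-1]
    cuts = [0]
    for k in range(1, num_images):
        target = total * k / num_images
        # first cut position whose cumulative cost reaches the target
        i = bisect_left(prefix, target)
        # step back one column if that lands closer to the target
        if i > 0 and target - prefix[i - 1] <= prefix[i] - target:
            i -= 1
        # keep cuts strictly increasing and leave a column for each later image
        i = max(i, cuts[-1] + 1)
        i = min(i, len(costs) - (num_images - k))
        cuts.append(i)
    cuts.append(len(costs))
    return cuts


def chunk_loads(costs, cuts):
    """Total cost assigned to each image for the given cut indices."""
    return [sum(costs[a:b]) for a, b in zip(cuts, cuts[1:])]


# Assumed cost profile: two "mountain" columns are 10x more expensive.
costs = [1, 1, 1, 1, 10, 10, 1, 1]
uniform = [0, 2, 4, 6, 8]          # equal-width blocks
weighted = balanced_cuts(costs, 4)  # cost-weighted blocks

print(chunk_loads(costs, uniform))   # per-image load, equal-width split
print(chunk_loads(costs, weighted))  # per-image load, cost-weighted split
```

In this toy profile the equal-width split puts both expensive columns on one image, while the cost-weighted split gives each expensive column its own image. Whether the same gain appears in practice depends on how stable the cost profile is over a run.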
Yes, though I am not sure how imbalanced the load will be in a real situation (in which cloud processes could be active in almost any grid cell in the domain). I would be interested in implementing an asymmetric and adaptive grid, but this seems like a very large challenge with limited payoff (for real-world simulations).
Perhaps we need to make real-world simulations more of a priority, so that we can test this first?
Yes, load imbalance is a huge problem for structured-grid codes. It's definitely worth the effort to figure out how to adapt the grids to distribute the load; that alone can (and usually does) lead to much larger code speedups than other types of optimization. It may take a lot of work to implement something like that, but I think the payoff would be worth it.