Update Beam and Plasma to Pure SoA #928
Looks awesome, thank you Alex!
Nice, thanks for this PR!! See small comments and questions below. Let's merge tomorrow.
@AlexanderSinn:
For the plasma, all particles are in the same ParticleTile, so we are more limited by the index type (int/unsigned int). Before PureSoA, I already ran a simulation on CPU with almost 2^31 plasma particles in two containers (500 GB total); much more would not work and would also be impractically slow. The plasma id is only used to distinguish between valid and invalid particles, so it could be made simpler. Related: #963

For the beam, however, the id is used for particle tracking diagnostics, and a large number of beam particles is cheap performance-wise. Currently we initialize everything on one rank and in one ParticleTile, so we are still index-limited, and memory-limited when first sorting/reordering per box. This might get fixed, and then we will be limited by id(). We already use 2^30 beam particles sometimes.
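To make the index-type limit above concrete, here is a minimal C++ sketch. It assumes 32-bit `int`/`unsigned int` indices, and `fits_in_index` and `two_pow` are hypothetical helpers written for illustration only; they are not part of AMReX or HiPACE++.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not an AMReX/HiPACE++ function): returns true if a
// particle count is representable in the integer type used for indexing.
template <typename Index>
constexpr bool fits_in_index(std::uint64_t count) {
    return count <= static_cast<std::uint64_t>(std::numeric_limits<Index>::max());
}

// 2^n as a 64-bit value, so the shift cannot overflow for n < 64.
constexpr std::uint64_t two_pow(unsigned n) {
    return std::uint64_t{1} << n;
}

// 2^30 particles (the beam case mentioned above) fit in a signed 32-bit index.
static_assert(fits_in_index<std::int32_t>(two_pow(30)), "2^30 fits in int32");
// "Almost 2^31" still fits, but 2^31 itself does not.
static_assert(fits_in_index<std::int32_t>(two_pow(31) - 1), "2^31 - 1 fits in int32");
static_assert(!fits_in_index<std::int32_t>(two_pow(31)), "2^31 overflows int32");
// An unsigned 32-bit index doubles the range, up to 2^32 - 1.
static_assert(fits_in_index<std::uint32_t>(two_pow(31)), "2^31 fits in uint32");
static_assert(!fits_in_index<std::uint32_t>(two_pow(32)), "2^32 overflows uint32");
```

Under these assumptions, a single tile indexed with `int` tops out just below 2^31 particles, which matches the plasma run described above.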
…com/AlexanderSinn/hipace into Upadate_Beam_and_Plasma_to_Pure_SoA
If this becomes a problem, would it be possible to change the AMReX behavior to use more bits for …

Another question: IIRC there's a safeguard in AMReX that would abort if we exceed the range of possible IDs, right? I think I saw this in …
Great, thanks for this PR!
It finally works now.
Performance on A100:
New:
Old:
Note: The beam uses a permutation array so that’s why it is slower.
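As an illustration of why a permutation array can cost performance, here is a minimal C++ sketch that contrasts direct SoA access with access through an index indirection. `BeamSoA`, `weighted_sum_direct`, and `weighted_sum_permuted` are hypothetical names for this example, not the HiPACE++ implementation, and the extra indirection is only one plausible source of the slowdown.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical structure-of-arrays storage: one contiguous array per component.
struct BeamSoA {
    std::vector<double> x;  // positions
    std::vector<double> w;  // weights
};

// Direct access: particle i is read at memory slot i, so loads are contiguous.
double weighted_sum_direct(const BeamSoA& b) {
    double sum = 0.0;
    for (std::size_t i = 0; i < b.x.size(); ++i) {
        sum += b.w[i] * b.x[i];
    }
    return sum;
}

// Permuted access: the k-th particle in processing order lives at slot perm[k].
// Each iteration needs one extra load and may jump around in memory.
double weighted_sum_permuted(const BeamSoA& b,
                             const std::vector<std::size_t>& perm) {
    double sum = 0.0;
    for (std::size_t k = 0; k < perm.size(); ++k) {
        const std::size_t i = perm[k];
        sum += b.w[i] * b.x[i];
    }
    return sum;
}
```

Both functions compute the same result when `perm` is a permutation of all indices; the permuted version trades contiguous reads for the ability to process particles in a different order without moving the data.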