Update Beam and Plasma to Pure SoA #928

AlexanderSinn · 2023-05-02T16:33:24Z

It finally works now.

Performance on A100:

amr.n_cell = 2047 2047 2000
beam.num_particles = 100000000
elec.ppc = 4 4
ions.ppc = 4 4

New:

TinyProfiler total time across processes [min...avg...max]: 200.5 ... 200.5 ... 200.5

--------------------------------------------------------------------------------------------------
Name                                               NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------------
ExplicitDeposition()                                 4000      58.13      58.13      58.13  28.99% <---
AdvancePlasmaParticles()                             4000      57.11      57.11      57.11  28.48% <---
DepositCurrent_PlasmaParticleContainer()             4000      54.09      54.09      54.09  26.98% <---
hpmg::MultiGrid::solve1()                            2000      19.15      19.15      19.15   9.55%
AnyDST::Execute()                                   12000      5.006      5.006      5.006   2.50%
sortBeamParticlesByBox()                                2      2.179      2.179      2.179   1.09%
FFTPoissonSolverDirichlet::SolvePoissonEquation()    6000     0.8317     0.8317     0.8317   0.41%
Hipace::SolveOneSlice()                              2000     0.6687     0.6687     0.6687   0.33%
Fields::ShiftSlices()                                2000     0.5452     0.5452     0.5452   0.27%
AdvanceBeamParticlesSlice()                          2000     0.5168     0.5168     0.5168   0.26% <---
Hipace::InitializeSxSyWithBeam()                     2000     0.3997     0.3997     0.3997   0.20%
Fields::LinCombination()                             4000     0.3667     0.3667     0.3667   0.18%
PlasmaParticleContainer::InitParticles                  2       0.27       0.27       0.27   0.13%
DepositCurrentSlice_BeamParticleContainer()          4000     0.2205     0.2205     0.2205   0.11% <---
Fields::SolveExmByAndEypBx()                         2000     0.2008     0.2008     0.2008   0.10%
BeamParticleContainer::InitParticles()                  1     0.1952     0.1952     0.1952   0.10%
Fields::Multiply()                                   2000     0.1198     0.1198     0.1198   0.06%

Old:

TinyProfiler total time across processes [min...avg...max]: 202.9 ... 202.9 ... 202.9

--------------------------------------------------------------------------------------------------
Name                                               NCalls  Excl. Min  Excl. Avg  Excl. Max   Max %
--------------------------------------------------------------------------------------------------
AdvancePlasmaParticles()                             4000      59.46      59.46      59.46  29.31% <---
ExplicitDeposition()                                 4000      58.33      58.33      58.33  28.75% <---
DepositCurrent_PlasmaParticleContainer()             4000      54.66      54.66      54.66  26.94% <---
hpmg::MultiGrid::solve1()                            2000      19.19      19.19      19.19   9.46%
AnyDST::Execute()                                   12000      5.032      5.032      5.032   2.48%
sortBeamParticlesByBox()                                2       1.65       1.65       1.65   0.81%
FFTPoissonSolverDirichlet::SolvePoissonEquation()    6000      0.832      0.832      0.832   0.41%
Hipace::SolveOneSlice()                              2000     0.6697     0.6697     0.6697   0.33%
Fields::ShiftSlices()                                2000     0.5455     0.5455     0.5455   0.27%
Hipace::InitializeSxSyWithBeam()                     2000     0.3998     0.3998     0.3998   0.20%
Fields::LinCombination()                             4000     0.3665     0.3665     0.3665   0.18%
AdvanceBeamParticlesSlice()                          2000      0.297      0.297      0.297   0.15% <---
PlasmaParticleContainer::InitParticles                  2     0.2808     0.2808     0.2808   0.14%
Fields::SolveExmByAndEypBx()                         2000      0.203      0.203      0.203   0.10%
BeamParticleContainer::InitParticles()                  1     0.1796     0.1796     0.1796   0.09%
DepositCurrentSlice_BeamParticleContainer()          4000     0.1766     0.1766     0.1766   0.09% <---

Note: The beam uses a permutation array, which is why the beam functions (AdvanceBeamParticlesSlice() and DepositCurrentSlice_BeamParticleContainer()) are slower.
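To illustrate the note above, here is a minimal sketch of how a permutation array adds one level of indirection to a structure-of-arrays (SoA) loop. The `BeamSoA` struct, its members, and both push functions are hypothetical names for illustration only, not the HiPACE++ or AMReX data structures. The performance reading (contiguous loads becoming gathers) is an assumption about why the indirection costs time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical pure-SoA storage: one contiguous array per particle component.
struct BeamSoA {
    std::vector<double> x;          // positions
    std::vector<double> ux;         // momenta
    std::vector<std::size_t> perm;  // perm[i] = storage index of the i-th particle in processing order
};

// Direct loop: iteration i touches element i, so memory access is contiguous.
inline void push_direct(BeamSoA& b, double dt) {
    for (std::size_t i = 0; i < b.x.size(); ++i) {
        b.x[i] += dt * b.ux[i];
    }
}

// Permuted loop: every access goes through perm first. The particle data
// never has to be physically reordered, but each load becomes a gather.
inline void push_permuted(BeamSoA& b, double dt) {
    for (std::size_t i = 0; i < b.perm.size(); ++i) {
        const std::size_t idx = b.perm[i];
        assert(idx < b.x.size());  // the permutation must stay within the arrays
        b.x[idx] += dt * b.ux[idx];
    }
}
```

Both loops update the same particles; only the order and the memory access pattern differ.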

Small enough (< few 100s of lines), otherwise it should probably be split into smaller PRs
Tested (describe the tests in the PR description)
Runs on GPU (basic: the code compiles and runs well with the new module)
Contains an automated test (checksum and/or comparison with theory)
Documented: all elements (classes and their members, functions, namespaces, etc.) are documented
Constified (All that can be const is const)
Code is clean (no unwanted comments)
Style and code conventions at the bottom of https://github.com/Hi-PACE/hipace are respected
Proper label and GitHub project, if applicable

ax3l

Looks awesome, thank you Alex!

src/particles/beam/BeamParticleContainer.H

src/particles/beam/BeamParticleContainerInit.cpp

MaxThevenet

Nice, thanks for this PR!! See small comments and questions below. Let's merge tomorrow.

src/particles/plasma/PlasmaParticleContainer.cpp

src/particles/plasma/PlasmaParticleContainerInit.cpp

src/particles/pusher/GetAndSetPosition.H

ax3l · 2023-05-31T20:44:12Z

@AlexanderSinn :
@atmyers and I were wondering whether the new split particle ID causes issues for you, for example when generating more than 2 billion particles on a single rank?

AlexanderSinn · 2023-05-31T21:34:51Z

For the plasma, we have all the particles in the same ParticleTile, so we are limited more by the index type (int/unsigned int). I already ran a simulation with almost 2^31 plasma particles in two containers (500 GB total) on CPU before PureSoA; many more wouldn’t work, but would also be impractically slow. The plasma id is only used to distinguish between valid and invalid particles, so it could be made simpler. Related: #963

For the beam, however, the id is used for particle tracking diagnostics, and a large number of beam particles is cheap performance-wise. Currently we initialize everything on one rank and in one ParticleTile, so we are still index-limited and memory-limited when first sorting/reordering per box, but this might get fixed, and then we will be limited by id(). We already use 2^30 beam particles sometimes.
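The index-type limit mentioned above can be sketched as a simple range check. `count_fits` and `to_index` are hypothetical helpers, not part of AMReX or HiPACE++, and the numbers in the usage note assume a 32-bit `int`.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Returns true if a particle count n is representable in the index type.
template <typename Index>
constexpr bool count_fits(std::uint64_t n) {
    return n <= static_cast<std::uint64_t>(std::numeric_limits<Index>::max());
}

// Converts a 64-bit value to the index type, asserting that it fits.
template <typename Index>
Index to_index(std::uint64_t i) {
    assert(count_fits<Index>(i));
    return static_cast<Index>(i);
}
```

With a 32-bit `int`, `count_fits<int>` accepts counts up to 2^31 - 1 and rejects 2^31, which matches the "almost 2^31" figure quoted for a single ParticleTile. An `unsigned int` index would extend this to 2^32 - 1.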


MaxThevenet · 2023-06-01T04:58:42Z

If this becomes a problem, would it be possible to change the AMReX behavior to use more bits for id and fewer for cpu, maybe as a compile-time option? For HiPACE++, we don't do much with cpu, and just a few bits would be sufficient. (We don't need it yet; this is just for information.)
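As a rough illustration of the compile-time option suggested above, id and cpu could share one 64-bit word, with the split chosen through template parameters. `PackedIdCpu`, the two aliases, and the bit widths are all hypothetical choices for this sketch. They do not describe the actual AMReX layout or API.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical packing of id and cpu into one 64-bit word.
// The widths are compile-time parameters, so a code that barely uses cpu
// can give most of the bits to id.
template <unsigned IdBits, unsigned CpuBits>
struct PackedIdCpu {
    static_assert(IdBits > 0 && CpuBits > 0 && IdBits + CpuBits <= 64,
                  "id and cpu must both be non-empty and fit in 64 bits");

    static constexpr std::uint64_t id_max  = (std::uint64_t(1) << IdBits) - 1;
    static constexpr std::uint64_t cpu_max = (std::uint64_t(1) << CpuBits) - 1;

    // id occupies the high bits, cpu the low bits.
    static std::uint64_t pack(std::uint64_t id, std::uint64_t cpu) {
        assert(id <= id_max && cpu <= cpu_max);
        return (id << CpuBits) | cpu;
    }
    static std::uint64_t id(std::uint64_t packed)  { return (packed >> CpuBits) & id_max; }
    static std::uint64_t cpu(std::uint64_t packed) { return packed & cpu_max; }
};

// Two illustrative splits of the same 64 bits.
using NarrowId = PackedIdCpu<40, 24>;  // more cpu bits, fewer id bits
using WideId   = PackedIdCpu<56, 8>;   // few cpu bits, more id bits
```

In this sketch, `WideId::pack(std::uint64_t(1) << 40, 3)` round-trips, while the same id would exceed `NarrowId::id_max`. The trade-off is that every bit moved to id halves the number of distinguishable cpu values.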

Another question: IIRC there's a safeguard in AMReX that would abort if we exceed the range of possible IDs, right? I think I saw this in NextID. Otherwise we should put one in HiPACE++.
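A safeguard of the kind asked about here might look like the sketch below. `IdGenerator` is a hypothetical class, not the AMReX NextID implementation, and it throws an exception where a production code might abort instead.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>

// Hypothetical id generator that refuses to hand out ids beyond max_id.
class IdGenerator {
public:
    explicit IdGenerator(std::uint64_t max_id) : m_issued(0), m_max(max_id) {
        assert(max_id > 0);  // at least one valid id is expected
    }

    // Returns ids 1, 2, ..., max_id, then throws on every later call.
    std::uint64_t next() {
        if (m_issued == m_max) {
            throw std::overflow_error("particle id range exhausted");
        }
        return ++m_issued;
    }

private:
    std::uint64_t m_issued;  // number of ids handed out so far
    std::uint64_t m_max;     // largest id that may be handed out
};
```

Counting issued ids rather than storing the next id avoids a wrap-around when `max_id` equals the largest 64-bit value.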

MaxThevenet

Great, thanks for this PR!

Upadate Beam and Plasma to Pure SoA

a58ecf3

AlexanderSinn added the component: plasma and component: beam labels May 2, 2023

ax3l requested review from ax3l and atmyers May 2, 2023 16:51

MaxThevenet changed the title ~~[WIP] Upadate Beam and Plasma to Pure SoA~~ [WIP] Update Beam and Plasma to Pure SoA May 3, 2023

AlexanderSinn and others added 16 commits May 9, 2023 16:59

Merge branch 'development' into Upadate_Beam_and_Plasma_to_Pure_SoA

fbb7e81

Fix merge

f733b79

more merge fixes

48efdb6

more merge fixes

11aa7bd

update binning

53d84fc

fix xy

034deb8

merge dev

7bb6217

fix merge

db57e25

fix bug

133c9fe

ExplicitDeposition use pdt

1d6911c

beam cleaning

5906cd7

Merge branch 'development' into Upadate_Beam_and_Plasma_to_Pure_SoA

5ec59e9

fix merge

d93755c

cleaning

7dd223b

better psize

d559428

initialize cpu

5d99771

AlexanderSinn changed the title ~~[WIP] Update Beam and Plasma to Pure SoA~~ Update Beam and Plasma to Pure SoA May 25, 2023

AlexanderSinn requested review from MaxThevenet and SeverinDiederichs May 26, 2023 11:28

Small optimization

2cdadbf

ax3l reviewed May 30, 2023


src/particles/beam/BeamParticleContainer.H (outdated)

src/particles/beam/BeamParticleContainerInit.cpp (outdated)

AlexanderSinn and others added 2 commits May 30, 2023 19:49

use real_nattribs

3677fda

Merge branch 'development' into Upadate_Beam_and_Plasma_to_Pure_SoA

603613e

AlexanderSinn mentioned this pull request May 31, 2023

Add [z], [z^2], [uz], [uz^2] and [z*uz] to beam insitu diagnostics #964

Merged

9 tasks

MaxThevenet reviewed May 31, 2023


AlexanderSinn added 4 commits June 1, 2023 01:07

Fix format and remove getpos setpos

81dc53b

Merge branch 'Upadate_Beam_and_Plasma_to_Pure_SoA' of https://github.…

d5c590f

…com/AlexanderSinn/hipace into Upadate_Beam_and_Plasma_to_Pure_SoA

merge

bf035ff

remove getPosition

e060a55

MaxThevenet self-requested a review June 1, 2023 05:04

MaxThevenet approved these changes Jun 1, 2023


MaxThevenet merged commit 5dd5d23 into Hi-PACE:development Jun 1, 2023
