Run NREL 5 MW, 16 turbine case, and launch ASAP #100

Open
pscrozi opened this issue May 8, 2024 · 19 comments
@pscrozi
Collaborator

pscrozi commented May 8, 2024

Might not get as far as we want at current pace, but we will keep this running in the background.

@pscrozi
Collaborator Author

pscrozi commented May 9, 2024

Shreyas is going to start this today (5/9) or soon thereafter. Lawrence is helping him.

Running into issues with the setup script. Might need to ping Jon for help.

@psakievich
Collaborator

No @sbidadi9 in the meeting today. Will follow up on Slack.

@psakievich
Collaborator

@jrood-nrel ran the case for 200 timesteps. We still need to update the refinements and run for longer times. @lawrenceccheung gave an introduction to the amr-wind front end for configuring new refinement boxes.

@sbidadi9

sbidadi9 commented May 14, 2024

@pscrozi @psakievich A quick update:

I was finally able to modify Jon’s script and submit the 16 turbine case on Frontier. It’s in the queue right now.

Next step is to learn how to restart the simulations. I will reach out to Phil or Jon regarding this.

Btw, I will be on PTO for a week starting tomorrow.

@psakievich
Collaborator

Lawrence will configure smaller mesh refinements and then the group will start trying to run that case as a production run for the milestone.

@pscrozi
Collaborator Author

pscrozi commented May 17, 2024

Lawrence: was able to get the original version running with the mesh as is and the executable that Jon provided. Was also able to get a smaller-mesh version of this. Success might depend on the number of nodes we run this on. Previously ran on 64 and 384 nodes. Not clear yet what number of nodes will be optimal. Then we'll run it to completion.

@pscrozi
Collaborator Author

pscrozi commented May 20, 2024

Lawrence: built a much smaller mesh (10x smaller than before). It runs on 384 nodes, but is not working on 512. No ability to restart due to OpenFAST. We'll need to reconfigure the way we set up the case, or the naming of the output. Testing it out for now, but will run it in production mode later. Need more output modes, and need to get restarts going, before production mode.

Phil: could we dump restart files into separate directories? Yes. Could do precursors as well, but not necessary.
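
A minimal sketch of the "separate directories" idea, assuming one restart directory per turbine and per step. The directory layout and the names `restart_dir` and `make_restart_dirs` are hypothetical; this is not how OpenFAST or Nalu-Wind name their checkpoint files, only an illustration of keeping each turbine's restart output from colliding with the others.

```python
from pathlib import Path


def restart_dir(root: str, turbine_index: int, step: int) -> Path:
    """Build a per-turbine, per-step restart directory path (hypothetical layout)."""
    return Path(root) / f"turbine_{turbine_index:02d}" / f"step_{step:07d}"


def make_restart_dirs(root: str, n_turbines: int, step: int) -> list[Path]:
    """Create one restart directory per turbine and return the paths."""
    dirs = [restart_dir(root, i, step) for i in range(n_turbines)]
    for d in dirs:
        # parents=True creates the turbine_XX level; exist_ok makes reruns safe.
        d.mkdir(parents=True, exist_ok=True)
    return dirs


if __name__ == "__main__":
    # 16 turbines, as in this case; the root path and step number are made up.
    for d in make_restart_dirs("restart_files", n_turbines=16, step=200):
        print(d)
```

Putting the step number in the path also keeps older checkpoints around, so a bad restart file does not overwrite the last good one.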

@pscrozi
Collaborator Author

pscrozi commented May 29, 2024

Lawrence: can run the case with old case, but can't restart OpenFAST checkpoint. Serious problem. Probably have to go into OpenFAST code and figure this out. Ashesh probably wasn't using OpenFAST restarts. Looked at it with Nate on Friday. Tried different directories, but it didn't work for all of the turbines, only the first turbine.

@pscrozi
Collaborator Author

pscrozi commented Jun 3, 2024

Lawrence: talked to Nate on Thursday or Friday. Issue with writing out restart files. There's a possible fix for this. Files are now there, but segfaulting on restart. Nate opened up a pull request on OpenFAST to try to get this working.

@pscrozi
Collaborator Author

pscrozi commented Jun 5, 2024

Lawrence is running this one with a new executable that can do restarts. This still needs to be tested, but hopefully it works.

@pscrozi
Collaborator Author

pscrozi commented Jun 10, 2024

Nate: the dev commit from the end of April works, but the most recent build that Derek provided does not work, because of changes to the OpenFAST registry that were made due to breakage with clang. The branch works and hopefully has the restart issue resolved, but some other issue was introduced in the meantime. Issues keep getting introduced on the dev branch. Suggest getting a build that Nate knows is working and getting that going on Frontier. Too many issues are going into OpenFAST dev, so we might want to pin a version for FSI work.

@pscrozi
Collaborator Author

pscrozi commented Jun 12, 2024

Shreyas: started building exawind-manager. I'll verify if the 16 turbine case runs with this latest build and branch.

Ilker was also able to run this.

Nate: does seem to be working for people. On Frontier, transferring over the 2 turbine case. Need to test the 16 turbine case restart quickly (can Shreyas do this?). The third restart needs to be verified for the 16 turbine case. It should work as long as it is set up correctly: turb_id needs to be set to a different number for each turbine, even if they are in the same Nalu-Wind instance.
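
A quick pre-launch check for the turb_id requirement could look like the sketch below. It assumes the turbine settings have already been loaded into a list of dictionaries with a `turb_id` key; that data layout and the helper name `duplicate_turb_ids` are hypothetical and not the actual Nalu-Wind input format.

```python
from collections import Counter


def duplicate_turb_ids(turbines: list[dict]) -> list[int]:
    """Return the sorted turb_id values that are used by more than one turbine."""
    counts = Counter(t["turb_id"] for t in turbines)
    return sorted(tid for tid, n in counts.items() if n > 1)


if __name__ == "__main__":
    # Hypothetical 16 turbine setup in which two turbines share turb_id 0.
    turbines = [{"name": f"T{i:02d}", "turb_id": i} for i in range(16)]
    turbines[5]["turb_id"] = 0
    dupes = duplicate_turb_ids(turbines)
    if dupes:
        print(f"turb_id values used more than once: {dupes}")
```

An empty result means every turbine has its own id, which is the condition described above for restarts to work.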

@pscrozi
Collaborator Author

pscrozi commented Jun 17, 2024

Lawrence: ran this case with the OpenFAST build. No one has yet confirmed it restarts, so Lawrence is doing this. Question for Ganesh: when we restart, do we have to worry about blending or ramping? No. Ramping and blending should be fine with restarts. As long as the time index is correct, we should be fine. Did we decide on the right number of nodes to run this case on? (No one spoke up.)

Nate: Lawrence and I met and did updates for restarts. We came up with a list for each for restarts. Lawrence will manually try to restart the 16 turbine case, and then try to script it up for future other cases. Will know within a couple hours whether it works.

@lawrenceccheung
Collaborator

Restarting with Nate's OpenFAST-dev branch has been confirmed to work on Frontier. Also created a version of the 16 turbine case to run for production here: https://github.com/lawrenceccheung/exawind-cases/tree/FY24Q3/16_turb_abl_fsi_FY24Q3

@pscrozi
Collaborator Author

pscrozi commented Jun 24, 2024

Nate's OpenFAST-dev branch runs on Frontier and restarts too! Testing for speed. Running on 64, 192, 384, etc. nodes. Performance levels off at some point. 12 s / timestep is where we're at now. Should we launch now? In a 12 hour run on Frontier, we could get 11 or 12 seconds of simulated time, so about 1 sec simulation time per hour. 900 seconds would be good to get, which would take about 900 hours on Frontier, with many restarts. Not feasible.
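
The arithmetic above can be written out as a short sketch. The inputs (12 s per timestep, about 1 simulated second per wall-clock hour, a 900 s target, 12 hour jobs) are the figures quoted in this comment; the implied timestep is only a back-of-envelope value derived from them, not a number stated in the thread, and the function names are made up for illustration.

```python
import math

# Figures quoted in the comment above.
WALL_S_PER_STEP = 12.0      # wall-clock seconds per timestep
SIM_S_PER_WALL_HOUR = 1.0   # "about 1 sec simulation time per hour"
TARGET_SIM_S = 900.0        # simulated seconds we would like to reach
JOB_HOURS = 12.0            # length of one Frontier run


def steps_per_wall_hour(wall_s_per_step: float) -> float:
    """Timesteps completed in one hour of wall-clock time."""
    return 3600.0 / wall_s_per_step


def implied_time_step(sim_s_per_wall_hour: float, wall_s_per_step: float) -> float:
    """Simulation timestep implied by the observed throughput (derived, not quoted)."""
    return sim_s_per_wall_hour / steps_per_wall_hour(wall_s_per_step)


def wall_hours_to_reach(target_sim_s: float, sim_s_per_wall_hour: float) -> float:
    """Wall-clock hours needed to reach the target simulated time."""
    return target_sim_s / sim_s_per_wall_hour


def jobs_required(wall_hours: float, job_hours: float) -> int:
    """Number of back-to-back jobs (and hence restarts) of a given length."""
    return math.ceil(wall_hours / job_hours)


if __name__ == "__main__":
    hours = wall_hours_to_reach(TARGET_SIM_S, SIM_S_PER_WALL_HOUR)
    print(f"steps per wall hour: {steps_per_wall_hour(WALL_S_PER_STEP):.0f}")
    print(f"implied timestep:    {implied_time_step(SIM_S_PER_WALL_HOUR, WALL_S_PER_STEP):.5f} s")
    print(f"wall hours needed:   {hours:.0f}")
    print(f"12 hour jobs needed: {jobs_required(hours, JOB_HOURS)}")
```

With these inputs the 900 hour estimate corresponds to 75 consecutive 12 hour jobs, each needing a restart.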

Nate: we could try to increase the Courant number further. Ganesh: might not get convergence, particularly with AMR-Wind.
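
For reference, a rough sketch of why a larger Courant number would help, using the standard definition CFL = u * dt / dx. The cell size and flow speed below are placeholder values, not numbers from this case, and the wall-time scaling assumes the cost per timestep stays the same when the timestep grows, which may not hold if convergence suffers as Ganesh points out.

```python
def cfl_time_step(cfl: float, dx: float, speed: float) -> float:
    """Timestep allowed by a target Courant number: dt = CFL * dx / u."""
    if dx <= 0.0 or speed <= 0.0:
        raise ValueError("dx and speed must be positive")
    return cfl * dx / speed


def scaled_wall_hours(baseline_hours: float, cfl_old: float, cfl_new: float) -> float:
    """Wall hours after changing CFL, assuming a fixed cost per timestep."""
    return baseline_hours * cfl_old / cfl_new


if __name__ == "__main__":
    # Placeholder values for illustration only.
    dx, speed = 2.0, 10.0
    for cfl in (0.5, 1.0, 2.0):
        dt = cfl_time_step(cfl, dx, speed)
        hours = scaled_wall_hours(900.0, cfl_old=0.5, cfl_new=cfl)
        print(f"CFL={cfl}: dt={dt:.3f} s, wall hours for the same run ~ {hours:.0f}")
```

Under that assumption the wall time falls in proportion to the increase in CFL, so even doubling it would still leave hundreds of hours for a 900 s run.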

@pscrozi
Collaborator Author

pscrozi commented Jun 26, 2024

Lawrence: changed size of AMR-Wind grid (smaller now). Going to run as long as possible. Bundled together with other jobs. It is now running. Using all 64 cores. Not going to touch it since it is running, but wasting just a few cores.

@pscrozi
Collaborator Author

pscrozi commented Jul 17, 2024

It is running, but there is no controller on.

@pscrozi
Collaborator Author

pscrozi commented Jul 22, 2024

At 80 seconds now.

@pscrozi
Collaborator Author

pscrozi commented Aug 12, 2024

Now at 133 seconds for this FSI case.

Haven't put these back on the queue because we're only getting about 1 second of simulated time per hour of wall clock.

It may not make sense to run these further out, since the FSI and ALM-only results are quite similar.

4 participants