
The docker images with the compiled CROCO binaries should be configurable #33

Open · zachsa opened this issue Apr 12, 2023 · 11 comments · Fixed by #36

Labels: enhancement (New feature or request)

zachsa (Collaborator) commented Apr 12, 2023

Algoa Bay Model
Here is the Dockerfile for compiling the Algoa Bay CROCO image: https://github.com/SAEON/somisana/blob/stable/models/algoa-bay-forecast/croco.Dockerfile

Some of the settings in param.h, cppdefs.h, and the launch script (run_croco.bash) should be configurable. To achieve this:

  • The Dockerfile needs to be updated to accept build args
  • Then after copying the param.h, cppdefs.h, and run_croco.bash files into the Docker build context (see these lines), those files need to be updated depending on what build args are passed (using string replacement), as sketched below.
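
A minimal sketch of the build-arg approach, assuming hypothetical arg names (NP_XI/NP_ETA) and a sed pattern that would need checking against the real param.h - this is not the repo's actual Dockerfile:

    # Hedged sketch (Dockerfile fragment): accept the MPI tiling as build args
    # and substitute the values into param.h before compiling with jobcomp.
    ARG NP_XI=2
    ARG NP_ETA=2
    COPY param.h cppdefs.h run_croco.bash ./
    RUN sed -i \
          "s/NP_XI=[0-9]*/NP_XI=${NP_XI}/; s/NP_ETA=[0-9]*/NP_ETA=${NP_ETA}/" \
          param.h
    RUN ./jobcomp

The tiling would then be set per environment with something like: docker build --build-arg NP_XI=3 --build-arg NP_ETA=4 ...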

What if the defaults were suitable for local computer execution, @GilesFearon? Then in production the docker build gets bespoke commands. I think these commands have to be passed at build time, as once the image is built it contains the compiled Fortran executable.

As far as I can remember from the CROCO workshop, there are two places to update when configuring how many cores to use (please check these @GilesFearon / @mattcarr03):

  • param.h (the MPI tiling parameters)
  • run_croco.bash (the mpirun invocation)

Are there any other easily configurable params?
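
For context, the MPI decomposition in CROCO's param.h is set by lines roughly like the following (quoted from memory, so the exact values and formatting should be verified against the source):

    #ifdef MPI
          integer NP_XI, NP_ETA, NNODES
          parameter (NP_XI=3, NP_ETA=4, NNODES=NP_XI*NP_ETA)
    #endif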

False Bay Model
Same as above, just in the False Bay folder.

GilesFearon (Collaborator) commented

I think it would be really cool to be able to run the system on a regular 4-core laptop (or a 32-core desktop, for that matter). It would make it a lot more generic.

Yup, as far as I can remember (it's been so long since I ran CROCO!), param.h and run_croco.bash would be the only places where you need to configure the number of cores. I guess we could just define $NP_X and $NP_ETA as environment variables (then you don't even need to do string replacement, right?).

My understanding is that to run the model locally, we currently need to run the docker image with the CROCO binary already baked in from the latest iteration of the forecast system? So I'm still not sure how it becomes easily configurable from the local user's point of view if we keep it like that? Unless the user has to re-build their own docker image, which is not ideal.

Would it be an idea to not compile the code when creating the docker image (all the other steps would remain), but instead do that as a first step in run_croco.bash? We would just need one line to run jobcomp and another to move the executable, something like the sketch below.
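
A minimal sketch of what that could look like in run_croco.bash (the log file name and RUN_DIR are hypothetical):

    # Hedged sketch: compile at container start instead of at image build time.
    ./jobcomp > jobcomp.log 2>&1    # compile CROCO from the sources already in the image
    mv croco "${RUN_DIR}/croco"     # RUN_DIR stands in for the actual run directory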

I can't think of other params which we would want to make easily configurable. Any changes to cppdefs.h would constitute a new configuration, which I think would mean you're changing the source code of the system.

zachsa (Collaborator, Author) commented Apr 13, 2023

Okay. Compiling at runtime makes sense.

So, two environment variables:

  • NP_X
  • NP_ETA

And then NUM_CORES = NP_X * NP_ETA. A sketch of how run_croco.bash could consume these follows below.
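
A sketch, assuming CROCO's param.h names (NP_XI rather than NP_X, per the commits below) and a sed pattern that would need checking against the real file:

    # Hedged sketch: default the tiling, substitute into param.h, compile, run.
    NP_XI="${NP_XI:-1}"
    NP_ETA="${NP_ETA:-1}"
    NUM_CORES=$((NP_XI * NP_ETA))
    sed -i "s/NP_XI=[0-9]*/NP_XI=${NP_XI}/; s/NP_ETA=[0-9]*/NP_ETA=${NP_ETA}/" param.h
    ./jobcomp
    mpirun -np "${NUM_CORES}" ./croco croco.in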

zachsa added a commit that referenced this issue Apr 17, 2023:
…ETA and NP_XI, and therefore indirectly NNODES at docker build time. Also updated the actions workflow to override the default values in the dockerfile to specify NNODES=12 (#33). The header files do not use the dynamic values yet
zachsa added a commit that referenced this issue Apr 17, 2023:
…nditional assignment in the actions workflow yml (#33)
zachsa added a commit that referenced this issue Apr 17, 2023:
…environment variables prior to precompilation (#33)
zachsa (Collaborator, Author) commented Apr 18, 2023

Next steps

  1. Move the param.h configuration code from the Dockerfile into the run_croco.bash script
  2. Move the compilation code from the Dockerfile into the run_croco.bash script
  3. Apply the same changes to False Bay model

GilesFearon (Collaborator) commented

Documenting the progress like this is really cool! Would it be possible to do the compilation step as a separate "job" in the algoa and false bay workflows, rather than all in the run execution step? That way the user could test the code compilation separately from running the model. Or is this creating unnecessary work?

zachsa (Collaborator, Author) commented Apr 18, 2023

That's currently how it's done. The croco executable is baked into the croco image, so there are two jobs (sketched below):

  1. Compile the croco exe. https://github.com/SAEON/somisana/blob/stable/.github/workflows/run_algoa-bay-forecast.yml#L63
  2. Run the croco exe. https://github.com/SAEON/somisana/blob/stable/.github/workflows/run_algoa-bay-forecast.yml#L300
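
In outline, the workflow looks something like this (job and step names are illustrative, not the exact ones in run_algoa-bay-forecast.yml, and in practice the image would be pushed to a registry between jobs):

    jobs:
      compile-croco:
        runs-on: ubuntu-latest
        steps:
          - uses: actions/checkout@v3
          # Build the image so it contains the compiled CROCO exe
          - run: docker build -f models/algoa-bay-forecast/croco.Dockerfile -t croco .
      run-croco:
        needs: compile-croco
        runs-on: ubuntu-latest
        steps:
          # Run the baked-in exe via the launch script
          - run: docker run --rm croco ./run_croco.bash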

zachsa (Collaborator, Author) commented Apr 18, 2023

If you build the same image twice, it's usually quick the second time because all the image layers are cached locally.

At the moment the compile-croco step runs on a shared Microsoft-hosted runner, so there's no caching. Still quick, though.

There are also other options, like an image with all the libs for running the Fortran compiler that you then use to compile various models. This might be a good idea for supporting lots and lots of models, for example:
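
A sketch of that idea, with a hypothetical croco-compiler image name and mount paths:

    # Hedged sketch: one shared image with the Fortran toolchain and libs,
    # invoked once per model with that model's sources mounted in.
    docker run --rm \
      -v "$(pwd)/models/algoa-bay-forecast:/src" \
      -w /src \
      croco-compiler ./jobcomp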

zachsa added the enhancement label Apr 18, 2023
GilesFearon (Collaborator) commented

I guess what I mean is that the "Compile the croco exe" step would be split into two steps - one to create a docker image with all the packages installed to do the compilation, and the other the compilation itself, which just runs the docker image while providing the (configurable) param.h and cppdefs.h inputs. Does that make sense?

Hey you shouldn't be looking at this today!

zachsa (Collaborator, Author) commented Apr 19, 2023

Does the exe need all the installed libs to run after it's been compiled?

Yes - splitting the compilation / execution sounds like a good idea. Then we can have a running container that we can pass a grid/param.h/cppdefs.h to, and get back a compiled app. Then we can run that compiled app in another docker container built on the same image.

This would support a workflow where users can define a grid on a UI, and get back an exe that they can then schedule for running.

On the web side of things, creating grids/exes from user interactions on a client involves sending the UI information to a web server, and then that web server (node.js currently) needs to run the commands to create the grid/compile the model. There are a couple of ways to do this:

  1. The web server docker image can have all the libs installed
  2. You can mount the host docker socket into the webserver container, install the docker cli into the container, and run docker commands on the host. But this isn't such a good idea, as it gives root access to the host from within the container.
  3. You can pass the UI information to another docker container that will then pass back grids/exes.

(1) or (3) is best. I like (3), as otherwise the Docker image for the somisana webserver is heavier than it needs to be, and (3) will allow for easier scaling. A sketch of the compile-then-run flow in (3) follows below.
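
A sketch of the compile-then-run handover, using a shared volume and hypothetical image/volume names:

    # Hedged sketch: compile in one container, then run the exe in another
    # container built on the same image, handing it over via a shared volume.
    docker volume create croco-build
    docker run --rm -v croco-build:/build croco-compiler \
      sh -c "./jobcomp && cp croco /build/"
    docker run --rm -v croco-build:/build croco-compiler \
      mpirun -np 4 /build/croco croco.in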

GilesFearon (Collaborator) commented

"Does the exe need all the installed libs to run after it's been compiled?" I think so yes, the exe would need to be run on the same system on which it was compiled, with all the installed libs. I think that would be achieved with the workflow you describe? Using the same image to first do the compilation and then run the exe?

As for the web side, you are describing a situation where the user builds the grid from a UI? This sounds really cool, but not something I see us working on in the near future. I think there is a lot of scope for UI interaction with our existing models, particularly the particle tracking stuff.

zachsa (Collaborator, Author) commented Apr 19, 2023

Yes - not on the cards yet, but it's nice to structure the work in a way that easily supports this going forward (where it's easy to do so).

zachsa (Collaborator, Author) commented Apr 19, 2023

The CROCO executable can now be configured at build time to support different NNODES configurations. Next step is to make that configurable at runtime, and also reassess how the Docker image is used. This continues at #35

zachsa closed this as completed Apr 19, 2023
zachsa reopened this Apr 19, 2023
zachsa linked a pull request Apr 19, 2023 that will close this issue
zachsa added a commit that referenced this issue Apr 19, 2023:
Updated documentation to reflect usage discussed in #33