LagrangeBench: Lagrangian Fluid Mechanics Benchmarking Suite

NeurIPS page with video and slides here.

Table of Contents

  1. Installation
  2. Usage
  3. Datasets
  4. Pretrained Models
  5. Directory Structure
  6. Contributing
  7. Citation

Installation

Standalone library

Install the core lagrangebench library from PyPI with

python3.10 -m venv venv
source venv/bin/activate
pip install lagrangebench --extra-index-url=https://download.pytorch.org/whl/cpu

Note that by default lagrangebench is installed without JAX GPU support. For that follow the instructions in the GPU support section.

Clone

Clone this GitHub repository

git clone https://github.com/tumaer/lagrangebench.git
cd lagrangebench

Install the dependencies with Poetry (>=1.6.0)

poetry install --only main

Alternatively, a requirements file is provided. It directly installs the CUDA version of JAX.

pip install -r requirements_cuda.txt

For a CPU version of the requirements file, one could use docs/requirements.txt.

GPU support

To run JAX on GPU, follow Installing JAX, or in general run

pip install -U "jax[cuda12]==0.4.29"

Note: as of 27.06.2024, to make our GNN models deterministic on GPUs, you need to set os.environ["XLA_FLAGS"] = "--xla_gpu_deterministic_ops=true". However, all current models rely on scatter_sum, and in deterministic mode this operation seems to be slower than running a normal for-loop in Python; see #17844 and #10674.
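A minimal sketch of setting this flag from Python. It assumes the flag has to be in place before JAX starts up, so the safe spot is before the first `import jax`. The helper name `enable_deterministic_gpu_ops` is ours and not part of lagrangebench; it appends the flag to any existing XLA_FLAGS value instead of overwriting it.

```python
import os

DETERMINISTIC_FLAG = "--xla_gpu_deterministic_ops=true"


def enable_deterministic_gpu_ops(env=os.environ):
    """Append the deterministic-ops flag to XLA_FLAGS, keeping existing flags."""
    flags = env.get("XLA_FLAGS", "").split()
    if DETERMINISTIC_FLAG not in flags:
        flags.append(DETERMINISTIC_FLAG)
    env["XLA_FLAGS"] = " ".join(flags)
    return env["XLA_FLAGS"]


# Call this before the first `import jax` so the flag is seen at start-up.
enable_deterministic_gpu_ops()
```

Given the slowdown noted above, it may be worth enabling this only for runs where bitwise reproducibility matters.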

MacOS

Currently, only the CPU installation works. You will need to change a few small things to get it going:

  • Clone installation: in pyproject.toml change the torch version from 2.1.0+cpu to 2.1.0. Then, remove the poetry.lock file and run poetry install --only main.
  • Configs: You will need to set dtype=float32 and train.num_workers=0.

Although the current jax-metal==0.0.5 library supports JAX in general, a padding-related feature used by jax-md seems to be missing; see this issue.

Usage

Standalone benchmark library

A general tutorial is provided in the example notebook "Training GNS on the 2D Taylor Green Vortex" under ./notebooks/tutorial.ipynb on the LagrangeBench repository. The notebook covers the basics of LagrangeBench, such as loading a dataset, setting up a case, training a model from scratch and evaluating its performance.

Running in a local clone (main.py)

Alternatively, experiments can be set up with main.py, based on extensive YAML config files and CLI arguments (check configs/). By default, the arguments are resolved in the following order of priority: 1) passed CLI arguments, 2) YAML config, and 3) defaults.py (lagrangebench defaults).

When loading a saved model with load_ckp, the config from the checkpoint is automatically loaded and training is restarted. For more details, check the runner.py file.
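The priority order can be illustrated with a small, library-independent sketch using plain dictionaries. This is not how main.py is implemented; `merge_configs` and `parse_cli` are hypothetical helpers, and the default values below are made up purely to show which source wins.

```python
from copy import deepcopy


def merge_configs(base, override):
    """Recursively merge `override` into a copy of `base`; `override` wins."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_configs(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def parse_cli(args):
    """Turn ["train.num_workers=0", ...] into a nested dict (values stay strings)."""
    nested = {}
    for arg in args:
        dotted_key, value = arg.split("=", 1)
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


# Hypothetical values, lowest priority first.
defaults = {"mode": "all", "dtype": "float64", "train": {"num_workers": 4}}
yaml_cfg = {"dtype": "float32", "train": {"num_workers": 2}}
cli_args = ["train.num_workers=0"]

# Apply YAML on top of the defaults, then CLI arguments on top of both.
cfg = merge_configs(merge_configs(defaults, yaml_cfg), parse_cli(cli_args))
print(cfg)
```

Here `mode` keeps its default, `dtype` comes from the YAML file, and `train.num_workers` is taken from the command line.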

Train

For example, to start a GNS run from scratch on the RPF 2D dataset use

python main.py config=configs/rpf_2d/gns.yaml

Some model presets can be found in ./configs/.

If mode=all is provided, then training (mode=train) and subsequent inference (mode=infer) on the test split will be run in one go.

Restart training

To restart training from the last checkpoint in load_ckp use

python main.py load_ckp=ckp/gns_rpf2d_yyyymmdd-hhmmss

Inference

To evaluate a trained model from load_ckp on the test split (eval.test=True) use

python main.py load_ckp=ckp/gns_rpf2d_yyyymmdd-hhmmss/best eval.rollout_dir=rollout/gns_rpf2d_yyyymmdd-hhmmss/best mode=infer eval.test=True

If the default eval.infer.out_type=pkl is active, then the generated trajectories and a metricsYYYY_MM_DD_HH_MM_SS.pkl file will be written to eval.rollout_dir. The metrics file contains all eval.infer.metrics properties for each generated rollout.
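The metrics files can then be inspected with the standard library. The sketch below only assumes that the files are ordinary pickles whose names start with `metrics`, as described above; the internal layout of the loaded object is not specified here, so the helper (`load_metrics`, a name of our choosing) returns it unchanged.

```python
import pickle
from pathlib import Path


def load_metrics(rollout_dir):
    """Load every metrics*.pkl file found directly in `rollout_dir`.

    Returns a dict mapping file name to the unpickled object.
    """
    results = {}
    for path in sorted(Path(rollout_dir).glob("metrics*.pkl")):
        with path.open("rb") as f:
            results[path.name] = pickle.load(f)
    return results
```

For example, `load_metrics("rollout/gns_rpf2d_yyyymmdd-hhmmss/best")` would collect all metrics files of that run. As with any pickle, only load files you produced yourself or otherwise trust.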

Notebooks

We provide three notebooks that show LagrangeBench functionalities; they can be found under ./notebooks/ in the repository.

Datasets

The datasets are hosted on Zenodo under the DOI 10.5281/zenodo.10021925. If a dataset is not found in dataset.src, the data is downloaded automatically. Alternatively, to download the datasets manually, use the download_data.sh shell script with either a specific dataset name or "all":

  • Taylor Green Vortex 2D: bash download_data.sh tgv_2d datasets/
  • Reverse Poiseuille Flow 2D: bash download_data.sh rpf_2d datasets/
  • Lid Driven Cavity 2D: bash download_data.sh ldc_2d datasets/
  • Dam break 2D: bash download_data.sh dam_2d datasets/
  • Taylor Green Vortex 3D: bash download_data.sh tgv_3d datasets/
  • Reverse Poiseuille Flow 3D: bash download_data.sh rpf_3d datasets/
  • Lid Driven Cavity 3D: bash download_data.sh ldc_3d datasets/
  • All: bash download_data.sh all datasets/
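To fetch several datasets in one go from Python, the commands above can be wrapped as follows. This is a convenience sketch, not part of the repository: `download_command` and `download` are hypothetical helpers, and the script is assumed to be called from the repository root where download_data.sh lives.

```python
import subprocess

# Dataset names accepted by download_data.sh, as listed above.
DATASETS = ("tgv_2d", "rpf_2d", "ldc_2d", "dam_2d", "tgv_3d", "rpf_3d", "ldc_3d")


def download_command(name, target_dir="datasets/"):
    """Build the download_data.sh call for one dataset name or "all"."""
    if name != "all" and name not in DATASETS:
        raise ValueError(f"unknown dataset {name!r}; expected one of {DATASETS} or 'all'")
    return ["bash", "download_data.sh", name, target_dir]


def download(names, target_dir="datasets/", dry_run=True):
    """Print each command and run it only when dry_run is False."""
    for name in names:
        cmd = download_command(name, target_dir)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


download(["tgv_2d", "rpf_2d"])  # dry run: only prints the two commands
```

Pass `dry_run=False` to actually start the downloads.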

Pretrained Models

We provide pretrained model weights of our default GNS and SEGNN models on each of the 7 LagrangeBench datasets. You can download and run the checkpoints given below. In the table, we also provide the 20-step error measures on the full test split.

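| Dataset | Model       | MSE_20 | Sinkhorn | MSE_Ekin |
|---------|-------------|--------|----------|----------|
| 2D TGV  | GNS-10-128  | 5.9e-6 | 3.2e-7   | 4.9e-7   |
| 2D TGV  | SEGNN-10-64 | 4.4e-6 | 2.1e-7   | 5.0e-7   |
| 2D RPF  | GNS-10-128  | 4.0e-6 | 2.5e-7   | 2.7e-5   |
| 2D RPF  | SEGNN-10-64 | 3.4e-6 | 2.5e-7   | 1.4e-5   |
| 2D LDC  | GNS-10-128  | 1.5e-5 | 1.1e-6   | 6.1e-7   |
| 2D LDC  | SEGNN-10-64 | 2.1e-5 | 3.7e-6   | 1.6e-5   |
| 2D DAM  | GNS-10-128  | 3.1e-5 | 1.4e-5   | 1.1e-4   |
| 2D DAM  | SEGNN-10-64 | 4.1e-5 | 2.3e-5   | 5.2e-4   |
| 3D TGV  | GNS-10-128  | 5.8e-3 | 4.7e-6   | 4.8e-2   |
| 3D TGV  | SEGNN-10-64 | 5.0e-3 | 4.9e-6   | 3.9e-2   |
| 3D RPF  | GNS-10-128  | 2.1e-5 | 3.3e-7   | 1.8e-6   |
| 3D RPF  | SEGNN-10-64 | 1.7e-5 | 2.7e-7   | 1.7e-6   |
| 3D LDC  | GNS-10-128  | 4.1e-5 | 3.2e-7   | 1.9e-8   |
| 3D LDC  | SEGNN-10-64 | 4.1e-5 | 2.9e-7   | 2.5e-8   |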

To reproduce the numbers in the table, e.g., on 2D TGV with GNS, follow these steps:

# download the checkpoint (1) through the browser or 
# (2) using the file ID from the URL, i.e., for 2D TGV + GNS
gdown 19TO4PaFGcryXOFFKs93IniuPZKEcaJ37
# unzip the downloaded file `gns_tgv2d.zip`
python -c "import shutil; shutil.unpack_archive('gns_tgv2d.zip', 'gns_tgv2d')"
# evaluate the model on the test split
python main.py gpu=$GPU_ID mode=infer eval.test=True load_ckp=gns_tgv2d/best

Directory Structure

📦lagrangebench
 ┣ 📂case_setup     # Case setup manager
 ┃ ┣ 📜case.py      # CaseSetupFn class
 ┃ ┗ 📜features.py  # Feature extraction
 ┣ 📂data           # Datasets and dataloading utils
 ┃ ┣ 📜data.py      # H5Dataset class and specific datasets
 ┃ ┗ 📜utils.py
 ┣ 📂evaluate       # Evaluation and rollout generation tools
 ┃ ┣ 📜metrics.py
 ┃ ┣ 📜rollout.py
 ┃ ┗ 📜utils.py
 ┣ 📂models         # Baseline models
 ┃ ┣ 📜base.py      # BaseModel class
 ┃ ┣ 📜egnn.py
 ┃ ┣ 📜gns.py
 ┃ ┣ 📜linear.py
 ┃ ┣ 📜painn.py
 ┃ ┣ 📜segnn.py
 ┃ ┗ 📜utils.py
 ┣ 📂train          # Trainer method and training tricks
 ┃ ┣ 📜strats.py    # Training tricks
 ┃ ┗ 📜trainer.py   # Trainer method
 ┣ 📜defaults.py    # Default values
 ┣ 📜runner.py      # Runner wrapping training and inference
 ┗ 📜utils.py

Contributing

Welcome! We highly appreciate GitHub issues and PRs.

You can also chat with us on Discord.

Contributing Guideline

If you want to contribute to this repository, you will need the dev dependencies, i.e. install the environment with poetry install without the --only main flag. Then, we also recommend you install the pre-commit hooks if you don't want to manually run pre-commit run before each commit. To sum up:

git clone https://github.com/tumaer/lagrangebench.git
cd lagrangebench
poetry install
source $PATH_TO_LAGRANGEBENCH_VENV/bin/activate

# install pre-commit hooks defined in .pre-commit-config.yaml
# ruff is configured in pyproject.toml
pre-commit install

# if you want to bump the version in both pyproject.toml and __init__.py, do
poetry self add poetry-bumpversion
poetry version patch  # or minor/major

After you have run git add <FILE> and try to git commit, the pre-commit hook will fix the linting and formatting of <FILE> before you are allowed to commit.

You should also run the tests locally before creating a PR. Do this simply by:

# pytest is configured in pyproject.toml
pytest

Clone vs Library

LagrangeBench can be installed by cloning the repository or as a standalone library. This offers more flexibility, but it also comes with a disadvantage: some things have to be implemented twice. If you change any of the following, make sure to update its counterpart as well:

  • General setup in lagrangebench/runner.py and notebooks/tutorial.ipynb
  • Configs in configs/ and lagrangebench/defaults.py
  • Zenodo URLs in download_data.sh and lagrangebench/data/data.py
  • Dependencies in pyproject.toml, requirements_cuda.txt, and docs/requirements.txt
  • Library version in pyproject.toml and lagrangebench/__init__.py
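The last item, keeping the library version in sync, lends itself to an automated check. The sketch below assumes the version appears as `version = "x.y.z"` in pyproject.toml and as `__version__ = "x.y.z"` in lagrangebench/__init__.py; both patterns and the helper names are assumptions, so adjust them if the files are laid out differently.

```python
import re
from pathlib import Path


def read_version(path, pattern):
    """Return the first version string matching `pattern` in the file at `path`."""
    match = re.search(pattern, Path(path).read_text(), flags=re.MULTILINE)
    if match is None:
        raise ValueError(f"no version found in {path}")
    return match.group(1)


def versions_match(repo_root="."):
    """Compare the version in pyproject.toml with the one in __init__.py."""
    root = Path(repo_root)
    # Assumed form: version = "x.y.z" at the start of a line.
    pyproject = read_version(
        root / "pyproject.toml", r'^version\s*=\s*["\']([^"\']+)["\']'
    )
    # Assumed form: __version__ = "x.y.z" at the start of a line.
    package = read_version(
        root / "lagrangebench" / "__init__.py",
        r'^__version__\s*=\s*["\']([^"\']+)["\']',
    )
    return pyproject == package, pyproject, package
```

Calling `versions_match()` from the repository root returns a tuple of the comparison result and the two version strings, which makes a mismatch easy to report before opening a PR.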

Citation

The paper (at NeurIPS 2023 Datasets and Benchmarks) can be cited as:

@article{toshev2024lagrangebench,
  title={Lagrangebench: A lagrangian fluid mechanics benchmarking suite},
  author={Toshev, Artur and Galletti, Gianluca and Fritz, Fabian and Adami, Stefan and Adams, Nikolaus},
  journal={Advances in Neural Information Processing Systems},
  volume={36},
  year={2024}
}

The associated datasets can be cited as:

@dataset{toshev_2024_10491868,
  author       = {Toshev, Artur P. and Adams, Nikolaus A.},
  title        = {LagrangeBench Datasets},
  month        = jan,
  year         = 2024,
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.10491868},
  url          = {https://doi.org/10.5281/zenodo.10491868}
}

Publications

The following further publications are based on the LagrangeBench codebase:

  1. Learning Lagrangian Fluid Mechanics with E(3)-Equivariant Graph Neural Networks (GSI 2023), A. P. Toshev, G. Galletti, J. Brandstetter, S. Adami, N. A. Adams
  2. Neural SPH: Improved Neural Modeling of Lagrangian Fluid Dynamics (ICML 2024), A. P. Toshev, J. A. Erbesdobler, N. A. Adams, J. Brandstetter