diff --git a/README.md b/README.md
index 0ff1ed8..ad2cde0 100644
--- a/README.md
+++ b/README.md
@@ -1,23 +1,31 @@
 # `ai-models` For All
 
-This repository is intended to provide a template that users can adapt to start generating their own weather forecasts leveraging the "pure AI" NWP systems recently developed and open-sourced by Huawei, Nvidia, and Google DeepMind. We boot-strap on top of the fantastic [`ai-models`](https://github.com/ecmwf-lab/ai-models) repository published by ECMWF Labs, but implement our own wrappers to help end-users quickly get started generating their own forecasts.
+This package boot-straps on top of the fantastic [`ai-models`](https://github.com/ecmwf-lab/ai-models) library to build a serverless application to generate "pure AI
+NWP" weather forecasts on [Modal](https://www.modal.com). Users can run their own
+historical re-forecasts using either [PanguWeather](https://www.nature.com/articles/s41586-023-06185-3),
+[FourCastNet](https://arxiv.org/abs/2202.11214), or [GraphCast](https://www.science.org/doi/10.1126/science.adi2336),
+and save the outputs to their own cloud storage provider for further use.
 
-This is a **preview release** of this tool; it has a few limitations:
+The initial release of this application is fully-featured, but it has a few limitations:
 
-- We only provide one storage adapter, for Google Cloud Storage (we can expand this
-  to S3, Azure, or other providers as there is interest).
+- We only provide one storage adapter, for Google Cloud Storage. This can be generalized
+  to support S3, Azure, or any other provider in the future.
 - We only enable access to the CDS-based archive of ERA-5 data to initialize the
-  models (access via MARS will be forthcoming, but since most users will not have
-  these credentials, it wasn't a high priority).
+  models. To generate pseudo-operational forecasts, users should instead use the MARS
+  API, which offers access to very recent IFS analyses. However, given the licensing
+  restrictions of the AI models (PanguWeather and GraphCast are **forbidden** from being
+  used in commercial applications) and the cost/inaccessibility for most users, we defer
+  implementation of MARS for now.
 - The current application only runs on [Modal](https://www.modal.com); in the future, it
-  would be great to port this to other serverless platforms.
+  would be great to port this to other serverless platforms, re-using as much of the
+  core implementation as possible.
 
-Furthermore, we significantly rely on the fantastic [`ecmwf-labs/ai-models`](https://github.com/ecmwf-lab/ai-models)
+This application relies on the fantastic [`ecmwf-labs/ai-models`](https://github.com/ecmwf-lab/ai-models)
 package to automate a lot of the traditional MLOps that are necessary to run this
 type of product in a semi-production context. `ai-models` handles acquiring data to
 use as inputs for model inference (read: generate a forecast from initial conditions)
 by providing an as-needed interface with the Copernicus Data Store and MARS API, it
-provides pre-trained model weights shipped via ONNX, it implements a simple interface
+provides pre-trained model weights, it implements a simple ONNX-based interface
 for performing model inference, and it outputs a well-formed (albeit in GRIB) output
 file that can be fed into downstream workflows (e.g. model visualization).
 We don't anticipate replacing this package, but we may contribute improvements and features
@@ -28,15 +36,17 @@ files per model step, with metadata following CF conventions) as they mature her
 
 ## Usage / Restrictions
 
-If you use this package, please give credit to [Daniel Rothenberg](https://github.com/darothen)
+If you use this application, please give credit to [Daniel Rothenberg](https://github.com/darothen)
 ( or [@danrothenberg](https://twitter.com/danrothenberg)), as well as the incredible
 team at [ECMWF Lab](https://github.com/ecmwf-lab) and the publishers of any forecast
 model you use.
 
-**NOTE THAT EACH MODEL PROVIDED BY AI-MODELS HAS ITS OWN LICENSE AND RESTRICTION**.
+**NOTE THAT EACH FORECAST MODEL PROVIDED BY AI-MODELS HAS ITS OWN LICENSE AND RESTRICTIONS**.
 This package may *only* be used in a manner compliant with the licenses and terms of all
-the libraries, model weights, and services upon which it is built.
+the libraries, model weights, and application platforms/services upon which it is built.
+The forecasts generated by the AI models and the software which power them are *experimental in nature*
+and may break or fail unexpectedly during normal use.
 
 ## Quick Start
 
@@ -49,12 +59,14 @@ the libraries, model weights, and services upon which it is built.
    before running the application!**
 3. From a terminal, login with the `modal-client`
 4. Navigate to the repository on-disk and execute the command,
+
 ```shell
 $ modal run ai-models-modal.main [\
     --model-name {panguweather,fourcastnetv2-small,graphcast} \
-    --model_init 2023-07-01T00:00 \
-    --lead_time 12]
+    --model-init 2023-07-01T00:00:00 \
+    --lead-time 12]
 ```
+
 The first time you run this, it will take a few minutes to build an image and set up
 assets on Modal. Then, the model will run remotely on Modal infrastructure, and you
 can monitor its progress via the logs streamed to your terminal. The bracketed CLI
@@ -62,7 +74,7 @@ the libraries, model weights, and services upon which it is built.
 5. Download the model output from Google Cloud Storage at **gs://{GCS_BUCKET_NAME}**
    as provided via the `.env` file.
 
-## Getting Started
+## More Detailed Setup Instructions
 
 To use this demo, you'll need accounts set up on [Google Cloud](https://cloud.google.com),
 [Modal](https://www.modal.com), and the [Copernicus Data Store]().
@@ -164,4 +176,21 @@ easiest way to set this up would be to have the user retrieve their credentials
 from [here](https://cds.climate.copernicus.eu/api-how-to) and save them to a local
 file, `~/.cdsapirc`. But that's a tad inconvenient to build into our application
 image. Instead, we can just set the environment variables
-**CDSAPI_URL** and **CDSAPI_KEY**.
\ No newline at end of file
+**CDSAPI_URL** and **CDSAPI_KEY**. Note that we still create a stub RC file during
+image generation, but this is a shortcut so that users only need to modify a single
+file with their credentials.
+
+## Other Notes
+
+- The code here has liberal comments detailing development notes, caveats, gotchas,
+  and opportunities.
+- You may see some diagnostic outputs indicating that libraries including libexpat,
+  libglib, and libtpu are missing. These should not impact the functionality of the
+  current application (we've tested that all three AI models do in fact run
+  and produce expected outputs).
+- It should be *very* cheap to run this application; even accounting for the time it
+  takes to download model assets the first time a given AI model is run, most of the
+  models can produce a 10-day forecast in about 10-15 minutes. So end-to-end, for a
+  long forecast, the GPU container should really only be running for < 20 minutes,
+  which means that at today's (11-25-2023) market rates of $3.73/hr per A100 GPU, it
+  should cost a bit more than a dollar to generate a forecast, all-in.
\ No newline at end of file
diff --git a/ai-models-modal/main.py b/ai-models-modal/main.py
index 1f8767d..c91158d 100644
--- a/ai-models-modal/main.py
+++ b/ai-models-modal/main.py
@@ -137,10 +137,13 @@ def __enter__(self):
             model_args={},
             assets_sub_directory=None,
             staging_dates=None,
+            # TODO: Figure out if we can set up caching of model initial conditions
+            # using the default interface.
             archive_requests=False,
             only_gpu=True,
-            debug=False,  # Assumed set by GraphcastModel; produces additional auxiliary
+            # Assumed set by GraphcastModel; produces additional auxiliary
             # output NetCDF files.
+            debug=False,
         )
         logger.info("... done! Model is initialized and ready to run.")
 
@@ -229,7 +232,7 @@ def generate_forecast(
         logger.warning("Not able to access to Google Cloud Storage; skipping upload.")
         return
 
-    logger.info("Attempting to upload to GCS bucket gs://{bucket_name}...")
+    logger.info(f"Attempting to upload to GCS bucket gs://{bucket_name}...")
     gcs_handler = gcs.GoogleCloudStorageHandler.with_service_account_info(
         service_account_info
    )
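
For readers setting up credentials the way the final README hunk above describes (environment variables rather than a `~/.cdsapirc` file), here is a minimal sketch of that step. It is not part of the diff: the URL is the endpoint shown on the CDS `api-how-to` page at the time of writing, and the key value is a placeholder for your own `UID:API-key` pair.

```shell
# Sketch only: expose the CDS credentials the application reads from the environment.
# Replace the placeholder key with the UID:API-key string from your CDS account page.
export CDSAPI_URL="https://cds.climate.copernicus.eu/api/v2"
export CDSAPI_KEY="<uid>:<api-key>"
```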
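Similarly, Quick Start step 5 says to download the output from **gs://{GCS_BUCKET_NAME}** but does not show a command. A minimal sketch using `gsutil`, assuming the bucket name configured in your `.env` file and that the outputs sit at the bucket root (the local directory name is arbitrary):

```shell
# Sketch only: copy all forecast output from the configured bucket to a local folder.
# <GCS_BUCKET_NAME> is a placeholder for the bucket set in the .env file.
mkdir -p outputs
gsutil -m cp -r "gs://<GCS_BUCKET_NAME>/*" outputs/
```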