diff --git a/bundle/pythonic_usage_guidance/README.md b/bundle/pythonic_usage_guidance/README.md new file mode 100644 index 000000000..6fc779635 --- /dev/null +++ b/bundle/pythonic_usage_guidance/README.md @@ -0,0 +1,39 @@ +# Pythonic Bundle Access Tutorial + +A MONAI bundle contains the stored weights of a model, training, inference, post-processing transform sequences and other information. This tutorial aims to explore how to access a bundle in Python and use it in your own application. We'll cover the following topics: +1. Downloading the Bundle. +2. Creating a `BundleWorkflow`. +3. Getting Properties from the Bundle. +4. Updating Properties. +5. Using Components in Your Own Pipeline. +6. Utilizing Pretrained Weights from the Bundle. +7. A Simple Comparison of the Usage between `ConfigParser` and `BundleWorkflow`. + +The example training dataset is Task09_Spleen.tar from http://medicaldecathlon.com/. + +## Requirements + +The script is tested with: + +- `Ubuntu 20.04` | `Python 3.8.10` | `CUDA 12.2` | `Pytorch 1.13.1` + +- it is tested on 24gb single-gpu machine + +## Dependencies and installation + +### MONAI + +You can conda environments to install the dependencies. + +or you can just use MONAI docker. +```bash +docker pull projectmonai/monai:latest +``` + +For more information please check out [the installation guide](https://docs.monai.io/en/latest/installation.html). + +## Questions and bugs + +- For questions relating to the use of MONAI, please use our [Discussions tab](https://github.com/Project-MONAI/MONAI/discussions) on the main repository of MONAI. +- For bugs relating to MONAI functionality, please create an issue on the [main repository](https://github.com/Project-MONAI/MONAI/issues). +- For bugs relating to the running of a tutorial, please create an issue in [this repository](https://github.com/Project-MONAI/Tutorials/issues). diff --git a/bundle/pythonic_usage_guidance/pythonic_bundle_access.ipynb b/bundle/pythonic_usage_guidance/pythonic_bundle_access.ipynb new file mode 100644 index 000000000..c0910e2e9 --- /dev/null +++ b/bundle/pythonic_usage_guidance/pythonic_bundle_access.ipynb @@ -0,0 +1,569 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "Copyright (c) MONAI Consortium \n", + "Licensed under the Apache License, Version 2.0 (the \"License\"); \n", + "you may not use this file except in compliance with the License. \n", + "You may obtain a copy of the License at \n", + "    http://www.apache.org/licenses/LICENSE-2.0 \n", + "Unless required by applicable law or agreed to in writing, software \n", + "distributed under the License is distributed on an \"AS IS\" BASIS, \n", + "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. \n", + "See the License for the specific language governing permissions and \n", + "limitations under the License.\n", + "\n", + "# Accessing a Bundle Workflow in Python\n", + "\n", + "In this guide, we'll explore how to access a bundle in Python and use it in your own application. We'll cover the following topics:\n", + "\n", + "1. **Downloading the Bundle**: First, you'll need to download the bundle from its source. This can be done using the `download` API.\n", + "\n", + "2. **Creating a `BundleWorkflow`**: Once you have the bundle, you can create a `BundleWorkflow` object by passing the path to the bundle file as an argument to `create_worflow`.\n", + "\n", + "3. **Getting Properties from the Bundle**: You can then retrieve the properties of the bundle by directly accessing them. For example, to get the version of the bundle, you can use `workflow.version`.\n", + "\n", + "4. **Updating Properties**: If you need to update any of the properties, you can do so by directly overwriting them. For example, to update the max epochs of the bundle, you can use `workflow.max_epochs = 10`.\n", + "\n", + "5. **Using Components in Your Own Pipeline**: Finally, you can use the components from the bundle in your own pipeline by accessing them through the `BundleWorkflow` object.\n", + "\n", + "6. **Utilizing Pretrained Weights from the Bundle**: You can conveniently employ pretrained weights from the bundle and customize them using the `load` API.\n", + "\n", + "7. **A Simple Comparison of the Usage between `ConfigParser` and `BundleWorkflow`**\n", + "\n", + "The bundle documentation and specification can be found here: https://docs.monai.io/en/stable/bundle_intro.html" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup environment" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "!python -c \"import monai\" || pip install -q \"monai-weekly[gdown, nibabel, tqdm, ignite]\"" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup imports" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "import os\n", + "import shutil\n", + "import tempfile\n", + "from pathlib import Path\n", + "import monai\n", + "from monai.engines import EnsembleEvaluator\n", + "from monai.networks.nets import SegResNet\n", + "from monai.transforms import MeanEnsembled, Compose\n", + "from monai.config import print_config\n", + "from monai.apps import download_and_extract\n", + "from monai.bundle import download, create_workflow, ConfigParser, load\n", + "\n", + "print_config()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Setup data directory\n", + "\n", + "You can specify a directory with the `MONAI_DATA_DIRECTORY` environment variable. \n", + "This allows you to save results and reuse downloads. \n", + "If not specified a temporary directory will be used." + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "/workspace/Data\n" + ] + } + ], + "source": [ + "directory = os.environ.get(\"MONAI_DATA_DIRECTORY\")\n", + "root_dir = tempfile.mkdtemp() if directory is None else directory\n", + "print(root_dir)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Download dataset\n", + "\n", + "Downloads and extracts the dataset. \n", + "The dataset comes from http://medicaldecathlon.com/." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "resource = \"https://msd-for-monai.s3-us-west-2.amazonaws.com/Task09_Spleen.tar\"\n", + "md5 = \"410d4a301da4e5b2f6f86ec3ddba524e\"\n", + "\n", + "compressed_file = os.path.join(root_dir, \"Task09_Spleen.tar\")\n", + "data_dir = os.path.join(root_dir, \"Task09_Spleen\")\n", + "if not os.path.exists(data_dir):\n", + " download_and_extract(resource, compressed_file, root_dir, md5)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Downloading the Bundle" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "name spleen_ct_segmentation\n", + "version None\n", + "bundle_dir /workspace/Data\n", + "source github\n", + "repo None\n", + "url None\n", + "remove_prefix monai_\n", + "progress True\n", + "2023-09-06 08:44:11,165 - INFO - --- input summary of monai.bundle.scripts.download ---\n", + "2023-09-06 08:44:11,167 - INFO - > name: 'spleen_ct_segmentation'\n", + "2023-09-06 08:44:11,167 - INFO - > bundle_dir: '/workspace/Data'\n", + "2023-09-06 08:44:11,167 - INFO - > source: 'github'\n", + "2023-09-06 08:44:11,167 - INFO - > remove_prefix: 'monai_'\n", + "2023-09-06 08:44:11,168 - INFO - > progress: True\n", + "2023-09-06 08:44:11,168 - INFO - ---\n", + "\n", + "\n", + "2023-09-06 08:44:12,165 - INFO - Expected md5 is None, skip md5 check for file /workspace/Data/spleen_ct_segmentation_v0.5.3.zip.\n", + "2023-09-06 08:44:12,165 - INFO - File exists: /workspace/Data/spleen_ct_segmentation_v0.5.3.zip, skipped downloading.\n", + "2023-09-06 08:44:12,166 - INFO - Writing into directory: /workspace/Data.\n" + ] + } + ], + "source": [ + "download(name=\"spleen_ct_segmentation\", bundle_dir=root_dir)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Creating a `BundleWorkflow`\n", + "In this section, we demonstrate how to create and initialize a `BundleWorkflow` using the `create_workflow` function. This function supports the creation of both config-based bundles by providing the necessary config files and python-based bundles by specifying a bundle workflow name. The specified name should be a subclass of `BundleWorkflow` and be accessible for import.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "workflow_name None\n", + "config_file /workspace/Data/spleen_ct_segmentation/configs/train.json\n", + "workflow_type train\n", + "2023-09-06 09:18:47,393 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-09-06 09:18:47,395 - INFO - > config_file: '/workspace/Data/spleen_ct_segmentation/configs/train.json'\n", + "2023-09-06 09:18:47,396 - INFO - > workflow_type: 'train'\n", + "2023-09-06 09:18:47,397 - INFO - ---\n", + "\n", + "\n", + "2023-09-06 09:18:47,397 - INFO - Setting logging properties based on config: /workspace/Data/spleen_ct_segmentation/configs/logging.conf.\n" + ] + } + ], + "source": [ + "config_file = Path(root_dir) / \"spleen_ct_segmentation\" / \"configs\" / \"train.json\"\n", + "\n", + "train_workflow = create_workflow(config_file=str(config_file), workflow_type=\"train\")" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Getting Properties from the Bundle\n", + "To access properties from the bundle, please refer to the list of supported properties available [here](https://docs.monai.io/en/latest/mb_properties.html).\n", + "\n", + "You can also utilize the `add_property` method of the `BundleWorkflow` object to add properties for application requirements checking and access.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + } + ], + "source": [ + "# for existing properties\n", + "# Note that the properties got from `train_workflow` is already instantiated.\n", + "train_preprocessing = train_workflow.train_preprocessing\n", + "\n", + "# for meta information\n", + "version = train_workflow.version\n", + "\n", + "# add properties\n", + "train_workflow.add_property(name=\"lr_scheduler\", required=True, config_id=\"lr_scheduler\")\n", + "print(train_workflow.lr_scheduler)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Updating Properties\n", + "There are two primary methods for updating properties:\n", + "\n", + "1. You can override them during the workflow creation process.\n", + "2. Alternatively, you can directly overwrite them after the workflow has been created." + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "workflow_name None\n", + "config_file /workspace/Data/spleen_ct_segmentation/configs/train.json\n", + "workflow_type train\n", + "epochs 1\n", + "dataset_dir /workspace/Data/Task09_Spleen\n", + "bundle_root /workspace/Data\n", + "2023-09-06 06:51:08,123 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-09-06 06:51:08,124 - INFO - > config_file: PosixPath('/workspace/Data/spleen_ct_segmentation/configs/train.json')\n", + "2023-09-06 06:51:08,125 - INFO - > workflow_type: 'train'\n", + "2023-09-06 06:51:08,126 - INFO - > epochs: 1\n", + "2023-09-06 06:51:08,126 - INFO - > dataset_dir: '/workspace/Data/Task09_Spleen'\n", + "2023-09-06 06:51:08,126 - INFO - > bundle_root: '/workspace/Data'\n", + "2023-09-06 06:51:08,127 - INFO - ---\n", + "\n", + "\n", + "2023-09-06 06:51:08,127 - INFO - Setting logging properties based on config: /workspace/Data/spleen_ct_segmentation/configs/logging.conf.\n", + "max epochs: 1\n" + ] + } + ], + "source": [ + "# 1 override them when you create the workflow\n", + "bundle_root = root_dir\n", + "override = {\"epochs\": 1, \"dataset_dir\": str(data_dir), \"bundle_root\": bundle_root}\n", + "train_workflow = create_workflow(config_file=config_file, workflow_type=\"train\", **override)\n", + "print(\"max epochs:\", train_workflow.max_epochs)" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "max epochs: 3\n", + "bundle root: /workspace/Data\n" + ] + } + ], + "source": [ + "# 2 directly overwriting them after creating the workflow\n", + "train_workflow.max_epochs = 3\n", + "train_workflow.bundle_root = bundle_root\n", + "\n", + "# Note that must initialize again after changing the content\n", + "train_workflow.initialize()\n", + "print(\"max epochs:\", train_workflow.max_epochs)\n", + "print(\"bundle root:\", train_workflow.bundle_root)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Using Components in Your Own Pipeline\n", + "If you wish to incorporate additional processing into the bundle's existing post-processing and use it within your custom pipeline, you can follow these steps. A comprehensive example can be found [here](https://github.com/Project-MONAI/tutorials/tree/main/model_zoo/app_integrate_bundle)." + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "(, )\n", + "2023-09-06 06:53:28,284 - ignite.engine.engine.EnsembleEvaluator - INFO - Engine run resuming from iteration 0, epoch 0 until 1 epochs\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "property 'dataloader' already exists in the properties list, overriding it.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2023-09-06 06:53:31,896 - ignite.engine.engine.EnsembleEvaluator - INFO - Epoch[1] Complete. Time taken: 00:00:03.305\n", + "2023-09-06 06:53:31,897 - ignite.engine.engine.EnsembleEvaluator - INFO - Engine run complete. Time taken: 00:00:03.612\n" + ] + } + ], + "source": [ + "n_splits = 3\n", + "ensemble_transform = MeanEnsembled(keys=[\"pred\"] * n_splits, output_key=\"pred\")\n", + "update_postprocessing = Compose((ensemble_transform, train_workflow.val_postprocessing))\n", + "print(update_postprocessing.transforms)\n", + "\n", + "device = train_workflow.device\n", + "train_workflow.add_property(name=\"dataloader\", required=True, config_id=\"train#dataloader\")\n", + "evaluator = EnsembleEvaluator(\n", + " device=device,\n", + " val_data_loader=train_workflow.dataloader,\n", + " pred_keys=[\"pred\"] * n_splits,\n", + " networks=[train_workflow.network_def.to(train_workflow.device)] * n_splits,\n", + " inferer=train_workflow.train_inferer,\n", + " postprocessing=update_postprocessing,\n", + ")\n", + "\n", + "evaluator.run()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Utilizing Pretrained Weights from the Bundle\n", + "\n", + "This function primarily serves to provide an instantiated network by loading pretrained weights from the bundle. You have the flexibility to directly update the parameters or filter the weights. Additionally, it's possible to use your own model instead of the one included in the bundle.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# directly get an instantiated network that loaded the weights.\n", + "model = load(name=\"brats_mri_segmentation\", bundle_dir=root_dir, source=\"monaihosting\")\n", + "\n", + "# directly update the parameters for the model from the bundle.\n", + "model = load(name=\"brats_mri_segmentation\", bundle_dir=root_dir, source=\"monaihosting\", in_channels=3, out_channels=1)\n", + "\n", + "# using `exclude_vars` to filter loading weights.\n", + "model = load(\n", + " name=\"brats_mri_segmentation\",\n", + " bundle_dir=root_dir,\n", + " source=\"monaihosting\",\n", + " copy_model_args={\"exclude_vars\": \"convInit|conv_final\"},\n", + ")\n", + "\n", + "# pass model and return an instantiated network that loaded the weights.\n", + "my_model = SegResNet(blocks_down=[1, 2, 2, 4], blocks_up=[1, 1, 1], init_filters=16, in_channels=1, out_channels=3)\n", + "model = load(name=\"brats_mri_segmentation\", bundle_dir=root_dir, source=\"monaihosting\", model=my_model)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## A Simple Comparison of the Usage between `ConfigParser` and `BundleWorkflow`" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Loading Configuration\n", + "\n", + "In the past, we needed to instantiate a `ConfigParser` and then read the configuration file and meta information using `read_config` and `read_meta` functions. However, now you can skip using `ConfigParser` and directly run `create_workflow` to create a `BundleWorkflow`. This new approach supports both configuration-based and Python-based bundles.\n" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "workflow_name None\n", + "config_file /workspace/Data/spleen_ct_segmentation/configs/train.json\n", + "workflow_type train\n", + "2023-09-06 09:19:00,935 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-09-06 09:19:00,936 - INFO - > config_file: '/workspace/Data/spleen_ct_segmentation/configs/train.json'\n", + "2023-09-06 09:19:00,938 - INFO - > workflow_type: 'train'\n", + "2023-09-06 09:19:00,939 - INFO - ---\n", + "\n", + "\n", + "2023-09-06 09:19:00,940 - INFO - Setting logging properties based on config: /workspace/Data/spleen_ct_segmentation/configs/logging.conf.\n" + ] + } + ], + "source": [ + "# Using ConfigParser\n", + "meta_file = Path(root_dir) / \"spleen_ct_segmentation\" / \"configs\" / \"metadata.json\"\n", + "bundle_config = ConfigParser()\n", + "bundle_config.read_config(config_file)\n", + "bundle_config.read_meta(meta_file)\n", + "\n", + "# Using BundleWorkflow\n", + "# config-based\n", + "workflow = create_workflow(config_file=str(config_file), workflow_type=\"train\")\n", + "# python-based\n", + "# more details refer to https://github.com/Project-MONAI/tutorials/tree/main/bundle/python_bundle_workflow\n", + "# workflow = create_workflow(workflow_name=scripts.train.TrainWorkflow)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Getting and Updating Configuration\n", + "\n", + "Previously, we utilized the `update` method to override configuration content. Now, with `BundleWorkflow`, you can override contents during workflow creation. To obtain an instantiated component, we used to use `get_parsed_content` before. However, now you can access it directly. Additionally, it's worth noting that you can also override the instantiated component, but be sure to initialize it again as needed.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "overrides = {\"network_def#in_channels\": 1, \"lr_scheduler#step_size\": 4000, \"dataset_dir\": str(data_dir), \"epochs\": 1}\n", + "\n", + "# Using ConfigParser\n", + "# override configuration content\n", + "bundle_config.config.update(overrides)\n", + "# get instantiate the network component\n", + "net = bundle_config.get_parsed_content(\"network_def\", instantiate=True)\n", + "\n", + "# Using BundleWorkflow\n", + "workflow = create_workflow(config_file=str(config_file), workflow_type=\"train\", **overrides)\n", + "# get instantiate the network component\n", + "net = workflow.network_def\n", + "workflow.network_def = SegResNet(\n", + " blocks_down=[1, 2, 2, 4], blocks_up=[1, 1, 1], init_filters=16, in_channels=1, out_channels=2\n", + ")\n", + "workflow.initialize() # re-initialize the workflow after changing the content\n", + "print(workflow.network_def)" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "### Running the Updated Bundle\n", + "\n", + "In the past, running an updated configuration required using `export_config_file` to export the new configuration and using the `run` command. But now, you can streamline the process by directly using the `run` command to execute the new workflow." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "# Using ConfigParser\n", + "new_config_path = Path(root_dir) / \"spleen_ct_segmentation\" / \"configs\" / \"new_train_config.json\"\n", + "ConfigParser.export_config_file(bundle_config.config, str(new_config_path), indent=2)\n", + "monai.bundle.run(run_id=\"run\", init_id=None, final_id=None, meta_file=str(meta_file), config_file=str(new_config_path))\n", + "\n", + "# Using BundleWorkflow\n", + "workflow.run()" + ] + }, + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "## Cleanup data directory\n", + "\n", + "Remove directory if a temporary was used." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [ + "if directory is None:\n", + " shutil.rmtree(root_dir)" + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.8.10" + } + }, + "nbformat": 4, + "nbformat_minor": 2 +}