diff --git a/bundle/01_bundle_intro.ipynb b/bundle/01_bundle_intro.ipynb new file mode 100644 index 0000000000..976e61e71e --- /dev/null +++ b/bundle/01_bundle_intro.ipynb @@ -0,0 +1,698 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "a7318b28-758a-41f3-a5cb-2b634dfe0100", + "metadata": {}, + "source": [ + "Copyright (c) MONAI Consortium \n", + "Licensed under the Apache License, Version 2.0 (the \"License\"); \n", + "you may not use this file except in compliance with the License. \n", + "You may obtain a copy of the License at \n", + "    http://www.apache.org/licenses/LICENSE-2.0 \n", + "Unless required by applicable law or agreed to in writing, software \n", + "distributed under the License is distributed on an \"AS IS\" BASIS, \n", + "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. \n", + "See the License for the specific language governing permissions and \n", + "limitations under the License." + ] + }, + { + "cell_type": "markdown", + "id": "45839c34-faf2-4f14-b28a-fd6ff635db34", + "metadata": {}, + "source": [ + "## Setup environment" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "1bf88e03-1c87-4901-9cfb-9c626d454b98", + "metadata": {}, + "outputs": [], + "source": [ + "!python -c \"import monai\" || pip install -q \"monai-weekly[ignite,pyyaml]\"" + ] + }, + { + "cell_type": "markdown", + "id": "2814d671-6db5-4a89-9237-46ed4a950594", + "metadata": {}, + "source": [ + "## Setup imports" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "280efd0a-74dd-41c7-8a2b-0de382dc0657", + "metadata": {}, + "outputs": [], + "source": [ + "from monai.config import print_config\n", + "\n", + "print_config()" + ] + }, + { + "cell_type": "markdown", + "id": "8e2cb6cb-8fc2-41cc-941b-ff2e37c4c043", + "metadata": {}, + "source": [ + "# MONAI Bundles\n", + "\n", + "Bundles are essentially _self-descriptive networks_. 
They combine a network definition with the metadata about what they are meant to do, what they are used for, the nature of their inputs and outputs, and scripts (possibly with associated data) to train and infer using them. \n", + "\n", + "The key objective with bundles is to provide a structured format for using and distributing your network along with all the added information needed to understand the network in context. This makes it easier for you and others to use the network, adapt it to different applications, reproduce your experiments and results, and simply document your work.\n", + "\n", + "The bundle documentation and specification can be found here: https://docs.monai.io/en/stable/bundle_intro.html\n", + "\n", + "## Bundle Structure\n", + "\n", + "A bundle consists of a named directory containing specific subdirectories for different parts. From the specification we have a basic outline of directories in this form (* means optional file):\n", + "\n", + "```\n", + "ModelName\n", + "┣━ configs\n", + "┃ ┗━ metadata.json\n", + "┣━ models\n", + "┃ ┣━ model.pt\n", + "┃ ┣━ *model.ts\n", + "┃ ┗━ *model.onnx\n", + "┣━ docs\n", + "┃ ┣━ *README.md\n", + "┃ ┗━ *license.txt\n", + "┗━ *scripts\n", + "```\n", + "\n", + "Here the `metadata.json` file will contain the name of the bundle, plain language description of what it does and intended purpose, a description of what the input and output values are for the network's forward pass, copyright information, and otherwise anything else you want to add. Further configuration files go into `configs` which will be JSON or YAML documents representing scripts in the form of Python object instantiations.\n", + "\n", + "The `models` directory contains the stored weights for your network which can be in multiple forms. The weight dictionary `model.pt` must be present but the Torchscript `model.ts` and ONNX `model.onnx` files representing the same network are optional. 
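\n", + "\n", + "As a rough sketch (`net` and `x` here are stand-ins for a trained network and an example input, not part of any actual bundle), the optional formats can be produced with standard PyTorch calls:\n", + "\n", + "```python\n", + "import torch\n", + "\n", + "net = torch.nn.Linear(4, 2)  # stand-in for your trained network\n", + "x = torch.rand(1, 4)         # example input used for export\n", + "\n", + "torch.jit.script(net).save(\"model.ts\")   # TorchScript form\n", + "torch.onnx.export(net, x, \"model.onnx\")  # ONNX form, requires the onnx package\n", + "```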
\n", + "\n", + "The `docs` directory will contain the readme file and any other documentation you want to include. Notebooks and images are good things to include for demonstrating use of the bundle.\n", + "\n", + "A further `scripts` directory can be included which would contain Python definitions of any sort to be used in the JSON/YAML script files. This directory should be a valid Python module if present, ie. contains a `__init__.py` file.\n", + "\n", + "## Instantiating a new bundle\n", + "\n", + "This notebook will introduce the concepts of the bundle and how to define your own. MONAI provides a number of bundle-related programs through the `monai.bundle` module using the Fire library. We can use `init_bundle` to start creating a bundle from scratch:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "a1d9d107-58d6-4ed8-9cf1-6e9103e78a92", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[01;34mTestBundle\u001b[00m\n", + "├── \u001b[01;34mconfigs\u001b[00m\n", + "│   ├── inference.json\n", + "│   └── metadata.json\n", + "├── \u001b[01;34mdocs\u001b[00m\n", + "│   └── README.md\n", + "├── LICENSE\n", + "└── \u001b[01;34mmodels\u001b[00m\n", + "\n", + "3 directories, 4 files\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "python -m monai.bundle init_bundle TestBundle\n", + "# you may need to install tree with \"sudo apt install tree\"\n", + "which tree && tree TestBundle || true" + ] + }, + { + "cell_type": "markdown", + "id": "99c6a04e-4859-4123-9433-6632bbd6ff0d", + "metadata": {}, + "source": [ + "Our new blandly-named bundle, `TestBundle`, doesn't have much in it currently. It has the directory structure so we can start putting definitions in the right places. The first thing we should do is fill in relevant information to the `metadata.json` file so that anyone who has our bundle knows what it is. 
The default is a template of common fields:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "3e19c030-4e03-4a96-a127-ee0daa604052", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"version\": \"0.0.1\",\n", + " \"changelog\": {\n", + " \"0.0.1\": \"Initial version\"\n", + " },\n", + " \"monai_version\": \"1.2.0\",\n", + " \"pytorch_version\": \"2.0.0\",\n", + " \"numpy_version\": \"1.23.5\",\n", + " \"optional_packages_version\": {},\n", + " \"task\": \"Describe what the network predicts\",\n", + " \"description\": \"A longer description of what the network does, use context, inputs, outputs, etc.\",\n", + " \"authors\": \"Your Name Here\",\n", + " \"copyright\": \"Copyright (c) Your Name Here\",\n", + " \"network_data_format\": {\n", + " \"inputs\": {},\n", + " \"outputs\": {}\n", + " }\n", + "}" + ] + } + ], + "source": [ + "!cat TestBundle/configs/metadata.json" + ] + }, + { + "cell_type": "markdown", + "id": "827c759b-9ae6-4ec1-a83d-9077bf23bafd", + "metadata": {}, + "source": [ + "We'll replace this with some more information that reflects our bundle being a demo:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "a56e4833-171c-432c-8145-f325fad3bfcb", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Overwriting TestBundle/configs/metadata.json\n" + ] + } + ], + "source": [ + "%%writefile TestBundle/configs/metadata.json\n", + "\n", + "{\n", + " \"version\": \"0.0.1\",\n", + " \"changelog\": {\n", + " \"0.0.1\": \"Initial version\"\n", + " },\n", + " \"monai_version\": \"1.2.0\",\n", + " \"pytorch_version\": \"2.0.0\",\n", + " \"numpy_version\": \"1.23.5\",\n", + " \"optional_packages_version\": {},\n", + " \"name\": \"TestBundle\",\n", + " \"task\": \"Demonstration Bundle Network\",\n", + " \"description\": \"This is a demonstration bundle meant to showcase features of the MONAI bundle system only and does 
nothing useful\",\n", + " \"authors\": \"Your Name Here\",\n", + " \"copyright\": \"Copyright (c) Your Name Here\",\n", + " \"network_data_format\": {\n", + " \"inputs\": {},\n", + " \"outputs\": {}\n", + " },\n", + " \"intended_use\": \"This is suitable for demonstration only\"\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "da6aa796-d4ae-423c-9215-957ad968b845", + "metadata": {}, + "source": [ + "## Configuration Files\n", + "\n", + "Configuration files define how to instantiate a number of Python objects and run simple routines. These files, whether JSON or YAML, are Python dictionaries containing expression lists or the arguments to be passed to a named constructor.\n", + "\n", + "The provided `inference.json` file is a demo of applying a network to a series of JPEG images. This illustrates some of the concepts around typical bundles, specifically how to declare MONAI objects to put a workflow together, but we're going to ignore that for now and create some YAML configuration files instead which do some very basic things. \n", + "\n", + "Whether you're working with JSON or YAML the config files are doing the same thing which is define a series of object instantiations with the expectation that this constitutes a workflow. Typically for training or inference with a network this would be defining data sources, loaders, transform sequences, and finally a subclass of the [Ignite Engine](https://docs.monai.io/en/stable/engines.html#workflow). A class like `SupervisedTrainer` is the driving program for training a network, so creating an instance of this along with its associated arguments then calling its `run()` method constitutes a workflow or \"program\". \n", + "\n", + "You don't have to use any specific objects types though so you're totally free to design your workflows to be whatever you like, but typically as demonstrated in the MONAI Model Zoo they'll be Ignite-based workflows doing training or inference. 
We'll start with a very simple workflow which actually just imports Pytorch and MONAI then prints diagnostic information:" + ] + }, + { + "cell_type": "code", + "execution_count": 32, + "id": "63322909-1a24-426e-a744-39452cdff14f", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing TestBundle/configs/test_config.yaml\n" + ] + } + ], + "source": [ + "%%writefile TestBundle/configs/test_config.yaml\n", + "\n", + "imports: \n", + "- $import torch\n", + "- $import monai\n", + "\n", + "device: $torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')\n", + "\n", + "shape: [4, 4]\n", + "\n", + "test_tensor: '$torch.rand(*@shape).to(@device)'\n", + "\n", + "test_config:\n", + "- '$monai.config.print_config()'\n", + "- '$print(\"Test tensor:\", @test_tensor)'" + ] + }, + { + "cell_type": "markdown", + "id": "c6c3d978-10d1-47ce-9171-2e4a4f7dbac1", + "metadata": {}, + "source": [ + "This file demonstrates a number of key concepts:\n", + "\n", + "* `imports` is a sequence of strings starting with `$` which indicate the string should be interpreted as a Python expression. These will be interpreted at the start of the execution so that modules can be imported into the running namespace. `imports` should be a sequence of such expressions.\n", + "* `device` is an object definition created by evaluating the given expression, in this case creating a Pytorch device object.\n", + "* `shape` is a list of literal values in YAML format we'll use elsewhere.\n", + "* `test_tensor` is another object created by evaluating an expression, this one uses references to `shape` and `device` with the `@` syntax.\n", + "* `test_config` is a list of expressions which are evaluated in order to act as the \"main\" or entry point for the program, in this case printing config information and then our created tensor.\n", + "\n", + "As mentioned `$` and `@` are sigils with special meaning. 
A string starting with `$` is treated as a Python expression and is evaluated as such when needed; these need to be enclosed in quotes only when JSON/YAML needs that to parse correctly. A variable starting with `@` is treated as a reference to something we've defined in the script, e.g. `@shape`, and will only work for such definitions. Accessing a member of a definition before it is interpreted can be done with `#`, so something like `@foo#bar` will access the `bar` member of the definition `foo`. More information on the usage of these can be found at https://docs.monai.io/en/latest/config_syntax.html.\n", + "\n", + "We can run this \"program\" on the command line now using the bundle submodule and a few arguments to specify the metadata file and configuration file:" + ] + }, + { + "cell_type": "code", + "execution_count": 33, + "id": "7968ceb4-89ef-40a9-ac9b-f048c6cca73b", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2023-07-14 15:34:52,646 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-07-14 15:34:52,647 - INFO - > run_id: 'test_config'\n", + "2023-07-14 15:34:52,647 - INFO - > meta_file: './TestBundle/configs/metadata.json'\n", + "2023-07-14 15:34:52,647 - INFO - > config_file: './TestBundle/configs/test_config.yaml'\n", + "2023-07-14 15:34:52,647 - INFO - ---\n", + "\n", + "\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/home/localek10/workspace/monai/MONAI_mine/monai/bundle/workflows.py:213: UserWarning: Default logging file in 'configs/logging.conf' does not exist, skipping logging.\n", + " warnings.warn(\"Default logging file in 'configs/logging.conf' does not exist, skipping logging.\")\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "MONAI version: 1.2.0\n", + "Numpy version: 1.23.5\n", + "Pytorch version: 2.0.0\n", + "MONAI flags: HAS_EXT = False, USE_COMPILED = False, USE_META_DICT = False\n", + "MONAI rev id: 
c33f1ba588ee00229a309000e888f9817b4f1934\n", + "MONAI __file__: /home/localek10/workspace/monai/MONAI_mine/monai/__init__.py\n", + "\n", + "Optional dependencies:\n", + "Pytorch Ignite version: 0.4.12\n", + "ITK version: NOT INSTALLED or UNKNOWN VERSION.\n", + "Nibabel version: 5.0.1\n", + "scikit-image version: NOT INSTALLED or UNKNOWN VERSION.\n", + "Pillow version: 9.4.0\n", + "Tensorboard version: NOT INSTALLED or UNKNOWN VERSION.\n", + "gdown version: NOT INSTALLED or UNKNOWN VERSION.\n", + "TorchVision version: 0.15.0\n", + "tqdm version: 4.65.0\n", + "lmdb version: NOT INSTALLED or UNKNOWN VERSION.\n", + "psutil version: 5.9.0\n", + "pandas version: 1.5.3\n", + "einops version: 0.6.1\n", + "transformers version: NOT INSTALLED or UNKNOWN VERSION.\n", + "mlflow version: NOT INSTALLED or UNKNOWN VERSION.\n", + "pynrrd version: NOT INSTALLED or UNKNOWN VERSION.\n", + "\n", + "For details about installing the optional dependencies, please visit:\n", + " https://docs.monai.io/en/latest/installation.html#installing-the-recommended-dependencies\n", + "\n", + "Test tensor: tensor([[0.5281, 0.1114, 0.5124, 0.2523],\n", + " [0.6561, 0.0298, 0.6393, 0.8636],\n", + " [0.3730, 0.8315, 0.1390, 0.6233],\n", + " [0.2646, 0.8929, 0.5250, 0.0472]], device='cuda:0')\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "# convenient to define the bundle's root in a variable\n", + "BUNDLE=\"./TestBundle\"\n", + "\n", + "# loads the test_config.yaml file and runs the test_config program it defines\n", + "python -m monai.bundle run test_config \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"$BUNDLE/configs/test_config.yaml\"" + ] + }, + { + "cell_type": "markdown", + "id": "4a28777d-b44a-4c78-b81b-a946b7f4ec30", + "metadata": {}, + "source": [ + "Here the `run` routine is invoked and the name of the \"main\" sequence of expressions is given (`test_config`). 
MONAI will then load and interpret the config, then evaluate the expressions of `test_config` in order. Definitions in the configuration which aren't needed to do this are ignored, so you can provide multiple expression lists that run different parts of your script without having to create everything. " + ] + }, + { + "cell_type": "markdown", + "id": "d1c00118-a695-4629-a454-3fda51c57232", + "metadata": {}, + "source": [ + "## Object Instantiation\n", + "\n", + "Creating objects is a key concept in config files which would be cumbersome if done only through expressions as has been demonstrated here. Instead, an object can be defined by a dictionary of values naming first the type with `_target_` and then providing the constructor arguments as named values. The following is a simple example creating a `Dataset` object with a very simple set of values:" + ] + }, + { + "cell_type": "code", + "execution_count": 30, + "id": "cb762695-8c5d-4f42-9c29-eb6260990b0c", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Overwriting TestBundle/configs/test_object.yaml\n" + ] + } + ], + "source": [ + "%%writefile TestBundle/configs/test_object.yaml\n", + "\n", + "datadicts: '$[{i: (i * i)} for i in range(10)]' # create a fake dataset as a list of dicts\n", + "\n", + "test_dataset: # creates an instance of an object because _target_ is present\n", + " _target_: Dataset # name of type to create is monai.data.Dataset (loaded implicitly from MONAI)\n", + " data: '@datadicts' # argument data provided by a definition\n", + " transform: '$None' # argument transform provided by a Python expression\n", + "\n", + "test:\n", + "- '$print(\"Dataset\", @test_dataset)'\n", + "- '$print(\"Size\", len(@test_dataset))'\n", + "- '$print(\"Transform member\", @test_dataset.transform)'\n", + "- '$print(\"Values\", list(@test_dataset))'" + ] + }, + { + "cell_type": "code", + "execution_count": 31, + "id": "2cd1287c-f287-4831-bfc7-4cbdc394a3a1", + 
"metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2023-07-14 15:28:36,063 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-07-14 15:28:36,063 - INFO - > run_id: 'test'\n", + "2023-07-14 15:28:36,063 - INFO - > meta_file: './TestBundle/configs/metadata.json'\n", + "2023-07-14 15:28:36,063 - INFO - > config_file: './TestBundle/configs/test_object.yaml'\n", + "2023-07-14 15:28:36,063 - INFO - ---\n", + "\n", + "\n", + "Dataset \n", + "Size 10\n", + "Transform member None\n", + "Values [{0: 0}, {1: 1}, {2: 4}, {3: 9}, {4: 16}, {5: 25}, {6: 36}, {7: 49}, {8: 64}, {9: 81}]\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "BUNDLE=\"./TestBundle\"\n", + "\n", + "# prints normal values\n", + "python -W ignore -m monai.bundle run test \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"$BUNDLE/configs/test_object.yaml\"" + ] + }, + { + "cell_type": "markdown", + "id": "6326d601-23f0-444b-821c-9596bd8c8296", + "metadata": {}, + "source": [ + "The `test_dataset` definition is roughly equivalent to the expression `Dataset(data=datadicts, transform=None)`. Like regular Python we don't need to provide values for arguments having defaults, but we can only give argument values by name and not by position. " + ] + }, + { + "cell_type": "markdown", + "id": "93c091b0-6140-4539-bb1e-36bf78445365", + "metadata": {}, + "source": [ + "## Command Line Definitions\n", + "\n", + "Command line arguments can be provided to add or modify definitions in the script you're running. Using `--` before the name of the variable allows you to set their value with the next argument, but this must be a valid Python expression. You can also set individual members of definitions with `#` but be sure to put quotes around the argument in Bash. 
\n", + "\n", + "We can demo this with an even simpler script:" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "391ec82b-43a2-4b6f-8307-e3c853986719", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing TestBundle/configs/test_cmdline.yaml\n" + ] + } + ], + "source": [ + "%%writefile TestBundle/configs/test_cmdline.yaml\n", + "\n", + "shape: [8, 8]\n", + "area: '$@shape[0]*@shape[1]'\n", + "\n", + "test:\n", + "- '$print(\"Height\", @shape[0])'\n", + "- '$print(\"Width\", @shape[1])'\n", + "- '$print(\"Area\", @area)'" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "229617a0-1120-4054-9232-1991cfa21ae9", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2023-07-14 15:22:37,435 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-07-14 15:22:37,435 - INFO - > run_id: 'test'\n", + "2023-07-14 15:22:37,436 - INFO - > meta_file: './TestBundle/configs/metadata.json'\n", + "2023-07-14 15:22:37,436 - INFO - > config_file: './TestBundle/configs/test_cmdline.yaml'\n", + "2023-07-14 15:22:37,436 - INFO - ---\n", + "\n", + "\n", + "Height 8\n", + "Width 8\n", + "Area 64\n", + "2023-07-14 15:22:40,876 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-07-14 15:22:40,876 - INFO - > run_id: 'test'\n", + "2023-07-14 15:22:40,876 - INFO - > meta_file: './TestBundle/configs/metadata.json'\n", + "2023-07-14 15:22:40,876 - INFO - > config_file: './TestBundle/configs/test_cmdline.yaml'\n", + "2023-07-14 15:22:40,876 - INFO - > shape#0: 4\n", + "2023-07-14 15:22:40,876 - INFO - ---\n", + "\n", + "\n", + "Height 4\n", + "Width 8\n", + "Area 32\n", + "2023-07-14 15:22:44,279 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-07-14 15:22:44,279 - INFO - > run_id: 'test'\n", + "2023-07-14 15:22:44,279 - INFO - > meta_file: './TestBundle/configs/metadata.json'\n", + "2023-07-14 
15:22:44,279 - INFO - > config_file: './TestBundle/configs/test_cmdline.yaml'\n", + "2023-07-14 15:22:44,279 - INFO - > area: 32\n", + "2023-07-14 15:22:44,279 - INFO - ---\n", + "\n", + "\n", + "Height 8\n", + "Width 8\n", + "Area 32\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "BUNDLE=\"./TestBundle\"\n", + "\n", + "# prints normal values\n", + "python -W ignore -m monai.bundle run test \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"$BUNDLE/configs/test_cmdline.yaml\"\n", + "\n", + "# half the height\n", + "python -W ignore -m monai.bundle run test \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"$BUNDLE/configs/test_cmdline.yaml\" \\\n", + " '--shape#0' 4\n", + "\n", + "# area definition replaces existing expression with a lie\n", + "python -W ignore -m monai.bundle run test \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"$BUNDLE/configs/test_cmdline.yaml\" \\\n", + " --area 32" + ] + }, + { + "cell_type": "markdown", + "id": "87683aa7-0322-48cb-9919-f3b3b2546763", + "metadata": {}, + "source": [ + "## Multiple Files\n", + "\n", + "Multiple config files can be specified which will create a final script composed of definitions in the first file added to or updated with those in subsequent files. Remember that the files are essentially creating Python dictionaries of definitions that are interpreted later, so later files are just updating that dictionary when loaded. 
Definitions in one file can be referenced in others:" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "55c034c5-b03f-4ac1-8aa0-a7b768bbbb7e", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing TestBundle/configs/multifile1.yaml\n" + ] + } + ], + "source": [ + "%%writefile TestBundle/configs/multifile1.yaml\n", + "\n", + "width: 8\n", + "height: 8" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "2511798a-cd44-4aec-954c-c766b29f0a43", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing TestBundle/configs/multifile2.yaml\n" + ] + } + ], + "source": [ + "%%writefile TestBundle/configs/multifile2.yaml\n", + "\n", + "area: '$@width*@height'\n", + "\n", + "test:\n", + "- '$print(\"Area\", @area)'" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "dc6adf63-c4b5-4f97-805a-2321dc1e8d2c", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2023-07-14 15:09:59,663 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-07-14 15:09:59,663 - INFO - > run_id: 'test'\n", + "2023-07-14 15:09:59,663 - INFO - > meta_file: './TestBundle/configs/metadata.json'\n", + "2023-07-14 15:09:59,663 - INFO - > config_file: ['./TestBundle/configs/multifile1.yaml', './TestBundle/configs/multifile2.yaml']\n", + "2023-07-14 15:09:59,663 - INFO - ---\n", + "\n", + "\n", + "Area 64\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "BUNDLE=\"./TestBundle\"\n", + "\n", + "# loads both files and runs the test program using their combined definitions\n", + "python -W ignore -m monai.bundle run test \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"['$BUNDLE/configs/multifile1.yaml','$BUNDLE/configs/multifile2.yaml']\"" + ] + }, + { + "cell_type": "markdown", + "id": "1afcbac7-1e65-4078-8465-24d5c8e08102", + "metadata": {}, + "source": [ + "The value for 
`config_file` in this example is a Python list containing two strings. It takes a bit of care to get the Bash syntax right so that this expression isn't mangled (e.g. avoid spaces to prevent tokenisation and use \"\" quotes so that other quotes aren't interpreted), but it is otherwise a simple mechanism.\n", + "\n", + "This mechanism, and the ability to add/modify definitions on the command line, is important for a number of reasons:\n", + "\n", + "* It lets you write a \"common\" configuration file containing definitions to be used with other config files, reducing duplication.\n", + "* It lets different expressions or setups be defined with different combinations of files, again avoiding having to duplicate then modify scripts for different experiments.\n", + "* Adding/changing definitions also allows quick minor changes or batching of different operations on the command line or in shell scripts, e.g. doing a parameter sweep by looping through possible values and passing them as arguments. \n", + "\n", + "## Summary and Next\n", + "\n", + "We have here described the basics of bundles:\n", + "\n", + "* Directory structure\n", + "* Metadata file\n", + "* Configuration files\n", + "* Command line usage\n", + "\n", + "In the next tutorial we will actually create a bundle for a real network that does something and demonstrate features for working with networks."
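, + "\n", + "As a final aside, the parameter sweep mentioned above can be sketched as a shell loop, reusing the earlier `test_cmdline.yaml` example:\n", + "\n", + "```bash\n", + "for h in 2 4 8; do\n", + "    python -W ignore -m monai.bundle run test \\\n", + "        --meta_file TestBundle/configs/metadata.json \\\n", + "        --config_file TestBundle/configs/test_cmdline.yaml \\\n", + "        '--shape#0' $h\n", + "done\n", + "```"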
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python [conda env:monai]", + "language": "python", + "name": "conda-env-monai-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.10" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/bundle/02_mednist_classification.ipynb b/bundle/02_mednist_classification.ipynb new file mode 100644 index 0000000000..3dfea41f38 --- /dev/null +++ b/bundle/02_mednist_classification.ipynb @@ -0,0 +1,697 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "5e8ae3d7-3e2e-4755-a0b6-709ef4180719", + "metadata": {}, + "source": [ + "Copyright (c) MONAI Consortium \n", + "Licensed under the Apache License, Version 2.0 (the \"License\"); \n", + "you may not use this file except in compliance with the License. \n", + "You may obtain a copy of the License at \n", + "    http://www.apache.org/licenses/LICENSE-2.0 \n", + "Unless required by applicable law or agreed to in writing, software \n", + "distributed under the License is distributed on an \"AS IS\" BASIS, \n", + "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. \n", + "See the License for the specific language governing permissions and \n", + "limitations under the License." 
+ ] + }, + { + "cell_type": "markdown", + "id": "191c5d77-8ae5-49ab-be22-45f5ba41641f", + "metadata": {}, + "source": [ + "## Setup environment" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "886952c4-0be4-459d-9c53-b81b29199c76", + "metadata": {}, + "outputs": [], + "source": [ + "!python -c \"import monai\" || pip install -q \"monai-weekly[ignite,pyyaml]\"" + ] + }, + { + "cell_type": "markdown", + "id": "a20e1274-0a27-4e37-95d7-fb813243c34c", + "metadata": {}, + "source": [ + "## Setup imports" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "b1144d87-ec2f-4b9b-907a-16ea2da279c4", + "metadata": {}, + "outputs": [], + "source": [ + "from monai.config import print_config\n", + "\n", + "print_config()" + ] + }, + { + "cell_type": "markdown", + "id": "c572d8b6-3dca-4487-80ad-928090b3e8ab", + "metadata": {}, + "source": [ + "# MedNIST Classification Bundle\n", + "\n", + "In this tutorial we'll describe how to create a bundle for a classification network. This will include how to train and apply the network on the command line. MedNIST will be used as the dataset with the bundle based off the [MONAI 101 notebook](https://github.com/Project-MONAI/tutorials/blob/main/2d_classification/monai_101.ipynb).\n", + "\n", + "The dataset is kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. (Department of Radiology, Mayo Clinic) under the Creative Commons CC BY-SA 4.0 license. If you use the MedNIST dataset, please acknowledge the source of the MedNIST dataset: the repository https://github.com/Project-MONAI/MONAI/ or the MedNIST tutorial for image classification https://github.com/Project-MONAI/MONAI/blob/master/examples/notebooks/mednist_tutorial.ipynb.\n", + "\n", + "This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License. 
To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/4.0/.\n", + "\n", + "First we'll consider a condensed version of the code from that notebook and go step by step through how best to represent it as a bundle:\n", + "\n", + "```python\n", + "import os\n", + "\n", + "import monai.transforms as mt\n", + "import torch\n", + "from monai.apps import MedNISTDataset\n", + "from monai.data import DataLoader\n", + "from monai.engines import SupervisedTrainer\n", + "from monai.inferers import SimpleInferer\n", + "from monai.networks import eval_mode\n", + "from monai.networks.nets import densenet121\n", + "\n", + "root_dir = os.environ.get(\"ROOTDIR\", \".\")\n", + "\n", + "max_epochs = 25\n", + "device = torch.device(\"cuda:0\")\n", + "net = densenet121(spatial_dims=2, in_channels=1, out_channels=6).to(device)\n", + "\n", + "transform = mt.Compose([\n", + " mt.LoadImaged(keys=\"image\", image_only=True),\n", + " mt.EnsureChannelFirstd(keys=\"image\"),\n", + " mt.ScaleIntensityd(keys=\"image\"),\n", + "])\n", + "\n", + "dataset = MedNISTDataset(root_dir=root_dir, transform=transform, section=\"training\", download=True)\n", + "\n", + "train_dl = DataLoader(dataset, batch_size=512, shuffle=True, num_workers=4)\n", + "\n", + "trainer = SupervisedTrainer(\n", + " device=device,\n", + " max_epochs=max_epochs,\n", + " train_data_loader=train_dl,\n", + " network=net,\n", + " optimizer=torch.optim.Adam(net.parameters(), lr=1e-5),\n", + " loss_function=torch.nn.CrossEntropyLoss(),\n", + " inferer=SimpleInferer(),\n", + ")\n", + "\n", + "trainer.run()\n", + "\n", + "torch.jit.script(net).save(\"mednist.ts\")\n", + "\n", + "class_names = (\"AbdomenCT\", \"BreastMRI\", \"CXR\", \"ChestCT\", \"Hand\", \"HeadCT\")\n", + "testdata = MedNISTDataset(root_dir=root_dir, transform=transform, section=\"test\", runtime_cache=True)\n", + "\n", + "max_items_to_print = 10\n", + "eval_dl = DataLoader(testdata[:max_items_to_print], batch_size=1, num_workers=0)\n", + "with 
eval_mode(net):\n", + " for item in eval_dl:\n", + " result = net(item[\"image\"].to(device))\n", + " prob = result.detach().to(\"cpu\")[0]\n", + " pred = class_names[prob.argmax()]\n", + " gt = item[\"class_name\"][0]\n", + " print(f\"Prediction: {pred}. Ground-truth: {gt}\")\n", + "```" + ] + }, + { + "cell_type": "markdown", + "id": "1a18d5cd-6338-4b41-87fd-4e119723bfee", + "metadata": {}, + "source": [ + "You can run this cell or save it to a file and run it on the command line. A `DenseNet` based network will be trained to classify MedNIST images into one of six categories. Mostly this script uses Ignite-based classes such as `SupervisedTrainer` which is great for converting into a bundle. Let's start by initialising a bundle directory structure:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "eb9dc6ec-13da-4a37-8afa-28e2766b9343", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "/usr/bin/tree\n", + "\u001b[01;34mMedNISTClassifier\u001b[00m\n", + "├── \u001b[01;34mconfigs\u001b[00m\n", + "│   ├── inference.json\n", + "│   └── metadata.json\n", + "├── \u001b[01;34mdocs\u001b[00m\n", + "│   └── README.md\n", + "├── LICENSE\n", + "└── \u001b[01;34mmodels\u001b[00m\n", + "\n", + "3 directories, 4 files\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "python -m monai.bundle init_bundle MedNISTClassifier\n", + "which tree && tree MedNISTClassifier || true" + ] + }, + { + "cell_type": "markdown", + "id": "5888c9bd-5022-40b5-9dec-84d9f737f868", + "metadata": {}, + "source": [ + "## Metadata\n", + "\n", + "We'll first replace the `metadata.json` file with our description of what the network will do:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "b29f053b-cf16-4ffc-bbe7-d9433fdfa872", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Overwriting MedNISTClassifier/configs/metadata.json\n" + ] + } + ], + "source": [ + 
"%%writefile MedNISTClassifier/configs/metadata.json\n", + "\n", + "{\n", + " \"version\": \"0.0.1\",\n", + " \"changelog\": {\n", + " \"0.0.1\": \"Initial version\"\n", + " },\n", + " \"monai_version\": \"1.2.0\",\n", + " \"pytorch_version\": \"2.0.0\",\n", + " \"numpy_version\": \"1.23.5\",\n", + " \"optional_packages_version\": {},\n", + " \"name\": \"MedNISTClassifier\",\n", + " \"task\": \"MedNIST Classification Network\",\n", + " \"description\": \"This is a demo network for classifying MedNIST images by type/modality\",\n", + " \"authors\": \"Your Name Here\",\n", + " \"copyright\": \"Copyright (c) Your Name Here\",\n", + " \"data_source\": \"MedNIST dataset kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. (Department of Radiology, Mayo Clinic)\",\n", + " \"data_type\": \"jpeg\",\n", + " \"intended_use\": \"This is suitable for demonstration only\",\n", + " \"network_data_format\": {\n", + " \"inputs\": {\n", + " \"image\": {\n", + " \"type\": \"image\",\n", + " \"format\": \"magnitude\",\n", + " \"modality\": \"any\",\n", + " \"num_channels\": 1,\n", + " \"spatial_shape\": [64, 64],\n", + " \"dtype\": \"float32\",\n", + " \"value_range\": [0, 1],\n", + " \"is_patch_data\": false,\n", + " \"channel_def\": {\n", + " \"0\": \"image\"\n", + " }\n", + " }\n", + " },\n", + " \"outputs\": {\n", + " \"pred\": {\n", + " \"type\": \"probabilities\",\n", + " \"format\": \"classes\",\n", + " \"num_channels\": 6,\n", + " \"spatial_shape\": [6],\n", + " \"dtype\": \"float32\",\n", + " \"value_range\": [0, 1],\n", + " \"is_patch_data\": false,\n", + " \"channel_def\": {\n", + " \"0\": \"AbdomenCT\",\n", + " \"1\": \"BreastMRI\",\n", + " \"2\": \"CXR\",\n", + " \"3\": \"ChestCT\",\n", + " \"4\": \"Hand\",\n", + " \"5\": \"HeadCT\"\n", + " }\n", + " }\n", + " }\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "3f208bf8-0c3a-4def-ab0f-6091cebcd532", + "metadata": {}, + "source": [ + "This contains more information compared to the previous 
tutorial's file. For inputs to the network, a tensor \"image\" is given as a 64x64 sized single-channel image. This is one of the MedNIST images whose modality varies but will have a value range of `[0, 1]` after rescaling in the transform pipeline. The channel definition states the meaning of each channel; this input has only one, which is the greyscale image itself. For network outputs there is only one, \"pred\", representing the prediction of the network as a tensor of size 6. Each of the six values is a prediction of that class, as described in `channel_def`.\n", + "\n", + "## Common Definitions\n", + "\n", + "What we'll now do is construct the bundle configuration scripts to implement training, testing, and inference based on the original script file given above. Common definitions should be placed in a common file used with other scripts to reduce duplication. In our original script, the network definition and transform sequence will be used in multiple places, so they should go in this common file:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "d11681af-3210-4b2b-b7bd-8ad8dedfe230", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing MedNISTClassifier/configs/common.yaml\n" + ] + } + ], + "source": [ + "%%writefile MedNISTClassifier/configs/common.yaml\n", + "# only need to import torch right now\n", + "imports: \n", + "- $import torch\n", + "\n", + "# define a default root directory value, this can be overridden on the command line\n", + "root_dir: \".\"\n", + "\n", + "# define a device for the network\n", + "device: '$torch.device(''cuda:0'')'\n", + "\n", + "# store the class names for inference later\n", + "class_names: [AbdomenCT, BreastMRI, CXR, ChestCT, Hand, HeadCT]\n", + "\n", + "# define the network separately, don't need to refer to MONAI types by name or import MONAI\n", + "network_def:\n", + " _target_: densenet121\n", + " spatial_dims: 2\n", + " in_channels: 1\n", + " 
out_channels: 6\n", + "\n", + "# define the network to be the given definition moved to the device\n", + "net: '$@network_def.to(@device)'\n", + "\n", + "# define a transform sequence by instantiating a Compose instance with a transform sequence\n", + "transform:\n", + " _target_: Compose\n", + " transforms:\n", + " - _target_: LoadImaged\n", + " keys: 'image'\n", + " image_only: true\n", + " - _target_: EnsureChannelFirstd\n", + " keys: 'image'\n", + " - _target_: ScaleIntensityd\n", + " keys: 'image'\n", + " " + ] + }, + { + "cell_type": "markdown", + "id": "eaf81ea7-9ea3-4548-a32e-992f0b9bc0ab", + "metadata": {}, + "source": [ + "Although this YAML is very different from the Python code, it's defining essentially the same objects. Whether in YAML or JSON, a bundle script defines an object instantiation as a dictionary containing the key `_target_` declaring the type to create, with other keys treated as arguments. A Python statement like `obj = ObjType(arg1=val1, arg2=val2)` is thus equivalent to \n", + "\n", + "```yaml\n", + "obj:\n", + " _target_: ObjType\n", + " arg1: val1\n", + " arg2: val2\n", + "```\n", + "\n", + "Note here that MONAI will import all its own symbols, so an explicit import statement is not needed, nor is referring to types by fully qualified name, i.e. `Compose` is adequate instead of `monai.transforms.Compose`. Definitions found in other packages, or those in scripts associated with the bundle, need to be referred to by the name they are imported as, e.g. 
`torch.device` as shown above.\n", + "\n", + "## Training\n", + "\n", + "For training we need a dataset, dataloader, and trainer object which will be used in the running \"program\":" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "4dfd052e-abe7-473a-bbf4-25674a3b20ea", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing MedNISTClassifier/configs/train.yaml\n" + ] + } + ], + "source": [ + "%%writefile MedNISTClassifier/configs/train.yaml\n", + "\n", + "max_epochs: 25\n", + "\n", + "dataset:\n", + " _target_: MedNISTDataset\n", + " root_dir: '@root_dir'\n", + " transform: '@transform'\n", + " section: training\n", + " download: true\n", + "\n", + "train_dl:\n", + " _target_: DataLoader\n", + " dataset: '@dataset'\n", + " batch_size: 512\n", + " shuffle: true\n", + " num_workers: 4\n", + "\n", + "trainer:\n", + " _target_: SupervisedTrainer\n", + " device: '@device'\n", + " max_epochs: '@max_epochs'\n", + " train_data_loader: '@train_dl'\n", + " network: '@net'\n", + " optimizer: \n", + " _target_: torch.optim.Adam\n", + " params: '$@net.parameters()'\n", + " lr: 0.00001 # learning rate set slow so that you can see network improvement over epochs\n", + " loss_function: \n", + " _target_: torch.nn.CrossEntropyLoss\n", + " inferer: \n", + " _target_: SimpleInferer\n", + "\n", + "train:\n", + "- '$@trainer.run()'\n", + "- '$torch.jit.script(@net).save(''model.ts'')'" + ] + }, + { + "cell_type": "markdown", + "id": "de752181-80b1-4221-9e4a-315e5f7f22a6", + "metadata": {}, + "source": [ + "There is a lot going on here, but hopefully you can see how this replicates the object definitions in the original source file. A few specific points:\n", + "* References are made to objects defined in `common.yaml`, such as `@root_dir`, so that file needs to be used in conjunction with this one.\n", + "* A `max_epochs` hyperparameter is provided whose value you can change on the command line, eg. 
`--max_epochs 5`.\n", + "* Definitions for the `optimizer`, `loss_function`, and `inferer` arguments of `trainer` are provided inline, but it would be better practice to define these separately.\n", + "* The learning rate is hard-coded as `1e-5`; it would again be better practice to define a separate `lr` hyperparameter, although it can be changed on the command line with `'--trainer#optimizer#lr' 0.001`.\n", + "* The trained network is saved using PyTorch's `jit` module directly; better practice would be to provide a handler, such as `CheckpointSaver`, to the trainer or to an evaluator object (see other tutorial examples on how to do this). This was kept here to match the original example.\n", + "\n", + "Now the network can be trained by running the bundle:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "8357670d-fe69-4789-9b9a-77c0d8144b10", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "workflow_name None\n", + "config_file ['./MedNISTClassifier/configs/common.yaml', './MedNISTClassifier/configs/train.yaml']\n", + "meta_file ./MedNISTClassifier/configs/metadata.json\n", + "logging_file None\n", + "init_id None\n", + "run_id train\n", + "final_id None\n", + "tracking None\n", + "max_epochs 2\n", + "2023-09-11 16:19:49,915 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-09-11 16:19:49,915 - INFO - > config_file: ['./MedNISTClassifier/configs/common.yaml',\n", + " './MedNISTClassifier/configs/train.yaml']\n", + "2023-09-11 16:19:49,915 - INFO - > meta_file: './MedNISTClassifier/configs/metadata.json'\n", + "2023-09-11 16:19:49,915 - INFO - > run_id: 'train'\n", + "2023-09-11 16:19:49,915 - INFO - > max_epochs: 2\n", + "2023-09-11 16:19:49,915 - INFO - ---\n", + "\n", + "\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/home/localek10/workspace/monai/MONAI_mine/monai/bundle/workflows.py:257: UserWarning: Default logging file in 
MedNISTClassifier/configs/logging.conf does not exist, skipping logging.\n", + " warnings.warn(f\"Default logging file in {logging_file} does not exist, skipping logging.\")\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2023-09-11 16:19:50,055 - INFO - Verified 'MedNIST.tar.gz', md5: 0bc7306e7427e00ad1c5526a6677552d.\n", + "2023-09-11 16:19:50,055 - INFO - File exists: MedNIST.tar.gz, skipped downloading.\n", + "2023-09-11 16:19:50,055 - INFO - Non-empty folder exists in MedNIST, skipped extracting.\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Loading dataset: 100%|██████████| 47164/47164 [00:41<00:00, 1145.05it/s]\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "BUNDLE=\"./MedNISTClassifier\"\n", + "\n", + "# run the bundle with epochs set to 2 for speed during testing, change this to get a better result\n", + "python -m monai.bundle run train \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"['$BUNDLE/configs/common.yaml','$BUNDLE/configs/train.yaml']\" \\\n", + " --max_epochs 2\n", + "\n", + "# we'll use the trained network as the model object for this bundle\n", + "mv model.ts $BUNDLE/models/model.ts\n", + "\n", + "# generate the saved dictionary file as well\n", + "cd \"$BUNDLE/models\"\n", + "python -c 'import torch; obj = torch.jit.load(\"model.ts\"); torch.save(obj.state_dict(), \"model.pt\")'" + ] + }, + { + "cell_type": "markdown", + "id": "bbf58fac-b6d5-424d-9e98-1a30937f2116", + "metadata": {}, + "source": [ + "As shown here the Torchscript object produced by the training is moved into the `models` directory of the bundle. The saved weight file is also produced by loading that file again and saving the state. Once again best practice would be to instead use `CheckpointSaver` to save weights in an output location before the final file is chosen for the bundle. 
\n", + "\n", + "## Evaluation\n", + "\n", + "To replicate the original example's code, we'll need to put the evaluation loop code into a separate function and call it. The best practice would be to use an `Evaluator` class to do this with metric classes for assessing performance. Instead we'll stick close to the original code and demonstrate how to integrate your own code into a bundle.\n", + "\n", + "The first thing to do is put the evaluation loop into a function and store it in the `scripts` module within the bundle:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "fbad1a21-4dda-4b80-8e81-7d7e75307f9c", + "metadata": {}, + "outputs": [], + "source": [ + "!mkdir MedNISTClassifier/scripts" + ] + }, + { + "cell_type": "code", + "execution_count": 8, + "id": "0c8725f7-f1cd-48f5-81a5-3f5a9ee03e9c", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing MedNISTClassifier/scripts/__init__.py\n" + ] + } + ], + "source": [ + "%%writefile MedNISTClassifier/scripts/__init__.py\n", + "\n", + "from monai.networks.utils import eval_mode\n", + "\n", + "def evaluate(net, dataloader, class_names, device):\n", + " with eval_mode(net):\n", + " for item in dataloader:\n", + " result = net(item[\"image\"].to(device))\n", + " prob = result.detach().to(\"cpu\")[0]\n", + " pred = class_names[prob.argmax()]\n", + " gt = item[\"class_name\"][0]\n", + " print(f\"Prediction: {pred}. Ground-truth: {gt}\")\n", + " " + ] + }, + { + "cell_type": "markdown", + "id": "abf40c4f-3349-4c40-9eef-811388ffd704", + "metadata": {}, + "source": [ + "The `scripts` directory has to be a valid Python module, so it needs an `__init__.py` file; you can include other files and import them separately or import their members into this file. Here we defined `evaluate` to enclose the loop from the original script. 
This can then be called as part of an expression sequence \"program\":" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "id": "b4e1f99a-a68b-4aeb-bcf2-842f26609b52", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing MedNISTClassifier/configs/evaluate.yaml\n" + ] + } + ], + "source": [ + "%%writefile MedNISTClassifier/configs/evaluate.yaml\n", + "\n", + "imports: \n", + "- $import scripts\n", + "\n", + "max_items_to_print: 10\n", + "\n", + "ckpt_file: \"\"\n", + "\n", + "testdata:\n", + " _target_: MedNISTDataset\n", + " root_dir: '@root_dir'\n", + " transform: '@transform'\n", + " section: test\n", + " download: false\n", + " runtime_cache: true\n", + "\n", + "eval_dl:\n", + " _target_: DataLoader\n", + " dataset: '$@testdata[:@max_items_to_print]'\n", + " batch_size: 1\n", + " num_workers: 0\n", + "\n", + "# loads the weights from the given file (which needs to be set on the command line) then calls \"evaluate\"\n", + "evaluate:\n", + "- '$@net.load_state_dict(torch.load(@ckpt_file))'\n", + "- '$scripts.evaluate(@net, @eval_dl, @class_names, @device)'\n" + ] + }, + { + "cell_type": "markdown", + "id": "64bb2286-3107-49e9-8dbe-66fe1a2ae08c", + "metadata": {}, + "source": [ + "Evaluation is then run on the command line, using \"evaluate\" as the program to run and providing a path to the model weights with the `ckpt_file` variable:" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "3c5fa39f-8798-4e41-8e2a-3a70a6be3906", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "workflow_name None\n", + "config_file ['./MedNISTClassifier/configs/common.yaml', './MedNISTClassifier/configs/evaluate.yaml']\n", + "meta_file ./MedNISTClassifier/configs/metadata.json\n", + "logging_file None\n", + "init_id None\n", + "run_id evaluate\n", + "final_id None\n", + "tracking None\n", + "ckpt_file ./MedNISTClassifier/models/model.pt\n", + "2023-09-11 
16:22:56,379 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-09-11 16:22:56,379 - INFO - > config_file: ['./MedNISTClassifier/configs/common.yaml',\n", + " './MedNISTClassifier/configs/evaluate.yaml']\n", + "2023-09-11 16:22:56,379 - INFO - > meta_file: './MedNISTClassifier/configs/metadata.json'\n", + "2023-09-11 16:22:56,379 - INFO - > run_id: 'evaluate'\n", + "2023-09-11 16:22:56,379 - INFO - > ckpt_file: './MedNISTClassifier/models/model.pt'\n", + "2023-09-11 16:22:56,379 - INFO - ---\n", + "\n", + "\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/home/localek10/workspace/monai/MONAI_mine/monai/bundle/workflows.py:257: UserWarning: Default logging file in MedNISTClassifier/configs/logging.conf does not exist, skipping logging.\n", + " warnings.warn(f\"Default logging file in {logging_file} does not exist, skipping logging.\")\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Prediction: AbdomenCT. Ground-truth: AbdomenCT\n", + "Prediction: BreastMRI. Ground-truth: BreastMRI\n", + "Prediction: ChestCT. Ground-truth: ChestCT\n", + "Prediction: CXR. Ground-truth: CXR\n", + "Prediction: Hand. Ground-truth: Hand\n", + "Prediction: HeadCT. Ground-truth: HeadCT\n", + "Prediction: HeadCT. Ground-truth: HeadCT\n", + "Prediction: CXR. Ground-truth: CXR\n", + "Prediction: ChestCT. Ground-truth: ChestCT\n", + "Prediction: BreastMRI. 
Ground-truth: BreastMRI\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "BUNDLE=\"./MedNISTClassifier\"\n", + "export PYTHONPATH=\"$BUNDLE\"\n", + "\n", + "python -m monai.bundle run evaluate \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"['$BUNDLE/configs/common.yaml','$BUNDLE/configs/evaluate.yaml']\" \\\n", + " --ckpt_file \"$BUNDLE/models/model.pt\"" + ] + }, + { + "cell_type": "markdown", + "id": "6fd62905-4ea8-4f08-bcea-823074fc4ce4", + "metadata": {}, + "source": [ + "## Summary and Next\n", + "\n", + "This tutorial has covered:\n", + "* Creating full training scripts in bundles\n", + "* Training a network then evaluating its performance with scripts\n", + "\n", + "That's all it takes to create a bundle matching an existing script. It was mentioned in a number of places that best practice wasn't followed in order to stick to the original script's structure, so further tutorials will cover this in greater detail. " + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python [conda env:monai]", + "language": "python", + "name": "conda-env-monai-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.10" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/bundle/03_mednist_classification_v2.ipynb b/bundle/03_mednist_classification_v2.ipynb new file mode 100644 index 0000000000..6aff84cc80 --- /dev/null +++ b/bundle/03_mednist_classification_v2.ipynb @@ -0,0 +1,976 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "9b2bc6c7-f54c-436f-ab66-86a631fb75d8", + "metadata": {}, + "source": [ + "Copyright (c) MONAI Consortium \n", + "Licensed under the Apache License, Version 2.0 (the \"License\"); \n", + "you may not use this file except in compliance with the License. 
\n", + "You may obtain a copy of the License at \n", + "    http://www.apache.org/licenses/LICENSE-2.0 \n", + "Unless required by applicable law or agreed to in writing, software \n", + "distributed under the License is distributed on an \"AS IS\" BASIS, \n", + "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. \n", + "See the License for the specific language governing permissions and \n", + "limitations under the License." + ] + }, + { + "cell_type": "markdown", + "id": "ddfe7d95-3567-4cb2-9eb5-65f235113768", + "metadata": {}, + "source": [ + "## Setup environment" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "fab1bcae-678b-4b19-a513-d0577d3d7e2b", + "metadata": {}, + "outputs": [], + "source": [ + "!python -c \"import monai\" || pip install -q \"monai-weekly[ignite,pyyaml]\"" + ] + }, + { + "cell_type": "markdown", + "id": "c8ae8b11-f5cf-4f91-ac60-8660f2ab2a4d", + "metadata": {}, + "source": [ + "## Setup imports" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "f1492c89-b19f-4216-b3a0-9960397e72ca", + "metadata": {}, + "outputs": [], + "source": [ + "import numpy as np\n", + "from monai.apps import MedNISTDataset\n", + "from monai.config import print_config\n", + "\n", + "print_config()" + ] + }, + { + "cell_type": "markdown", + "id": "2682936a-09ed-4703-af06-c59f755395ee", + "metadata": {}, + "source": [ + "# MedNIST Classification Bundle\n", + "\n", + "In this tutorial we'll revisit the bundle replicating [MONAI 101 notebook](https://github.com/Project-MONAI/tutorials/blob/main/2d_classification/monai_101.ipynb) and add more features representing best practice concepts. 
This will include evaluation and checkpoint saving techniques.\n", + "\n", + "We'll first create a bundle very much like in the previous tutorial with the same metadata and common script file:" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "eb9dc6ec-13da-4a37-8afa-28e2766b9343", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "/usr/bin/tree\n", + "\u001b[01;34mMedNISTClassifier_v2\u001b[00m\n", + "├── \u001b[01;34mconfigs\u001b[00m\n", + "│   ├── inference.json\n", + "│   └── metadata.json\n", + "├── \u001b[01;34mdocs\u001b[00m\n", + "│   └── README.md\n", + "├── LICENSE\n", + "└── \u001b[01;34mmodels\u001b[00m\n", + "\n", + "3 directories, 4 files\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "python -m monai.bundle init_bundle MedNISTClassifier_v2\n", + "which tree && tree MedNISTClassifier_v2 || true" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "b29f053b-cf16-4ffc-bbe7-d9433fdfa872", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Overwriting MedNISTClassifier_v2/configs/metadata.json\n" + ] + } + ], + "source": [ + "%%writefile MedNISTClassifier_v2/configs/metadata.json\n", + "\n", + "{\n", + " \"version\": \"0.0.1\",\n", + " \"changelog\": {\n", + " \"0.0.1\": \"Initial version\"\n", + " },\n", + " \"monai_version\": \"1.2.0\",\n", + " \"pytorch_version\": \"2.0.0\",\n", + " \"numpy_version\": \"1.23.5\",\n", + " \"optional_packages_version\": {},\n", + " \"name\": \"MedNISTClassifier\",\n", + " \"task\": \"MedNIST Classification Network\",\n", + " \"description\": \"This is a demo network for classifying MedNIST images by type/modality\",\n", + " \"authors\": \"Your Name Here\",\n", + " \"copyright\": \"Copyright (c) Your Name Here\",\n", + " \"data_source\": \"MedNIST dataset kindly made available by Dr. Bradley J. Erickson M.D., Ph.D. 
(Department of Radiology, Mayo Clinic)\",\n", + " \"data_type\": \"jpeg\",\n", + " \"intended_use\": \"This is suitable for demonstration only\",\n", + " \"network_data_format\": {\n", + " \"inputs\": {\n", + " \"image\": {\n", + " \"type\": \"image\",\n", + " \"format\": \"magnitude\",\n", + " \"modality\": \"any\",\n", + " \"num_channels\": 1,\n", + " \"spatial_shape\": [64, 64],\n", + " \"dtype\": \"float32\",\n", + " \"value_range\": [0, 1],\n", + " \"is_patch_data\": false,\n", + " \"channel_def\": {\n", + " \"0\": \"image\"\n", + " }\n", + " }\n", + " },\n", + " \"outputs\": {\n", + " \"pred\": {\n", + " \"type\": \"probabilities\",\n", + " \"format\": \"classes\",\n", + " \"num_channels\": 6,\n", + " \"spatial_shape\": [6],\n", + " \"dtype\": \"float32\",\n", + " \"value_range\": [0, 1],\n", + " \"is_patch_data\": false,\n", + " \"channel_def\": {\n", + " \"0\": \"AbdomenCT\",\n", + " \"1\": \"BreastMRI\",\n", + " \"2\": \"CXR\",\n", + " \"3\": \"ChestCT\",\n", + " \"4\": \"Hand\",\n", + " \"5\": \"HeadCT\"\n", + " }\n", + " }\n", + " }\n", + " }\n", + "}" + ] + }, + { + "cell_type": "markdown", + "id": "04826c73-7c26-4c5e-8d2a-8968c3954b5a", + "metadata": {}, + "source": [ + "As you've likely seen in outputs, there should be a `logging.conf` file in the `configs` directory to set up the Python logger appropriately. 
This will improve the output we get in the notebook:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "0cb1b023-d192-4ad7-b2eb-c4a2c6b42b84", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing MedNISTClassifier_v2/configs/logging.conf\n" + ] + } + ], + "source": [ + "%%writefile MedNISTClassifier_v2/configs/logging.conf\n", + "\n", + "[loggers]\n", + "keys=root\n", + "\n", + "[handlers]\n", + "keys=consoleHandler\n", + "\n", + "[formatters]\n", + "keys=fullFormatter\n", + "\n", + "[logger_root]\n", + "level=INFO\n", + "handlers=consoleHandler\n", + "\n", + "[handler_consoleHandler]\n", + "class=StreamHandler\n", + "level=INFO\n", + "formatter=fullFormatter\n", + "args=(sys.stdout,)\n", + "\n", + "[formatter_fullFormatter]\n", + "format=%(asctime)s - %(name)s - %(levelname)s - %(message)s\n" + ] + }, + { + "cell_type": "markdown", + "id": "b306ff33-c39b-4822-b6d4-346987cfe87b", + "metadata": {}, + "source": [ + "We'll change the common file slightly by adding some extra symbols, specifically `bundle_root` which should always be present in bundles. We'll keep `root_dir` since it's used to determine where MedNIST is downloaded to." 
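, + "\n", + "\n", + "As a purely illustrative sketch (the keys below are hypothetical and not used elsewhere in this tutorial), further bundle-relative paths can be composed from `bundle_root` with the same `$`-expression syntax used for `ckpt_path` in the file below:\n", + "\n", + "```yaml\n", + "# hypothetical extra definitions composing paths from bundle_root\n", + "ts_path: $@bundle_root + '/models/model.ts'\n", + "log_conf: $@bundle_root + '/configs/logging.conf'\n", + "```\n", + "\n", + "Like any other definition, these could then be overridden on the command line when running the bundle."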
+ ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "d11681af-3210-4b2b-b7bd-8ad8dedfe230", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing MedNISTClassifier_v2/configs/common.yaml\n" + ] + } + ], + "source": [ + "%%writefile MedNISTClassifier_v2/configs/common.yaml\n", + "\n", + "# added a few more imports\n", + "imports: \n", + "- $import torch\n", + "- $import datetime\n", + "- $import os\n", + "\n", + "root_dir: .\n", + "\n", + "# use constants from MONAI instead of hard-coding names\n", + "image: $monai.utils.CommonKeys.IMAGE\n", + "label: $monai.utils.CommonKeys.LABEL\n", + "pred: $monai.utils.CommonKeys.PRED\n", + "\n", + "# these are added definitions\n", + "bundle_root: .\n", + "ckpt_path: $@bundle_root + '/models/model.pt'\n", + "\n", + "# define a device for the network\n", + "device: '$torch.device(''cuda:0'')'\n", + "\n", + "# store the class names for inference later\n", + "class_names: [AbdomenCT, BreastMRI, CXR, ChestCT, Hand, HeadCT]\n", + "\n", + "# define the network separately, don't need to refer to MONAI types by name or import MONAI\n", + "network_def:\n", + " _target_: densenet121\n", + " spatial_dims: 2\n", + " in_channels: 1\n", + " out_channels: 6\n", + "\n", + "# define the network to be the given definition moved to the device\n", + "net: '$@network_def.to(@device)'\n", + "\n", + "# define a transform sequence as a list of transform objects instead of using Compose here\n", + "train_transforms:\n", + "- _target_: LoadImaged\n", + " keys: '@image'\n", + " image_only: true\n", + "- _target_: EnsureChannelFirstd\n", + " keys: '@image'\n", + "- _target_: ScaleIntensityd\n", + " keys: '@image'\n", + " " + ] + }, + { + "cell_type": "markdown", + "id": "eaf81ea7-9ea3-4548-a32e-992f0b9bc0ab", + "metadata": {}, + "source": [ + "\n", + "## Training\n", + "\n", + "For training we have the same elements again but we'll add a `SupervisedEvaluator` object to track model 
progress with handlers to save checkpoints. " + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "4dfd052e-abe7-473a-bbf4-25674a3b20ea", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing MedNISTClassifier_v2/configs/train.yaml\n" + ] + } + ], + "source": [ + "%%writefile MedNISTClassifier_v2/configs/train.yaml\n", + "\n", + "max_epochs: 25\n", + "learning_rate: 0.00001 # learning rate, again artificially slow\n", + "val_interval: 1 # run validation every n'th epoch\n", + "save_interval: 1 # save the model weights every n'th epoch\n", + "\n", + "# choose a unique output subdirectory every time training is started, \n", + "output_dir: '$datetime.datetime.now().strftime(@root_dir+''/output/output_%y%m%d_%H%M%S'')'\n", + "\n", + "train_dataset:\n", + " _target_: MedNISTDataset\n", + " root_dir: '@root_dir'\n", + " transform: \n", + " _target_: Compose\n", + " transforms: '@train_transforms'\n", + " section: training\n", + " download: true\n", + "\n", + "train_dl:\n", + " _target_: DataLoader\n", + " dataset: '@train_dataset'\n", + " batch_size: 512\n", + " shuffle: true\n", + " num_workers: 4\n", + "\n", + "# separate dataset taking from the \"validation\" section\n", + "eval_dataset:\n", + " _target_: MedNISTDataset\n", + " root_dir: '@root_dir'\n", + " transform: \n", + " _target_: Compose\n", + " transforms: '$@train_transforms'\n", + " section: validation\n", + " download: true\n", + "\n", + "# separate dataloader for evaluation\n", + "eval_dl:\n", + " _target_: DataLoader\n", + " dataset: '@eval_dataset'\n", + " batch_size: 512\n", + " shuffle: false\n", + " num_workers: 4\n", + "\n", + "# transforms applied to network output, in this case applying activation, argmax, and one-hot-encoding\n", + "post_transform:\n", + " _target_: Compose\n", + " transforms:\n", + " - _target_: Activationsd\n", + " keys: '@pred'\n", + " softmax: true # apply softmax to the prediction to emphasize the most 
likely value\n", + " - _target_: AsDiscreted\n", + " keys: ['@label','@pred']\n", + " argmax: [false, true] # apply argmax to the prediction only to get a class index number\n", + " to_onehot: 6 # convert both prediction and label to one-hot format so that both have shape (6,)\n", + "\n", + "# separating out loss, inferer, and optimizer definitions\n", + "\n", + "loss_function:\n", + " _target_: torch.nn.CrossEntropyLoss\n", + "\n", + "inferer: \n", + " _target_: SimpleInferer\n", + "\n", + "optimizer: \n", + " _target_: torch.optim.Adam\n", + " params: '$@net.parameters()'\n", + " lr: '@learning_rate'\n", + "\n", + "# Handlers to load the checkpoint if present, run validation at the chosen interval, save the checkpoint\n", + "# at the chosen interval, log stats, and write the log to a file in the output directory.\n", + "handlers:\n", + "- _target_: CheckpointLoader\n", + " _disabled_: '$not os.path.exists(@ckpt_path)'\n", + " load_path: '@ckpt_path'\n", + " load_dict:\n", + " model: '@net'\n", + "- _target_: ValidationHandler\n", + " validator: '@evaluator'\n", + " epoch_level: true\n", + " interval: '@val_interval'\n", + "- _target_: CheckpointSaver\n", + " save_dir: '@output_dir'\n", + " save_dict:\n", + " model: '@net'\n", + " save_interval: '@save_interval'\n", + " save_final: true # save the final weights, either when the run finishes or is interrupted somehow\n", + "- _target_: StatsHandler\n", + " name: train_loss\n", + " tag_name: train_loss\n", + " output_transform: '$monai.handlers.from_engine([''loss''], first=True)' # print per-iteration loss\n", + "- _target_: LogfileHandler\n", + " output_dir: '@output_dir'\n", + "\n", + "trainer:\n", + " _target_: SupervisedTrainer\n", + " device: '@device'\n", + " max_epochs: '@max_epochs'\n", + " train_data_loader: '@train_dl'\n", + " network: '@net'\n", + " optimizer: '@optimizer'\n", + " loss_function: '@loss_function'\n", + " inferer: '@inferer'\n", + " train_handlers: '@handlers'\n", + "\n", + "# validation 
handlers which log stats and direct the log to a file\n", + "val_handlers:\n", + "- _target_: StatsHandler\n", + " name: val_stats\n", + " output_transform: '$lambda x: None'\n", + "- _target_: LogfileHandler\n", + " output_dir: '@output_dir'\n", + " \n", + "# Metrics to assess validation results. You can define more than one here, but you may\n", + "# need to adapt the format of pred and label.\n", + "metrics:\n", + " accuracy:\n", + " _target_: 'ignite.metrics.Accuracy'\n", + " output_transform: '$monai.handlers.from_engine([@pred, @label])'\n", + "\n", + "# runs the evaluation process, invoked by the trainer via the ValidationHandler object\n", + "evaluator:\n", + " _target_: SupervisedEvaluator\n", + " device: '@device'\n", + " val_data_loader: '@eval_dl'\n", + " network: '@net'\n", + " inferer: '@inferer'\n", + " postprocessing: '@post_transform'\n", + " key_val_metric: '@metrics'\n", + " val_handlers: '@val_handlers'\n", + "\n", + "train:\n", + "- '$@trainer.run()'\n" + ] + }, + { + "cell_type": "markdown", + "id": "de752181-80b1-4221-9e4a-315e5f7f22a6", + "metadata": {}, + "source": [ + "We can now train as normal, specifying the logging config file and a maximum number of epochs; you will probably want to set this higher for a good result:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "8357670d-fe69-4789-9b9a-77c0d8144b10", + "metadata": {}, + "outputs": [], + "source": [ + "%%bash\n", + "\n", + "BUNDLE=\"./MedNISTClassifier_v2\"\n", + "\n", + "python -m monai.bundle run train \\\n", + " --bundle_root \"$BUNDLE\" \\\n", + " --logging_file \"$BUNDLE/configs/logging.conf\" \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"['$BUNDLE/configs/common.yaml','$BUNDLE/configs/train.yaml']\" \\\n", + " --max_epochs 2 &> out.txt || true" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3d7e7e11-db67-47e3-a03d-0955feee1636", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "print(open(\"out.txt\").read())  # show the captured training log" + ] + }, + { + "cell_type": "markdown", + "id": "627bf8a5-1524-425f-93f8-28e217f2adec", + "metadata": {}, + "source": [ + "Results and logs are written to unique timestamped directories:" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "id": "00c84e2c-1709-4136-8612-87142026ac2e", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "/usr/bin/tree\n", + "\u001b[01;34moutput/output_230911_164547\u001b[00m\n", + "├── log.txt\n", + "├── model_epoch=1.pt\n", + "├── model_epoch=2.pt\n", + "└── model_final_iteration=186.pt\n", + "\n", + "0 directories, 4 files\n" + ] + } + ], + "source": [ + "!which tree && tree output/* || true" + ] + }, + { + "cell_type": "markdown", + "id": "5705ff79-fe58-410a-bb93-80b4f3fa2ea2", + "metadata": {}, + "source": [ + "## Inference\n", + "\n", + "We also need an inference script which will apply a loaded network to every image in a given directory and write the results to a file or to the log output. For segmentation networks this should save generated segmentations to known locations, but for this classification network we'll stick to just printing results to the log. 
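The inference config created below builds its input list with a glob expression over the input directory. As a plain-Python sketch of what that `input_files` line evaluates to (the directory and files here are temporary stand-ins, not real MedNIST data):

```python
# Mirrors the config line:
#   input_files: '$[{@image: f} for f in sorted(glob.glob(@input_dir+"/*.*"))]'
# where @image resolves to the string "image".
import glob
import os
import tempfile

input_dir = tempfile.mkdtemp()
for name in ["b.jpeg", "a.jpeg"]:
    open(os.path.join(input_dir, name), "w").close()  # empty stand-in "images"

# one dictionary per file, ready for dictionary-based transforms
input_files = [{"image": f} for f in sorted(glob.glob(input_dir + "/*.*"))]
print([os.path.basename(d["image"]) for d in input_files])  # ['a.jpeg', 'b.jpeg']
```

Each entry is a dictionary so that the same dictionary transforms used in training can be reused at inference time.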
\n", + "\n", + "The first thing to do is create a test directory with only a few test images so we can demonstrate inference quickly:" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "id": "3a957503-39e4-4f73-a989-ce6e4e2d3e9e", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Loading dataset: 100%|██████████| 5895/5895 [00:03<00:00, 1671.21it/s]\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "MedNIST/AbdomenCT/001990.jpeg Label: 0\n", + "MedNIST/BreastMRI/007676.jpeg Label: 1\n", + "MedNIST/ChestCT/006763.jpeg Label: 3\n", + "MedNIST/CXR/001214.jpeg Label: 2\n", + "MedNIST/Hand/004427.jpeg Label: 4\n", + "MedNIST/HeadCT/003806.jpeg Label: 5\n", + "MedNIST/HeadCT/004638.jpeg Label: 5\n", + "MedNIST/CXR/005013.jpeg Label: 2\n", + "MedNIST/ChestCT/008275.jpeg Label: 3\n", + "MedNIST/BreastMRI/000630.jpeg Label: 1\n", + "MedNIST/BreastMRI/007547.jpeg Label: 1\n", + "MedNIST/BreastMRI/008425.jpeg Label: 1\n", + "MedNIST/AbdomenCT/003981.jpeg Label: 0\n", + "MedNIST/Hand/001130.jpeg Label: 4\n", + "MedNIST/BreastMRI/005118.jpeg Label: 1\n", + "MedNIST/CXR/006505.jpeg Label: 2\n", + "MedNIST/ChestCT/008218.jpeg Label: 3\n", + "MedNIST/HeadCT/005305.jpeg Label: 5\n", + "MedNIST/AbdomenCT/007871.jpeg Label: 0\n", + "MedNIST/Hand/007065.jpeg Label: 4\n" + ] + } + ], + "source": [ + "root_dir = \".\" # assuming MedNIST was downloaded to the current directory\n", + "num_images = 20\n", + "dataset = MedNISTDataset(root_dir=root_dir, section=\"test\", download=False)\n", + "\n", + "!mkdir -p test_images\n", + "\n", + "for i in range(num_images):\n", + " filename = dataset[i][\"image_meta_dict\"][\"filename_or_obj\"]\n", + " print(filename, \"Label:\", dataset[i][\"label\"])\n", + " !cp {root_dir}/{filename} test_images" + ] + }, + { + "cell_type": "markdown", + "id": "0044efdc-6c5e-479c-880b-acd9e7ab4fea", + "metadata": { + "tags": [] + }, + "source": [ + "Next remove the existing example 
inference script:" + ] + }, + { + "cell_type": "code", + "execution_count": 12, + "id": "7f800520-f29f-4b80-9af4-5e069f97824b", + "metadata": { + "tags": [] + }, + "outputs": [], + "source": [ + "!rm \"MedNISTClassifier_v2/configs/inference.json\"" + ] + }, + { + "cell_type": "markdown", + "id": "ef85014c-d1eb-4a93-911b-f405eac74094", + "metadata": {}, + "source": [ + "Next we'll create the inference script which will apply the network to all the files in the given directory (thus assuming all are images) and save the results to a csv file:" + ] + }, + { + "cell_type": "code", + "execution_count": 13, + "id": "3c5556db-2e63-484c-9358-977b4c35d60f", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing MedNISTClassifier_v2/configs/inference.yaml\n" + ] + } + ], + "source": [ + "%%writefile MedNISTClassifier_v2/configs/inference.yaml\n", + "\n", + "imports:\n", + "- $import glob\n", + "\n", + "input_dir: 'input'\n", + "# dataset is a list of dictionaries to work with dictionary transforms\n", + "input_files: '$[{@image: f} for f in sorted(glob.glob(@input_dir+''/*.*''))]'\n", + "\n", + "infer_dataset:\n", + " _target_: Dataset\n", + " data: '@input_files'\n", + " transform: \n", + " _target_: Compose\n", + " transforms: '@train_transforms'\n", + "\n", + "infer_dl:\n", + " _target_: DataLoader\n", + " dataset: '@infer_dataset'\n", + " batch_size: 1\n", + " shuffle: false\n", + " num_workers: 0\n", + "\n", + "# transforms applied to network output, same as those in training except \"label\" isn't present\n", + "post_transform:\n", + " _target_: Compose\n", + " transforms:\n", + " - _target_: Activationsd\n", + " keys: '@pred'\n", + " softmax: true \n", + " - _target_: AsDiscreted\n", + " keys: ['@pred']\n", + " argmax: true \n", + "\n", + "# handlers to load the checkpoint file (and fail if a file isn't found), and save classification results to a csv file\n", + "handlers:\n", + "- _target_: CheckpointLoader\n", + " 
load_path: '@ckpt_path'\n", + " load_dict:\n", + " model: '@net'\n", + "- _target_: ClassificationSaver\n", + " batch_transform: '$lambda batch: batch[0][@image].meta'\n", + " output_transform: '$monai.handlers.from_engine([''pred''])'\n", + "\n", + "inferer: \n", + " _target_: SimpleInferer\n", + "\n", + "evaluator:\n", + " _target_: SupervisedEvaluator\n", + " device: '@device'\n", + " val_data_loader: '@infer_dl'\n", + " network: '@net'\n", + " inferer: '@inferer'\n", + " postprocessing: '@post_transform'\n", + " val_handlers: '@handlers'\n", + "\n", + "inference:\n", + "- '$@evaluator.run()'" + ] + }, + { + "cell_type": "markdown", + "id": "5e9a706a-b135-4943-8245-0da8d5dad415", + "metadata": {}, + "source": [ + "Inference can now be run, specifying the checkpoint file to load as being one from our training run and the input directory as \"test_images\" which was created above:" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "id": "acdcc111-f259-4701-8b1d-31fcf74398bc", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2023-09-11 16:54:49,564 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-09-11 16:54:49,564 - INFO - > run_id: 'inference'\n", + "2023-09-11 16:54:49,564 - INFO - > meta_file: './MedNISTClassifier_v2/configs/metadata.json'\n", + "2023-09-11 16:54:49,564 - INFO - > config_file: ['./MedNISTClassifier_v2/configs/common.yaml',\n", + " './MedNISTClassifier_v2/configs/inference.yaml']\n", + "2023-09-11 16:54:49,564 - INFO - > logging_file: './MedNISTClassifier_v2/configs/logging.conf'\n", + "2023-09-11 16:54:49,565 - INFO - > bundle_root: './MedNISTClassifier_v2'\n", + "2023-09-11 16:54:49,565 - INFO - > ckpt_path: 'output/output_230911_164547/model_final_iteration=186.pt'\n", + "2023-09-11 16:54:49,565 - INFO - > input_dir: 'test_images'\n", + "2023-09-11 16:54:49,565 - INFO - ---\n", + "\n", + "\n", + "2023-09-11 16:54:49,565 - INFO - Setting logging properties 
based on config: ./MedNISTClassifier_v2/configs/logging.conf.\n", + "2023-09-11 16:54:49,924 - ignite.engine.engine.SupervisedEvaluator - INFO - Engine run resuming from iteration 0, epoch 0 until 1 epochs\n", + "2023-09-11 16:54:50,035 - ignite.engine.engine.SupervisedEvaluator - INFO - Restored all variables from output/output_230911_164547/model_final_iteration=186.pt\n", + "2023-09-11 16:54:50,936 - ignite.engine.engine.SupervisedEvaluator - INFO - Epoch[1] Complete. Time taken: 00:00:00.901\n", + "2023-09-11 16:54:50,936 - ignite.engine.engine.SupervisedEvaluator - INFO - Engine run complete. Time taken: 00:00:01.012\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "BUNDLE=\"./MedNISTClassifier_v2\"\n", + "# need to capture name since it'll be different for you\n", + "ckpt=$(find output -name 'model_final_iteration=186.pt'|sort|tail -1)\n", + "\n", + "python -m monai.bundle run inference \\\n", + " --bundle_root \"$BUNDLE\" \\\n", + " --logging_file \"$BUNDLE/configs/logging.conf\" \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"['$BUNDLE/configs/common.yaml','$BUNDLE/configs/inference.yaml']\" \\\n", + " --ckpt_path \"$ckpt\" \\\n", + " --input_dir test_images " + ] + }, + { + "cell_type": "markdown", + "id": "955faa08-0552-4bff-ba84-238e9a404f62", + "metadata": {}, + "source": [ + "This will save the results of the inference to \"predictions.csv\" by default. You can change what the output filename is with an argument like `'--handlers#1#filename' pred.csv` which will directly change the `filename` parameter of the appropriate handler. 
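The `#`-separated override path can be sketched in plain Python. This mini resolver is an illustration of the addressing behaviour, not MONAI's actual implementation:

```python
# Sketch of how an override path like "handlers#1#filename" addresses a nested
# config structure: "#" separates keys, and numeric parts index into lists.
def set_by_path(cfg, path, value):
    keys = [int(k) if k.isdigit() else k for k in path.split("#")]
    for k in keys[:-1]:
        cfg = cfg[k]
    cfg[keys[-1]] = value

config = {"handlers": [{"_target_": "CheckpointLoader"},
                       {"_target_": "ClassificationSaver", "filename": "predictions.csv"}]}
set_by_path(config, "handlers#1#filename", "pred.csv")
print(config["handlers"][1]["filename"])  # pred.csv
```

Here `handlers#1` selects the second handler (the `ClassificationSaver`) and `filename` is the parameter being replaced.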
Note the single quotes around the argument name since the hash sigil is interpreted by Bash as a comment otherwise.\n", + "\n", + "Looking at the output, the results aren't terribly legible:" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "id": "4a695039-7a53-4f9a-9754-769a9f8ebac8", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "test_images/000630.jpeg,1.0\n", + "test_images/001130.jpeg,4.0\n", + "test_images/001214.jpeg,2.0\n", + "test_images/001990.jpeg,0.0\n", + "test_images/003806.jpeg,5.0\n", + "test_images/003981.jpeg,0.0\n", + "test_images/004427.jpeg,4.0\n", + "test_images/004638.jpeg,5.0\n", + "test_images/005013.jpeg,2.0\n", + "test_images/005118.jpeg,1.0\n", + "test_images/005305.jpeg,5.0\n", + "test_images/006505.jpeg,2.0\n", + "test_images/006763.jpeg,3.0\n", + "test_images/007065.jpeg,4.0\n", + "test_images/007547.jpeg,1.0\n", + "test_images/007676.jpeg,1.0\n", + "test_images/007871.jpeg,0.0\n", + "test_images/008218.jpeg,3.0\n", + "test_images/008275.jpeg,3.0\n", + "test_images/008425.jpeg,1.0\n" + ] + } + ], + "source": [ + "!cat predictions.csv" + ] + }, + { + "cell_type": "markdown", + "id": "a231c937-9ced-4a6d-b01c-3bc9a128fd62", + "metadata": {}, + "source": [ + "The second column is the predicted class which we can use as an index into our list of class names to get something more readable:" + ] + }, + { + "cell_type": "code", + "execution_count": 24, + "id": "1065f928-3f66-47af-aed4-be2f0443cf2f", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "test_images/000630.jpeg BreastMRI\n", + "test_images/001130.jpeg Hand\n", + "test_images/001214.jpeg CXR\n", + "test_images/001990.jpeg AbdomenCT\n", + "test_images/003806.jpeg HeadCT\n", + "test_images/003981.jpeg AbdomenCT\n", + "test_images/004427.jpeg Hand\n", + "test_images/004638.jpeg HeadCT\n", + "test_images/005013.jpeg CXR\n", + "test_images/005118.jpeg BreastMRI\n", + 
"test_images/005305.jpeg HeadCT\n", + "test_images/006505.jpeg CXR\n", + "test_images/006763.jpeg ChestCT\n", + "test_images/007065.jpeg Hand\n", + "test_images/007547.jpeg BreastMRI\n", + "test_images/007676.jpeg BreastMRI\n", + "test_images/007871.jpeg AbdomenCT\n", + "test_images/008218.jpeg ChestCT\n", + "test_images/008275.jpeg ChestCT\n", + "test_images/008425.jpeg BreastMRI\n" + ] + } + ], + "source": [ + "class_names = [\"AbdomenCT\", \"BreastMRI\", \"CXR\", \"ChestCT\", \"Hand\", \"HeadCT\"]\n", + "\n", + "for fn, idx in np.loadtxt(\"predictions.csv\", delimiter=\",\", dtype=str):\n", + " print(fn, class_names[int(float(idx))])" + ] + }, + { + "cell_type": "markdown", + "id": "235e90b9-9209-4a58-885d-042ab55c9c18", + "metadata": {}, + "source": [ + "## Putting the Bundle Together\n", + "\n", + "We have a checkpoint for our network which produces good results that we can now make the \"official\" shared weights for the bundle. We need to copy the checkpoint into the `models` directory and optionally produce a Torchscript version of the network. 
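The bash cells in this notebook locate the newest checkpoint with `find output -name ... | sort | tail -1`. A standard-library sketch of the same "pick the latest timestamped run" selection, using fabricated run directory names:

```python
# Build two fake timestamped run directories, then select the newest checkpoint
# by lexicographic sort (the timestamped names sort chronologically).
from pathlib import Path
import tempfile

output = Path(tempfile.mkdtemp())
for run in ["output_230911_164547", "output_230912_090000"]:
    d = output / run
    d.mkdir()
    (d / "model_final_iteration=186.pt").touch()

ckpts = sorted(output.glob("*/model_final_iteration=186.pt"))
print(ckpts[-1].parent.name)  # output_230912_090000 sorts last
```

This works because the `output_YYMMDD_HHMMSS` naming scheme makes string order match time order.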
\n", + "\n", + "For the Torchscript conversion, MONAI provides the `ckpt_export` program in the `monai.bundle` submodule:" + ] + }, + { + "cell_type": "code", + "execution_count": 25, + "id": "c6672caa-fd51-4dde-a31d-5c4de8c3cc1d", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2023-09-11 16:57:08,807 - INFO - --- input summary of monai.bundle.scripts.ckpt_export ---\n", + "2023-09-11 16:57:08,807 - INFO - > net_id: 'network_def'\n", + "2023-09-11 16:57:08,807 - INFO - > filepath: './MedNISTClassifier_v2/models/model.ts'\n", + "2023-09-11 16:57:08,807 - INFO - > meta_file: './MedNISTClassifier_v2/configs/metadata.json'\n", + "2023-09-11 16:57:08,807 - INFO - > config_file: ['./MedNISTClassifier_v2/configs/common.yaml',\n", + " './MedNISTClassifier_v2/configs/inference.yaml']\n", + "2023-09-11 16:57:08,807 - INFO - > ckpt_file: './MedNISTClassifier_v2/models/model.pt'\n", + "2023-09-11 16:57:08,807 - INFO - > key_in_ckpt: 'model'\n", + "2023-09-11 16:57:08,807 - INFO - > bundle_root: './MedNISTClassifier_v2'\n", + "2023-09-11 16:57:08,807 - INFO - ---\n", + "\n", + "\n", + "2023-09-11 16:57:12,519 - INFO - exported to file: ./MedNISTClassifier_v2/models/model.ts.\n", + "/usr/bin/tree\n", + "\u001b[01;34m./MedNISTClassifier_v2\u001b[00m\n", + "├── \u001b[01;34mconfigs\u001b[00m\n", + "│   ├── common.yaml\n", + "│   ├── inference.yaml\n", + "│   ├── logging.conf\n", + "│   ├── metadata.json\n", + "│   └── train.yaml\n", + "├── \u001b[01;34mdocs\u001b[00m\n", + "│   └── README.md\n", + "├── LICENSE\n", + "└── \u001b[01;34mmodels\u001b[00m\n", + " ├── model.pt\n", + " └── model.ts\n", + "\n", + "3 directories, 9 files\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "BUNDLE=\"./MedNISTClassifier_v2\"\n", + "\n", + "ckpt=$(find output -name 'model_final_iteration=186.pt'|sort|tail -1)\n", + "cp \"$ckpt\" \"$BUNDLE/models/model.pt\"\n", + "\n", + "python -m monai.bundle ckpt_export \\\n", + " --bundle_root 
\"$BUNDLE\" \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"['$BUNDLE/configs/common.yaml','$BUNDLE/configs/inference.yaml']\" \\\n", + " --net_id network_def \\\n", + " --key_in_ckpt model \\\n", + " --ckpt_file \"$BUNDLE/models/model.pt\" \\\n", + " --filepath \"$BUNDLE/models/model.ts\" \n", + "\n", + "which tree && tree \"$BUNDLE\" || true" + ] + }, + { + "cell_type": "markdown", + "id": "8def15f8-d0dc-4ed0-8bf7-669e0720ac81", + "metadata": {}, + "source": [ + "This will have produced the `model.ts` file in `models` as shown here which can be loaded in Python without the bundle config scripts like any other Torchscript object.\n", + "\n", + "The arguments for the `ckpt_export` command specify the components to use in the config files and the checkpoint:\n", + "* `bundle_root`, `meta_file`, and `config_file` are as in previous usages.\n", + "* `net_id` specifies the object in the config files which represents the network definition, ie. the instantiated network object.\n", + "* `key_in_ckpt` names the key under which the weights for the model are found in the checkpoint, this assumes the checkpoint is a dictionary which is what `CheckpointSaver` produces, if this file isn't a dictionary omit this argument.\n", + "* `ckpt_file` the name of the checkpoint file itself\n", + "* `filepath` the output filename to store the Torchscript object to." 
+ ] + }, + { + "cell_type": "markdown", + "id": "18a62139-8a21-4bb9-96d4-e86d61298c40", + "metadata": {}, + "source": [ + "## Summary and Next\n", + "\n", + "This tutorial has covered MONAI Bundle best practices:\n", + " * Separating common definitions into config files which are combined with application-specific files\n", + " * Separating out definitions in config files for easier reading and modification\n", + " * Using Engine-based classes for training and validation\n", + " * Simple training run management with uniquely-created results directories\n", + " * An inference script to generate a results CSV file containing predictions\n", + " \n", + "The next tutorial will discuss creating bundles to wrap pre-existing PyTorch code so that you can get code into the bundle ecosystem without rewriting the world." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python [conda env:monai1]", + "language": "python", + "name": "conda-env-monai1-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.18" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/bundle/04_integrating_code.ipynb b/bundle/04_integrating_code.ipynb new file mode 100644 index 0000000000..ee1986a328 --- /dev/null +++ b/bundle/04_integrating_code.ipynb @@ -0,0 +1,925 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "id": "c0f57371-fbd0-4a3e-94fb-4c9c8aea956c", + "metadata": {}, + "source": [ + "Copyright (c) MONAI Consortium \n", + "Licensed under the Apache License, Version 2.0 (the \"License\"); \n", + "you may not use this file except in compliance with the License. 
\n", + "You may obtain a copy of the License at \n", + "    http://www.apache.org/licenses/LICENSE-2.0 \n", + "Unless required by applicable law or agreed to in writing, software \n", + "distributed under the License is distributed on an \"AS IS\" BASIS, \n", + "WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. \n", + "See the License for the specific language governing permissions and \n", + "limitations under the License." + ] + }, + { + "cell_type": "markdown", + "id": "91b49f99-5a9f-4bbe-a034-fb8a5f3fc71d", + "metadata": {}, + "source": [ + "## Setup environment" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "cd80c262-cf94-48df-b78e-c54a88a7ffb5", + "metadata": {}, + "outputs": [], + "source": [ + "!python -c \"import monai\" || pip install -q \"monai-weekly[ignite,pyyaml]\"" + ] + }, + { + "cell_type": "markdown", + "id": "c36673a2-02cd-4eea-90ef-8226832c30d0", + "metadata": {}, + "source": [ + "## Setup imports" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "eeeee791-025e-4b1d-9dec-ebc83a8be4eb", + "metadata": {}, + "outputs": [], + "source": [ + "import torchvision\n", + "from monai.config import print_config\n", + "\n", + "print_config()" + ] + }, + { + "cell_type": "markdown", + "id": "0fdad73c-f1ab-4874-9e4e-af687f78801a", + "metadata": {}, + "source": [ + "# Integrating Non-MONAI Code Into a Bundle\n", + "\n", + "This notebook will discuss strategies for integrating non-MONAI deep learning code into a bundle. This allows existing Pytorch workflows to be integrated into the bundle ecosystem, for example as a distributable bundle for the model zoo or some other repository like Hugging Face, or to integrate with MONAI Label. 
The assumption taken here is that you already have the components for preprocessing, inference, validation, and other parts of a workflow, and so the task is how to integrate these parts into MONAI types which can be embedded in config files.\n", + "\n", + "In the following cells we'll construct a bundle which follows the [CIFAR10 tutorial](https://github.com/pytorch/tutorials/blob/32d834139b8627eeacb5fb2862be9f095fcb0b52/beginner_source/blitz/cifar10_tutorial.py) in Pytorch's tutorials repo. A number of code components will be copied into the `scripts` directory of the bundle and linked into config files suitable to be used on the command line.\n", + "\n", + "We'll start with an initialised bundle with a \"scripts\" directory and provide an appropriate metadata file describing the CIFAR10 classification network we'll provide:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "eb9dc6ec-13da-4a37-8afa-28e2766b9343", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "/usr/bin/tree\n", + "\u001b[01;34mIntegrationBundle\u001b[00m\n", + "├── \u001b[01;34mconfigs\u001b[00m\n", + "│   └── metadata.json\n", + "├── \u001b[01;34mdocs\u001b[00m\n", + "│   └── README.md\n", + "├── LICENSE\n", + "├── \u001b[01;34mmodels\u001b[00m\n", + "└── \u001b[01;34mscripts\u001b[00m\n", + " └── __init__.py\n", + "\n", + "4 directories, 4 files\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "python -m monai.bundle init_bundle IntegrationBundle\n", + "rm IntegrationBundle/configs/inference.json\n", + "mkdir IntegrationBundle/scripts\n", + "echo \"\" > IntegrationBundle/scripts/__init__.py\n", + "\n", + "which tree && tree IntegrationBundle || true" + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "b29f053b-cf16-4ffc-bbe7-d9433fdfa872", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Overwriting 
IntegrationBundle/configs/metadata.json\n" + ] + } + ], + "source": [ + "%%writefile IntegrationBundle/configs/metadata.json\n", + "\n", + "{\n", + " \"version\": \"0.0.1\",\n", + " \"changelog\": {\n", + " \"0.0.1\": \"Initial version\"\n", + " },\n", + " \"monai_version\": \"1.2.0\",\n", + " \"pytorch_version\": \"2.0.0\",\n", + " \"numpy_version\": \"1.23.5\",\n", + " \"optional_packages_version\": {\n", + " \"torchvision\": \"0.15.0\"\n", + " },\n", + " \"name\": \"IntegrationBundle\",\n", + " \"task\": \"Example Bundle\",\n", + " \"description\": \"This illustrates integrating non-MONAI code (CIFAR10 classification) into a bundle\",\n", + " \"authors\": \"Your Name Here\",\n", + " \"copyright\": \"Copyright (c) Your Name Here\",\n", + " \"data_source\": \"CIFAR10\",\n", + " \"data_type\": \"float32\",\n", + " \"intended_use\": \"This is suitable for demonstration only\",\n", + " \"network_data_format\": {\n", + " \"inputs\": {\n", + " \"image\": {\n", + " \"type\": \"image\",\n", + " \"format\": \"magnitude\",\n", + " \"modality\": \"natural\",\n", + " \"num_channels\": 3,\n", + " \"spatial_shape\": [32, 32],\n", + " \"dtype\": \"float32\",\n", + " \"value_range\": [-1, 1],\n", + " \"is_patch_data\": false,\n", + " \"channel_def\": {\n", + " \"0\": \"red\",\n", + " \"1\": \"green\",\n", + " \"2\": \"blue\"\n", + " }\n", + " }\n", + " },\n", + " \"outputs\": {\n", + " \"pred\": {\n", + " \"type\": \"probabilities\",\n", + " \"format\": \"classes\",\n", + " \"num_channels\": 10,\n", + " \"spatial_shape\": [10],\n", + " \"dtype\": \"float32\",\n", + " \"value_range\": [0, 1],\n", + " \"is_patch_data\": false,\n", + " \"channel_def\": {\n", + " \"0\": \"plane\",\n", + " \"1\": \"car\",\n", + " \"2\": \"bird\",\n", + " \"3\": \"cat\",\n", + " \"4\": \"deer\",\n", + " \"5\": \"dog\",\n", + " \"6\": \"frog\",\n", + " \"7\": \"horse\",\n", + " \"8\": \"ship\",\n", + " \"9\": \"truck\"\n", + " }\n", + " }\n", + " }\n", + " }\n", + "}" + ] + }, + { + "cell_type": 
"markdown", + "id": "f9eac927-052d-4632-966f-a87f06311b9b", + "metadata": {}, + "source": [ + "Note that `torchvision` was added as an optional package but will be required to run the bundle. \n", + "\n", + "## Scripts\n", + "\n", + "Taking the CIFAR10 tutorial as the \"codebase\" we're using currently, which we want to convert into a bundle, we want to copy components into `scripts` from that codebase. We'll start with the network given in the tutorial:" + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "dcdbe1ae-ea13-49cb-b5a3-3c2c78f91f2b", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing IntegrationBundle/scripts/net.py\n" + ] + } + ], + "source": [ + "%%writefile IntegrationBundle/scripts/net.py\n", + "\n", + "import torch\n", + "import torch.nn as nn\n", + "import torch.nn.functional as F\n", + "\n", + "\n", + "class Net(nn.Module):\n", + " def __init__(self):\n", + " super().__init__()\n", + " self.conv1 = nn.Conv2d(3, 6, 5)\n", + " self.pool = nn.MaxPool2d(2, 2)\n", + " self.conv2 = nn.Conv2d(6, 16, 5)\n", + " self.fc1 = nn.Linear(16 * 5 * 5, 120)\n", + " self.fc2 = nn.Linear(120, 84)\n", + " self.fc3 = nn.Linear(84, 10)\n", + "\n", + " def forward(self, x):\n", + " x = self.pool(F.relu(self.conv1(x)))\n", + " x = self.pool(F.relu(self.conv2(x)))\n", + " x = torch.flatten(x, 1)\n", + " x = F.relu(self.fc1(x))\n", + " x = F.relu(self.fc2(x))\n", + " x = self.fc3(x)\n", + " return x" + ] + }, + { + "cell_type": "markdown", + "id": "e6d11fac-ad12-4f47-a0cb-5c78263e1142", + "metadata": {}, + "source": [ + "Data transforms and data loaders are provided using definitions from `torchvision`. If we assume that these aren't easily converted into MONAI types, we instead need a function to return data loaders which will be used in config files. 
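As an aside on the network above, the `16 * 5 * 5` input size of `fc1` follows from the conv/pool shape arithmetic. A quick check for 32x32 CIFAR10 inputs:

```python
# Spatial-size arithmetic for Net: two 5x5 stride-1 unpadded convolutions,
# each followed by 2x2 max pooling.
def conv_out(size, kernel):
    """Output size of a stride-1, no-padding convolution."""
    return size - kernel + 1

def pool_out(size, kernel=2):
    """Output size of non-overlapping kernel x kernel max pooling."""
    return size // kernel

size = 32
size = pool_out(conv_out(size, 5))  # conv1 (5x5) + pool -> 14
size = pool_out(conv_out(size, 5))  # conv2 (5x5) + pool -> 5
print(size, 16 * size * size)       # 5 400
```

So the flatten step produces 16 channels of 5x5 features, i.e. 400 values per image, matching `nn.Linear(16 * 5 * 5, 120)`.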
We could adapt the existing code by simply copying it into a function returning these definitions for use in the bundle:" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "189d71c5-6556-4891-a382-0adbc8f80d30", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing IntegrationBundle/scripts/transforms.py\n" + ] + } + ], + "source": [ + "%%writefile IntegrationBundle/scripts/transforms.py\n", + "\n", + "import torchvision.transforms as transforms\n", + "\n", + "transform = transforms.Compose(\n", + " [transforms.ToTensor(),\n", + " transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])\n" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "3d8f233e-495c-450c-a445-46d295ba7461", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing IntegrationBundle/scripts/dataloaders.py\n" + ] + } + ], + "source": [ + "%%writefile IntegrationBundle/scripts/dataloaders.py\n", + "\n", + "import torch\n", + "import torchvision\n", + "\n", + "batch_size = 4\n", + "\n", + "def get_dataloader(is_training, transform):\n", + " \n", + " if is_training:\n", + " trainset = torchvision.datasets.CIFAR10(root='./data', train=True,\n", + " download=True, transform=transform)\n", + " trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,\n", + " shuffle=True, num_workers=2)\n", + " return trainloader\n", + " else:\n", + " testset = torchvision.datasets.CIFAR10(root='./data', train=False,\n", + " download=True, transform=transform)\n", + " testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,\n", + " shuffle=False, num_workers=2)\n", + " return testloader " + ] + }, + { + "cell_type": "markdown", + "id": "317e2abf-673d-4a84-9afb-187bf01da278", + "metadata": {}, + "source": [ + "The training process in the tutorial is just a loop going through the dataset twice. 
The simplest adaptation for this is to wrap it in a function taking only the network and dataloader as arguments:" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "1a836b1b-06da-4866-82a2-47d1efed5d7c", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing IntegrationBundle/scripts/train.py\n" + ] + } + ], + "source": [ + "%%writefile IntegrationBundle/scripts/train.py\n", + "\n", + "import torch.nn as nn\n", + "import torch.optim as optim\n", + "\n", + "\n", + "def train(net,trainloader):\n", + " criterion = nn.CrossEntropyLoss()\n", + " optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)\n", + "\n", + " for epoch in range(2): \n", + "\n", + " running_loss = 0.0\n", + " for i, data in enumerate(trainloader, 0):\n", + " inputs, labels = data\n", + "\n", + " optimizer.zero_grad()\n", + "\n", + " outputs = net(inputs)\n", + " loss = criterion(outputs, labels)\n", + " loss.backward()\n", + " optimizer.step()\n", + "\n", + " running_loss += loss.item()\n", + " if i % 2000 == 1999: \n", + " print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / 2000:.3f}')\n", + " running_loss = 0.0\n", + "\n", + " print('Finished Training')\n" + ] + }, + { + "cell_type": "markdown", + "id": "3baf799c-8f3d-4a84-aa0d-6acbe1a0d96b", + "metadata": {}, + "source": [ + "This function will hard code all sorts of parameters like loss function, learning rate, epoch count, etc. For this example it will work but of course if you're adapting other code it would make sense to include more parameterisation to your wrapper components. 
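For instance, even the reporting cadence is fixed: the loss is accumulated and its average printed every 2000 iterations. A tiny sketch of that bookkeeping, with the hard-coded interval turned into a parameter:

```python
# Which 0-based iteration indices trigger the `if i % 2000 == 1999:` print in
# the train function above, with the interval made an argument.
def report_points(num_iters, report_every=2000):
    return [i for i in range(num_iters) if i % report_every == report_every - 1]

print(report_points(6000))   # [1999, 3999, 5999]
print(report_points(10, 4))  # [3, 7]
```

The same pattern (default arguments for anything currently hard-coded) applies to the loss function, learning rate, and epoch count.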
\n", + "\n", + "## Training\n", + "\n", + "We can now define a training config file:" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "0b9764a8-674c-42ae-ad4b-f2dea027bdbf", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing IntegrationBundle/configs/train.yaml\n" + ] + } + ], + "source": [ + "%%writefile IntegrationBundle/configs/train.yaml\n", + "\n", + "imports:\n", + "- $import torch\n", + "- $import scripts\n", + "- $import scripts.net\n", + "- $import scripts.train\n", + "- $import scripts.transforms\n", + "- $import scripts.dataloaders\n", + "\n", + "net:\n", + " _target_: scripts.net.Net\n", + "\n", + "transforms: '$scripts.transforms.transform'\n", + "\n", + "dataloader: '$scripts.dataloaders.get_dataloader(True, @transforms)'\n", + "\n", + "train:\n", + "- $scripts.train.train(@net, @dataloader)\n", + "- $torch.save(@net.state_dict(), './cifar_net.pth')\n" + ] + }, + { + "cell_type": "markdown", + "id": "e6c88aea-8182-44f1-853c-7d728bdae45b", + "metadata": {}, + "source": [ + "The key concept demonstrated here is how to refer to definitions in the `scripts` directory within a config file and tie them together into a program. These definitions can be existing types or wrapper functions around existing code to make them easier to refer to here. 
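Because the config refers to `scripts.*`, the bundle root must be importable; the bash cells in this notebook export `PYTHONPATH` for this, and the equivalent inside a Python session is a `sys.path` entry (the directory name here matches this tutorial's bundle):

```python
# Make the bundle's `scripts` package importable without the PYTHONPATH export.
import sys

bundle_root = "./IntegrationBundle"
if bundle_root not in sys.path:
    sys.path.insert(0, bundle_root)  # now `import scripts.net` can be resolved
print(bundle_root in sys.path)  # True
```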
A lot of good practice is ignored here but it shows how to adapt code into a bundle with minimal changes.\n", + "\n", + "Let's train something simple with this setup:" + ] + }, + { + "cell_type": "code", + "execution_count": 1, + "id": "65149911-3771-4a49-ade6-378305a4b946", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "2023-09-11 17:28:16,125 - INFO - --- input summary of monai.bundle.scripts.run ---\n", + "2023-09-11 17:28:16,125 - INFO - > run_id: 'train'\n", + "2023-09-11 17:28:16,125 - INFO - > meta_file: './IntegrationBundle/configs/metadata.json'\n", + "2023-09-11 17:28:16,125 - INFO - > config_file: './IntegrationBundle/configs/train.yaml'\n", + "2023-09-11 17:28:16,125 - INFO - > bundle_root: './IntegrationBundle'\n", + "2023-09-11 17:28:16,125 - INFO - ---\n", + "\n", + "\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "Default logging file in 'configs/logging.conf' does not exist, skipping logging.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz\n" + ] + }, + { + "name": "stderr", + "output_type": "stream", + "text": [ + "100%|██████████| 170498071/170498071 [00:56<00:00, 3010200.83it/s]\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Extracting ./data/cifar-10-python.tar.gz to ./data\n", + "[1, 2000] loss: 2.162\n", + "[1, 4000] loss: 1.888\n", + "[1, 6000] loss: 1.688\n", + "[1, 8000] loss: 1.580\n", + "[1, 10000] loss: 1.487\n", + "[1, 12000] loss: 1.446\n", + "[2, 2000] loss: 1.402\n", + "[2, 4000] loss: 1.392\n", + "[2, 6000] loss: 1.339\n", + "[2, 8000] loss: 1.317\n", + "[2, 10000] loss: 1.276\n", + "[2, 12000] loss: 1.275\n", + "Finished Training\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "BUNDLE=\"./IntegrationBundle\"\n", + "\n", + "export PYTHONPATH=$BUNDLE\n", + "\n", + "python 
-m monai.bundle run train \\\n", + " --bundle_root \"$BUNDLE\" \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"$BUNDLE/configs/train.yaml\" " + ] + }, + { + "cell_type": "markdown", + "id": "1c27ba04-3271-4119-a57a-698aa7a83409", + "metadata": {}, + "source": [ + "## Testing \n", + "\n", + "The second part of the tutorial script is testing the network with the test data which can again be put into a simple routine called from a config file: " + ] + }, + { + "cell_type": "code", + "execution_count": 2, + "id": "fc35814e-625d-4871-ac1c-200a0cc562d9", + "metadata": { + "tags": [] + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing IntegrationBundle/scripts/test.py\n" + ] + } + ], + "source": [ + "%%writefile IntegrationBundle/scripts/test.py\n", + "\n", + "import torch\n", + "\n", + "\n", + "def test(net, testloader):\n", + " correct = 0\n", + " total = 0\n", + " \n", + " with torch.no_grad():\n", + " for data in testloader:\n", + " images, labels = data\n", + " outputs = net(images)\n", + " _, predicted = torch.max(outputs.data, 1)\n", + " total += labels.size(0)\n", + " correct += (predicted == labels).sum().item()\n", + "\n", + " print(f'Accuracy of the network on the 10000 test images: {100 * correct // total} %')\n", + " " + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "id": "fb49aef2-9fb5-4e74-83d2-9da935e07648", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "Writing IntegrationBundle/configs/test.yaml\n" + ] + } + ], + "source": [ + "%%writefile IntegrationBundle/configs/test.yaml\n", + "\n", + "imports:\n", + "- $import torch\n", + "- $import scripts\n", + "- $import scripts.test\n", + "- $import scripts.transforms\n", + "- $import scripts.dataloaders\n", + "\n", + "net:\n", + " _target_: scripts.net.Net\n", + "\n", + "transforms: '$scripts.transforms.transform'\n", + "\n", + "dataloader: 
'$scripts.dataloaders.get_dataloader(False, @transforms)'\n",
+ "\n",
+ "test:\n",
+ "- $@net.load_state_dict(torch.load('./cifar_net.pth'))\n",
+ "- $scripts.test.test(@net, @dataloader)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 4,
+ "id": "ab171286-045c-4067-a2ea-be359168869d",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "2023-09-11 17:31:17,644 - INFO - --- input summary of monai.bundle.scripts.run ---\n",
+ "2023-09-11 17:31:17,644 - INFO - > run_id: 'test'\n",
+ "2023-09-11 17:31:17,644 - INFO - > meta_file: './IntegrationBundle/configs/metadata.json'\n",
+ "2023-09-11 17:31:17,644 - INFO - > config_file: './IntegrationBundle/configs/test.yaml'\n",
+ "2023-09-11 17:31:17,644 - INFO - > bundle_root: './IntegrationBundle'\n",
+ "2023-09-11 17:31:17,644 - INFO - ---\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Default logging file in 'configs/logging.conf' does not exist, skipping logging.\n"
+ ]
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Files already downloaded and verified\n",
+ "Accuracy of the network on the 10000 test images: 54 %\n"
+ ]
+ }
+ ],
+ "source": [
+ "%%bash\n",
+ "\n",
+ "BUNDLE=\"./IntegrationBundle\"\n",
+ "\n",
+ "export PYTHONPATH=$BUNDLE\n",
+ "\n",
+ "python -m monai.bundle run test \\\n",
+ " --bundle_root \"$BUNDLE\" \\\n",
+ " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n",
+ " --config_file \"$BUNDLE/configs/test.yaml\" "
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "4f218b72-734b-4b6e-93e5-990b8c647e8a",
+ "metadata": {},
+ "source": [
+ "## Inference\n",
+ "\n",
+ "The original script lacked a section on inference with the network, but this is straightforward to add with a short script and config file:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "id": "1f510a23-aa3a-4e34-81e2-b4c719d87939",
+ "metadata": {
+ "tags": 
[]
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Overwriting IntegrationBundle/scripts/inference.py\n"
+ ]
+ }
+ ],
+ "source": [
+ "%%writefile IntegrationBundle/scripts/inference.py\n",
+ "\n",
+ "import torch\n",
+ "from PIL import Image\n",
+ "\n",
+ "\n",
+ "def inference(net, transforms, filenames):\n",
+ " for fn in filenames:\n",
+ " with Image.open(fn) as im:\n",
+ " tim = transforms(im)\n",
+ " outputs = net(tim[None])\n",
+ " _, predictions = torch.max(outputs, 1)\n",
+ " print(fn, predictions[0].item())"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "id": "7f1251be-f0dd-4cbf-8903-3f3769c8049c",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Overwriting IntegrationBundle/configs/inference.yaml\n"
+ ]
+ }
+ ],
+ "source": [
+ "%%writefile IntegrationBundle/configs/inference.yaml\n",
+ "\n",
+ "imports:\n",
+ "- $import glob\n",
+ "- $import torch\n",
+ "- $import scripts\n",
+ "- $import scripts.inference\n",
+ "- $import scripts.transforms\n",
+ "\n",
+ "ckpt_path: './cifar_net.pth'\n",
+ "\n",
+ "input_dir: 'test_cifar10'\n",
+ "input_files: '$sorted(glob.glob(@input_dir+''/*.*''))'\n",
+ "\n",
+ "net:\n",
+ " _target_: scripts.net.Net\n",
+ "\n",
+ "transforms: '$scripts.transforms.transform'\n",
+ "\n",
+ "inference:\n",
+ "- $@net.load_state_dict(torch.load(@ckpt_path))\n",
+ "- $scripts.inference.inference(@net, @transforms, @input_files)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "e14c3ea9-5d0f-4c62-9cfe-c3c02c7fe6e1",
+ "metadata": {},
+ "source": [
+ "Here we'll create a test set of image files to load and predict on:"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 23,
+ "id": "cc2f063b-43f4-403e-b963-cf42b7e08637",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "test_cifar10/img00.png Label: 3\n",
+ "test_cifar10/img01.png Label: 8\n",
+ 
"test_cifar10/img02.png Label: 8\n",
+ "test_cifar10/img03.png Label: 0\n",
+ "test_cifar10/img04.png Label: 6\n",
+ "test_cifar10/img05.png Label: 6\n",
+ "test_cifar10/img06.png Label: 1\n",
+ "test_cifar10/img07.png Label: 6\n",
+ "test_cifar10/img08.png Label: 3\n",
+ "test_cifar10/img09.png Label: 1\n",
+ "test_cifar10/img10.png Label: 0\n",
+ "test_cifar10/img11.png Label: 9\n",
+ "test_cifar10/img12.png Label: 5\n",
+ "test_cifar10/img13.png Label: 7\n",
+ "test_cifar10/img14.png Label: 9\n",
+ "test_cifar10/img15.png Label: 8\n",
+ "test_cifar10/img16.png Label: 5\n",
+ "test_cifar10/img17.png Label: 7\n",
+ "test_cifar10/img18.png Label: 8\n",
+ "test_cifar10/img19.png Label: 6\n"
+ ]
+ }
+ ],
+ "source": [
+ "import torchvision\n",
+ "\n",
+ "root_dir = \".\" # assuming CIFAR10 was downloaded to the current directory\n",
+ "num_images = 20\n",
+ "dataset = torchvision.datasets.CIFAR10(root=f\"{root_dir}/data\", train=False)\n",
+ "\n",
+ "!mkdir -p test_cifar10\n",
+ "\n",
+ "for i in range(num_images):\n",
+ " pil, label = dataset[i]\n",
+ " filename = f\"test_cifar10/img{i:02}.png\"\n",
+ " print(filename, \"Label:\", label)\n",
+ " pil.save(filename)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "id": "28d1230e-1d3a-4929-a266-e5f763dfde7f",
+ "metadata": {
+ "tags": []
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "2023-09-11 17:54:11,793 - INFO - --- input summary of monai.bundle.scripts.run ---\n",
+ "2023-09-11 17:54:11,793 - INFO - > run_id: 'inference'\n",
+ "2023-09-11 17:54:11,793 - INFO - > meta_file: './IntegrationBundle/configs/metadata.json'\n",
+ "2023-09-11 17:54:11,793 - INFO - > config_file: './IntegrationBundle/configs/inference.yaml'\n",
+ "2023-09-11 17:54:11,793 - INFO - > bundle_root: './IntegrationBundle'\n",
+ "2023-09-11 17:54:11,793 - INFO - ---\n",
+ "\n",
+ "\n"
+ ]
+ },
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "Default logging file in 'configs/logging.conf' does not exist, 
skipping logging.\n" + ] + }, + { + "name": "stdout", + "output_type": "stream", + "text": [ + "test_cifar10/img00.png 3\n", + "test_cifar10/img01.png 8\n", + "test_cifar10/img02.png 8\n", + "test_cifar10/img03.png 0\n", + "test_cifar10/img04.png 6\n", + "test_cifar10/img05.png 6\n", + "test_cifar10/img06.png 1\n", + "test_cifar10/img07.png 4\n", + "test_cifar10/img08.png 3\n", + "test_cifar10/img09.png 1\n", + "test_cifar10/img10.png 0\n", + "test_cifar10/img11.png 9\n", + "test_cifar10/img12.png 6\n", + "test_cifar10/img13.png 7\n", + "test_cifar10/img14.png 9\n", + "test_cifar10/img15.png 1\n", + "test_cifar10/img16.png 5\n", + "test_cifar10/img17.png 3\n", + "test_cifar10/img18.png 8\n", + "test_cifar10/img19.png 4\n" + ] + } + ], + "source": [ + "%%bash\n", + "\n", + "BUNDLE=\"./IntegrationBundle\"\n", + "\n", + "export PYTHONPATH=$BUNDLE\n", + "\n", + "python -m monai.bundle run inference \\\n", + " --bundle_root \"$BUNDLE\" \\\n", + " --meta_file \"$BUNDLE/configs/metadata.json\" \\\n", + " --config_file \"$BUNDLE/configs/inference.yaml\" " + ] + }, + { + "cell_type": "markdown", + "id": "a1a06d82-1a8a-4607-8620-474e89061027", + "metadata": {}, + "source": [ + "## Adaptation Strategies\n", + "\n", + "This notebook has demonstrated one strategy of integrating existing code into a bundle. Code from an existing project, in this case an example script, was copied into the `scripts` directory of a bundle with added functions to make definitions easily referenced in config files. What shows up in the config files is a thin adapter layer to interface between what is expected in bundles and the codebase. \n", + "\n", + "It's clear that a mixed approach, where old components are replaced with MONAI types, would also work well given the simplicity of the code here. 
Substituting the Torchvision transforms with their MONAI equivalents, using a `Trainer` class in place of the `train` function, and similarly handling testing and inference with an `Evaluator` class, would produce essentially the same results. It is up to you to decide how much rewriting into MONAI components is justified for your codebase, as opposed to adapting it in some way. \n",
+ "\n",
+ "The third approach applies when the codebase is installed as a package. If an external network and its training components are installed with `pip`, for example, perhaps no adapter code is needed at all: the config scripts can simply import the package and reference its definitions. Any adapter code that is needed in `scripts` can take the form demonstrated here, simple wrapper functions returning objects that are assigned to keys in config files through evaluated Python expressions. \n",
+ "\n",
+ "Creating a bundle compatible with other tools requires you to define specific items in the config files. For example, MONAI Label states its requirements [here](https://github.com/Project-MONAI/MONAILabel/blob/c90f42c0730554e3a05af93645ae84ccdcb5e14b/monailabel/tasks/infer/bundle.py#L33) as names that must be present in `inference.json/yaml` to work with the label server. You would have to provide `network_def`, `preprocessing`, `postprocessing`, and others. This means the code from your existing codebase would have to be divided up into these components if it isn't already, and their inputs and outputs would have to match what is expected of the MONAI types typically used for these definitions. \n",
+ "\n",
+ "How best to integrate your existing code into a bundle will be specific to your situation. 
Using config files as adapter layers is shown to work here, but by understanding how bundles are structured and what the moving pieces of a bundle \"program\" are, you can devise your own strategy.\n",
+ "\n",
+ "### Adapting Data Processing\n",
+ "\n",
+ "One common module to adapt is data processing, whether pre- or post-processing at various stages. MONAI transforms assume that NumPy arrays or PyTorch tensors, or dictionaries thereof, are the inputs and outputs to transforms. You can integrate existing transforms using `Lambda/Lambdad` to wrap a callable object within a MONAI transform rather than define your own `Transform` subclass. This does require that data have the correct type and shape. For example, if you have a function in `scripts` simply called `preprocess` which accepts a single image input as a NumPy array, it can be adapted into a transform sequence as follows:\n",
+ "\n",
+ "```yaml\n",
+ "train_transforms:\n",
+ "- _target_: LoadImage\n",
+ " image_only: true\n",
+ "- _target_: EnsureChannelFirst\n",
+ "- _target_: ToNumpy\n",
+ "- _target_: Lambda\n",
+ " func: '$scripts.preprocess'\n",
+ "- _target_: ToTensor\n",
+ "```\n",
+ "\n",
+ "Minimising conversions to and from different formats would improve performance, but otherwise this avoids complex rewriting of code to fit MONAI transforms. A preprocess function which takes multiple inputs and produces multiple outputs would be better suited to a dictionary-based transform sequence, but would also require adapter code or a `MapTransform` subclass. \n",
+ "\n",
+ "\n",
+ "## Summary and Next\n",
+ "\n",
+ "In this tutorial we have looked at how to adapt code into a MONAI bundle:\n",
+ "* Wrapping code in thin adaptation layers\n",
+ "* Using these components in config files\n",
+ "* Discussing the architectural concepts around the process of adaptation\n",
+ "\n",
+ "In future tutorials we shall delve into other details and strategies with MONAI bundles."
+ ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python [conda env:monai1]", + "language": "python", + "name": "conda-env-monai1-py" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.9.18" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +} diff --git a/bundle/README.md b/bundle/README.md index 4175ce470c..2b54ba9d0b 100644 --- a/bundle/README.md +++ b/bundle/README.md @@ -1,14 +1,33 @@ -# MONAI bundle -This folder contains the `getting started` tutorial and below code examples of training / inference for MONAI bundle. +# MONAI Bundle -### [introducing_config](./introducing_config) -A simple example to introduce the MONAI bundle config and parsing. +This directory contains the tutorials and materials for MONAI Bundles. A bundle is a self-describing network which +packages network weights, training/validation/testing scripts, Python code, and ancillary files into a defined +directory structure. Bundles can be downloaded from the model zoo and other sources using MONAI's inbuilt API. +These tutorials start with an introduction on how to construct bundles from scratch, and then go into more depth +on specific features. -### [customize component](./custom_component) -Example shows the use cases of bringing customized python components, such as transform, network, and metrics, in a configuration-based workflow. +All other bundle documentation can be found at https://docs.monai.io/en/latest/bundle_intro.html. -### [hybrid programming](./hybrid_programming) -Example shows how to parse the config files in your own python program, instantiate necessary components with python program and execute the inference. 
+Start with the tutorial notebooks on constructing bundles:

-### [python bundle workflow](./python_bundle_workflow)
-Step by step tutorial examples show how to develop a bundle training or inference workflow in Python instead of JSON / YAML configs.
+1. [Bundle Introduction](./01_bundle_intro.ipynb): create a very simple bundle from scratch.
+2. [MedNIST Classification](./02_mednist_classification.ipynb): train a network with a bundle on a real task.
+3. [MedNIST Classification With Best Practices](./03_mednist_classification_v2.ipynb): the same task again, following bundle best practices.
+4. [Integrating Existing Code](./04_integrating_code.ipynb): a discussion of how to integrate existing, possibly non-MONAI, code into a bundle.
+
+More advanced topics are covered in this directory:
+
+* [Further Features](./further_features.md): covers more advanced features and uses of configs, command-line usage, and
+programmatic use of bundles.
+
+* [introducing_config](./introducing_config): a simple example introducing the MONAI bundle config and parsing by
+implementing a standalone program.
+
+* [customize component](./custom_component): illustrates bringing customized Python components, such as transforms,
+networks, and metrics, into a configuration-based workflow.
+
+* [hybrid programming](./hybrid_programming): shows how to parse the config files in your own Python program,
+instantiate the necessary components programmatically, and execute inference.
+
+* [python bundle workflow](./python_bundle_workflow): step-by-step tutorial examples showing how to develop a bundle
+training or inference workflow in Python instead of JSON / YAML configs.
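All of these tutorials rely on the fixed bundle directory layout (`configs/metadata.json` and `models/model.pt` required, with optional `model.ts`, `model.onnx`, and docs). As a minimal sketch of what that layout implies, the helper below checks a directory for the required files using only the standard library; it is a hypothetical illustration, not part of MONAI's API (MONAI's own tooling, such as the `verify_metadata` subcommand, performs deeper schema-level checks):

```python
from pathlib import Path

# Files every bundle must provide, per the bundle specification;
# models/model.ts and models/model.onnx are optional additions.
REQUIRED_FILES = ("configs/metadata.json", "models/model.pt")


def missing_required(bundle_dir):
    """Return the required bundle files that are absent from bundle_dir."""
    root = Path(bundle_dir)
    return [rel for rel in REQUIRED_FILES if not (root / rel).is_file()]
```

A freshly scaffolded bundle can be checked with `missing_required("./ModelName")`, for instance to confirm that training has populated `models/model.pt` before the bundle is distributed.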
diff --git a/bundle/get_started.md b/bundle/further_features.md similarity index 99% rename from bundle/get_started.md rename to bundle/further_features.md index 51d034a9aa..b1c1480c9b 100644 --- a/bundle/get_started.md +++ b/bundle/further_features.md @@ -1,5 +1,5 @@ -# Get started to MONAI bundle +# Further Features of MONAI Bundles A MONAI bundle usually includes the stored weights of a model, TorchScript model, JSON files which include configs and metadata about the model, information for constructing training, inference, and post-processing transform sequences, plain-text description, legal information, and other data the model creator wishes to include. diff --git a/runner.sh b/runner.sh index 1bd0cf82e2..373cc39599 100755 --- a/runner.sh +++ b/runner.sh @@ -72,6 +72,10 @@ doesnt_contain_max_epochs=("${doesnt_contain_max_epochs[@]}" TensorRT_inference_ doesnt_contain_max_epochs=("${doesnt_contain_max_epochs[@]}" lazy_resampling_benchmark.ipynb) doesnt_contain_max_epochs=("${doesnt_contain_max_epochs[@]}" modular_patch_inferer.ipynb) doesnt_contain_max_epochs=("${doesnt_contain_max_epochs[@]}" GDS_dataset.ipynb) +doesnt_contain_max_epochs=("${doesnt_contain_max_epochs[@]}" 01_bundle_intro.ipynb) +doesnt_contain_max_epochs=("${doesnt_contain_max_epochs[@]}" 02_mednist_classification.ipynb) +doesnt_contain_max_epochs=("${doesnt_contain_max_epochs[@]}" 03_mednist_classification_v2.ipynb) +doesnt_contain_max_epochs=("${doesnt_contain_max_epochs[@]}" 04_integrating_code.ipynb) # Execution of the notebook in these folders / with the filename cannot be automated skip_run_papermill=() @@ -111,6 +115,10 @@ skip_run_papermill=("${skip_run_papermill[@]}" .*mednist_classifier_ray*) # htt skip_run_papermill=("${skip_run_papermill[@]}" .*TorchIO_MONAI_PyTorch_Lightning*) # https://github.com/Project-MONAI/tutorials/issues/1324 skip_run_papermill=("${skip_run_papermill[@]}" .*GDS_dataset*) # https://github.com/Project-MONAI/tutorials/issues/1324 
skip_run_papermill=("${skip_run_papermill[@]}" .*learn2reg_nlst_paired_lung_ct.ipynb*) # slow test +skip_run_papermill=("${skip_run_papermill[@]}" .*01_bundle_intro.ipynb*) +skip_run_papermill=("${skip_run_papermill[@]}" .*02_mednist_classification.ipynb*) +skip_run_papermill=("${skip_run_papermill[@]}" .*03_mednist_classification_v2.ipynb*) +skip_run_papermill=("${skip_run_papermill[@]}" .*04_integrating_code.ipynb*) # output formatting separator=""