Update some readmes #44

Merged (2 commits, Oct 29, 2024)
8 changes: 4 additions & 4 deletions m3/data_prepare/README.md
@@ -1,9 +1,9 @@
# MONAI-VILA: Data Preparation
# Data Preparation

Preparing the datasets for VILA training and testing requires three steps:
1. Downloading all the datasets (Information to download each dataset is provided in the readme.md for the `vqa`, `report` and `expert` directories)
1. Downloading all the datasets (Information to download each dataset is provided in the readme.md for the [vqa](./vqa/README.md), [report](./report/README.md) and [expert](./experts/README.md) directories)
2. Generating the instruction data for all datasets (Information to generate the instruction data is provided in the README.md for the `vqa`, `report` and `expert` directories)
3. Adding the prepared datasets to VILA in a data mixture (More information can be found in the [quickstart guide](../train/readme.md))
3. Adding the prepared datasets to VILA in a data mixture (More information can be found in the [quickstart guide](../train/README.md))

### VQA Datasets
- **PathVQA**: Pathology-based VQA dataset with ~4,000 images and ~32,000 QA pairs, focusing on microscopic views of human tissue.
@@ -37,7 +37,7 @@ A test set of the CheXpert dataset consisting of 500 studies from 500 patients r
| **Totals** | **>800,000** | **>427,000** | |

### Expert Model Datasets
To create datasets for training expert model selection capablities, please follow the instructions in the [expert](./experts/README.md) directory.
To create datasets for training expert model selection capabilities, please follow the instructions in the [expert](./experts/README.md) directory.

We use the following datasets and expert models for expert model selection.

2 changes: 1 addition & 1 deletion m3/data_prepare/experts/README.md
@@ -42,4 +4,4 @@ python expert_train_data_brats.py --in_meta_data ${META_DATA} --images_root ${RO
```

### 2. Prepare expert training data for TorchXRayVision
Coming soon...
For details on how to prepare training & evaluation data with a TorchXRayVision expert model ensemble, see [here](./torchxrayvision/README.md).
3 changes: 3 additions & 0 deletions m3/data_prepare/experts/torchxrayvision/README.md
@@ -1,5 +1,8 @@
### Integrating expert model data with torchxrayvision

### Prerequisites: Download data & install dependencies
Follow the download links for the CXR datasets in this [README.md](../../README.md#chest-x-ray-classification-datasets-for-model-evaluation).

For both training and inference, the packages `torchxrayvision` (for the chest X-ray models), `monai` (for writing JSON files), and `scikit-image` (for reading images) must be installed with `pip`. The steps were tested with `torchxrayvision==1.2.4`, `monai==1.3.2`, and `scikit-image==0.24.0`.
The corresponding image data are described in the `data_prepare` folder's [README](../../README.md).
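A minimal sketch of the install command, using the tested version pins listed above (newer versions may also work but are untested here):

```bash
# Install the dependencies at the versions the preparation steps were tested with
pip install torchxrayvision==1.2.4 monai==1.3.2 scikit-image==0.24.0
```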
Expand Down
50 changes: 25 additions & 25 deletions m3/train/readme.md → m3/train/README.md
@@ -31,28 +31,28 @@ Please make sure that the correct labels of datasets are used in the bash script

Below are examples of how datasets are added to `datasets_mixture.py` (append these at the end of the file):

```python
# Each entry registers an instruction dataset so it can be referenced in a data mixture.
radvqa = Dataset(
    dataset_name="radvqa",
    dataset_type="torch",
    data_path="/set/path/to/instruction/json/file",
    image_path="/set/path/to/image/folder",
)
add_dataset(radvqa)

slake = Dataset(
    dataset_name="slake",
    dataset_type="torch",
    data_path="/set/path/to/instruction/json/file",
    image_path="/set/path/to/image/folder",
)
add_dataset(slake)

pathvqa = Dataset(
    dataset_name="pathvqa",
    dataset_type="torch",
    data_path="/set/path/to/instruction/json/file",
    image_path="/set/path/to/image/folder",
)
add_dataset(pathvqa)
```