Update some readmes #44

Merged (2 commits, Oct 29, 2024)
8 changes: 4 additions & 4 deletions m3/data_prepare/README.md
@@ -1,9 +1,9 @@
# MONAI-VILA: Data Preparation
# Data Preparation

Preparing the datasets for VILA training and testing requires three steps:
1. Downloading all the datasets (Information to download each dataset is provided in the readme.md for the `vqa`, `report` and `expert` directories)
1. Downloading all the datasets (Information to download each dataset is provided in the readme.md for the [vqa](./vqa/README.md), [report](./report/README.md) and [expert](./experts/README.md) directories)
2. Generating the instruction data for all datasets (Information to generate the instruction data is provided in the README.md for the `vqa`, `report` and `expert` directories)
3. Adding the prepared datasets to VILA in a data mixture (More information can be found in the [quickstart guide](../train/readme.md))
3. Adding the prepared datasets to VILA in a data mixture (More information can be found in the [quickstart guide](../train/README.md))

### VQA Datasets
- **PathVQA**: Pathology-based VQA dataset with ~4,000 images and ~32,000 QA pairs, focusing on microscopic views of human tissue.
@@ -37,7 +37,7 @@ A test set of the CheXpert dataset consisting of 500 studies from 500 patients r
| **Totals** | **>800,000** | **>427,000** | |

### Expert Model Datasets
To create datasets for training expert model selection capablities, please follow the instructions in the [expert](./experts/README.md) directory.
To create datasets for training expert model selection capabilities, please follow the instructions in the [expert](./experts/README.md) directory.

We use the following datasets and expert models for expert model selection.

2 changes: 1 addition & 1 deletion m3/data_prepare/experts/README.md
@@ -42,4 +4,4 @@ python expert_train_data_brats.py --in_meta_data ${META_DATA} --images_root ${RO
```

### 2. Prepare expert training data for TorchXRayVision
Coming soon...
For details on how to prepare training & evaluation data with a TorchXRayVision expert model ensemble, see [here](./torchxrayvision/README.md).
3 changes: 3 additions & 0 deletions m3/data_prepare/experts/torchxrayvision/README.md
@@ -1,5 +1,8 @@
### Integrating expert model data with torchxrayvision

### Prerequisites: Download data & install dependencies
Follow the download links for the CXR datasets in this [README.md](../../README.md#chest-x-ray-classification-datasets-for-model-evaluation).

For both training and inference, the packages `torchxrayvision` (for the chest X-ray models), `monai` (for writing JSON files), and `scikit-image` (for reading images) must be installed with `pip`. The steps were tested with `torchxrayvision==1.2.4`, `monai==1.3.2`, and `scikit-image==0.24.0`.
The corresponding image data are described in the `data_prepare` folder's [README](../../README.md).
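A minimal sketch of the install command, using the tested version pins listed above (newer versions may also work but are untested here):

```bash
# Install the dependencies at the versions the preparation steps were tested with
pip install torchxrayvision==1.2.4 monai==1.3.2 scikit-image==0.24.0
```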
Expand Down
50 changes: 25 additions & 25 deletions m3/train/readme.md → m3/train/README.md
@@ -31,28 +31,28 @@ Please make sure that the correct labels of datasets are used in the bash script

Below are examples of how datasets are added to `datasets_mixture.py` (append these at the end of the file):

```python
# Each entry registers an instruction dataset so it can be referenced in a data mixture.
radvqa = Dataset(
    dataset_name="radvqa",
    dataset_type="torch",
    data_path="/set/path/to/instruction/json/file",
    image_path="/set/path/to/image/folder",
)
add_dataset(radvqa)

slake = Dataset(
    dataset_name="slake",
    dataset_type="torch",
    data_path="/set/path/to/instruction/json/file",
    image_path="/set/path/to/image/folder",
)
add_dataset(slake)

pathvqa = Dataset(
    dataset_name="pathvqa",
    dataset_type="torch",
    data_path="/set/path/to/instruction/json/file",
    image_path="/set/path/to/image/folder",
)
add_dataset(pathvqa)
```