Medical Image Classification - Disease Detection

Packages Used:

PyTorch
numpy, pandas
torchvision
PIL
sklearn
torchsummary

Reproduction

We combine seven models into one ensemble model. They differ mainly in their validation data and in the dropout and linear_drop hyperparameters. All models other than model_4 and model_5 are validated on the first 1/10 of the data. The parameters of all models are listed below.

To reproduce the training results, the scripts need a small modification, because the partition into training and validation sets is determined inside the Python scripts. To adjust the partition, modify the numbers in genLabels_Partition in load.py. Alternatively, run shuffle.py to reshuffle the data, which changes which samples end up in the validation set.
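The idea of validating on a fractional slice of the data can be sketched as follows. This is only an illustration: split_by_fraction is a hypothetical helper, not the actual genLabels_Partition in load.py, whose signature is not shown here.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_by_fraction(
    samples: Sequence[T], val_start: float, val_end: float
) -> Tuple[List[T], List[T]]:
    """Return (train, validation).

    The validation set is the slice [val_start, val_end) of the data,
    with both bounds given as fractions of the dataset length.
    Hypothetical helper, assumed for illustration only.
    """
    if not 0.0 <= val_start < val_end <= 1.0:
        raise ValueError("expected 0 <= val_start < val_end <= 1")
    n = len(samples)
    lo, hi = round(n * val_start), round(n * val_end)
    validation = list(samples[lo:hi])
    train = list(samples[:lo]) + list(samples[hi:])
    return train, validation


# 20 dummy sample ids stand in for the rows of label_only.csv.
ids = list(range(20))
train, val = split_by_fraction(ids, 0.0, 0.1)    # first 1/10 as validation
train4, val4 = split_by_fraction(ids, 0.2, 0.3)  # the range noted for model_4
```

Because the slice is taken by position, the order of the rows decides which samples are held out, which is why shuffling the input first matters.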

The parameters of all seven models:

model_1: dropout = 0.4, linear_drop = 0.2
model_2: dropout = 0.2, linear_drop = 0.2
model_3: dropout = 0.5, linear_drop = 0.2
model_4: dropout = 0.5, linear_drop = 0.2 # validation data => 0.2-0.3
model_5: dropout = 0.5, linear_drop = 0.2 # validation data => 0.3-0.4
model_6: dropout = 0.5, linear_drop = 0.4
model_7: dropout = 0, linear_drop = 0

Run the Code

Preprocessing

Before training, run shuffle.py so that the training and validation sets have similar data distributions. (This prevents similar samples from being grouped together in the validation set.) The command is:
python3 shuffle.py [input_name] (e.g. python3 shuffle.py ./train.csv)
The output file, label_only.csv, is the input file for training.
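As a rough sketch of what such a shuffling step does, the function below reorders the data rows of a CSV file. It is not the actual shuffle.py: the function name, the fixed seed, and the assumption that the first row is a header are all illustrative.

```python
import csv
import random
from typing import List


def shuffle_csv_rows(input_path: str, output_path: str, seed: int = 0) -> int:
    """Shuffle the data rows of a CSV file, keeping the first row in place.

    Assumes (for illustration) that the first row is a header.
    Returns the number of data rows written.
    """
    with open(input_path, newline="") as f:
        rows: List[List[str]] = list(csv.reader(f))
    if not rows:
        raise ValueError("input CSV is empty")
    header, data = rows[0], rows[1:]
    # A seeded generator makes the shuffled order repeatable across runs.
    random.Random(seed).shuffle(data)
    with open(output_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(data)
    return len(data)
```

Calling shuffle_csv_rows("./train.csv", "./label_only.csv") would mirror the command above under these assumptions. Fixing the seed keeps the partition stable between runs, which helps when comparing models.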

Run Training:

python3 train_600.py [input_file] [root_dir] [dropout] [linear_drop]

input_file is the file generated by shuffle.py and should be label_only.csv
root_dir is the directory where the images are stored
dropout and linear_drop are hyperparameters (dropout: the dropout rate of the dense blocks; linear_drop: the dropout rate of the linear classifier)
Example input: python3 train_600.py ./label_only.csv ./images 0.5 0.2
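To train all seven models with the parameters listed under Reproduction, the command lines can be generated from a small table. The sketch below only builds the command strings. MODEL_PARAMS and training_command are illustrative names, and the file paths are the ones from the example above.

```python
import shlex
from typing import Dict

# Hyperparameters copied from the model list in this README.
MODEL_PARAMS: Dict[str, Dict[str, float]] = {
    "model_1": {"dropout": 0.4, "linear_drop": 0.2},
    "model_2": {"dropout": 0.2, "linear_drop": 0.2},
    "model_3": {"dropout": 0.5, "linear_drop": 0.2},
    "model_4": {"dropout": 0.5, "linear_drop": 0.2},
    "model_5": {"dropout": 0.5, "linear_drop": 0.2},
    "model_6": {"dropout": 0.5, "linear_drop": 0.4},
    "model_7": {"dropout": 0.0, "linear_drop": 0.0},
}


def training_command(
    input_file: str, root_dir: str, dropout: float, linear_drop: float
) -> str:
    """Build the train_600.py command line for one model."""
    args = [
        "python3", "train_600.py", input_file, root_dir,
        format(dropout, "g"), format(linear_drop, "g"),
    ]
    # shlex.quote protects paths that contain spaces or shell characters.
    return " ".join(shlex.quote(a) for a in args)


commands = {
    name: training_command("./label_only.csv", "./images", **params)
    for name, params in MODEL_PARAMS.items()
}
for name, cmd in commands.items():
    print(f"{name}: {cmd}")
```

Note that model_3, model_4 and model_5 produce identical commands. They differ only in the validation range, which is set in load.py rather than on the command line.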

After training, the models whose validation score exceeds a certain threshold are saved in the ./result/ directory. These models are used later for testing.

Run Testing:

One Model

python3 evaluate_600.py [input_file] [model_file] [root_dir_of_image] [output_file]
(e.g. python3 evaluate_600.py ./test.csv ./train_best.model ./images ./result.csv)

The output is the testing result of a single model.

Ensemble

bash final_test.sh [input_file] [output_file] [root_dir_of_image]
(e.g. bash final_test.sh ./test.csv ./result.csv ./ntu_final/images)

The models in final/result are also required for testing (their paths are written in ./result/model_name).
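One common way to combine several models is to average their per-class scores for each image. The sketch below shows that technique in plain Python. It is an assumption about how an ensemble could work, not a description of what ensemble.py actually does, and average_predictions is a hypothetical name.

```python
from typing import List, Sequence


def average_predictions(
    per_model: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Average per-class scores over models.

    per_model[m][i][c] is model m's score for class c on image i.
    Every model must score the same images and the same classes.
    """
    if not per_model:
        raise ValueError("need at least one model's predictions")
    n_images = len(per_model[0])
    n_classes = len(per_model[0][0]) if n_images else 0
    for preds in per_model:
        if len(preds) != n_images or any(len(row) != n_classes for row in preds):
            raise ValueError("all models must predict the same images and classes")
    n_models = len(per_model)
    return [
        [sum(preds[i][c] for preds in per_model) / n_models for c in range(n_classes)]
        for i in range(n_images)
    ]


# Two made-up models, two images, two classes.
model_a = [[0.9, 0.1], [0.2, 0.6]]
model_b = [[0.7, 0.3], [0.4, 0.8]]
averaged = average_predictions([model_a, model_b])
```

Averaging tends to smooth out the errors of individual models, which is one reason to train models with different dropout rates and validation splits.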

timchen0618/NTU_FINAL_DEEPQ

Folders and files

Latest commit

History

Repository files navigation

Medical Image Classification - Disease Detection

Package Used：

Reproduction

Run the Code

Preprocessing

Run Training：

Run Testing:

One Model

Ensemble

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages