ML_Project

Description

More than 500 million people worldwide suffer from type 2 diabetes, and even more live at an increased risk of developing the disease. Type 2 diabetes is the most common type of diabetes and can lead to heart disease, strokes and other severe conditions. Early diagnosis and risk assessment are therefore key to preventing serious damage. Nevertheless, people with this type of diabetes often live for many years without being diagnosed, even though specific factors exist that could indicate that a patient is at high risk. Machine learning techniques can help predict type 2 diabetes from such risk factors and hence lead to earlier and better treatment.

Accordingly, the aim of this work is to investigate the performance of different machine learning models for predicting type 2 diabetes from given risk factors. Since missing a diagnosis of type 2 diabetes in its early stages can have life-threatening consequences, our secondary goal is to minimize the number of false negatives produced by our models.
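As an illustration of the secondary goal, the sketch below counts false negatives and computes recall (the share of true diabetes cases that a model detects). It is a minimal example and not code from the notebooks: the function names `confusion_counts` and `recall` and the labels are made up for illustration, with 1 assumed to mean "diabetes" and 0 "no diabetes".

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive and pred == positive:
            tp += 1
        elif truth == positive:
            fn += 1  # a positive case the model missed (false negative)
        elif pred == positive:
            fp += 1
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}


def recall(counts):
    """Recall = TP / (TP + FN); fewer false negatives means higher recall."""
    positives = counts["tp"] + counts["fn"]
    return counts["tp"] / positives if positives else 0.0


# Made-up labels for illustration only (assumed: 1 = diabetes, 0 = no diabetes).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
print(counts, recall(counts))
```

Minimizing false negatives corresponds to maximizing recall, usually at the cost of more false positives, so recall is best read alongside precision or accuracy.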

Installation

conda create --name my_env
conda activate my_env
conda install pip
pip install -r requirements.txt

Register the conda environment as a Jupyter Notebook kernel

conda install -c anaconda ipykernel
python -m ipykernel install --user --name=my_env

Instructions

Please refer to the notebooks listed below. Each notebook shows the final result as well as intermediate results.

1. Preprocessing

EDA_and_Preprocessing.ipynb

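2. Models

Logistic Regression.ipynb
SVM.ipynb
Tree Classifier - Random Forest.ipynb
XGBoost.ipynb
neural_network.ipynb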

3. Fine-tuning

Fine_tuning.ipynb

4. Evaluation

Final evaluation.ipynb

Other important files

Dataset

Raw dataset
Unbalanced dataset and oversampled dataset after preprocessing
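
The difference between the unbalanced and the oversampled dataset can be sketched as follows. This README does not say which oversampling method was used, so the example assumes plain random oversampling (duplicating minority-class rows); the function `random_oversample` and the toy data are hypothetical and not taken from the notebooks.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of smaller classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Every class is grown to the size of the largest class.
    target = max(len(group) for group in by_class.values())

    out_rows, out_labels = [], []
    for label, group in by_class.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for row in group + extra:
            out_rows.append(row)
            out_labels.append(label)
    return out_rows, out_labels


# Toy data for illustration only: 8 negative and 2 positive samples.
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2

new_rows, new_labels = random_oversample(rows, labels)
print(Counter(labels), Counter(new_labels))
```

Oversampling is normally applied to the training split only, so that duplicated samples do not leak into the data used for evaluation.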

Models saved after hyperparameter search

We saved our models after grid search and randomized search. They can be found as .pkl files in the data directory.
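
A minimal sketch for loading the saved models is shown below. It assumes the files were written with Python's `pickle` module and sit directly in the data directory; the helper `load_models` is hypothetical and not part of the project. The libraries a model was trained with need to be installed for unpickling to work, and pickle files should only be loaded from trusted sources because unpickling can execute arbitrary code.

```python
import pickle
from pathlib import Path


def load_models(directory="data"):
    """Load every .pkl file in `directory`, keyed by file name without extension."""
    root = Path(directory)
    if not root.is_dir():
        return {}  # nothing to load if the directory is missing

    models = {}
    for path in sorted(root.glob("*.pkl")):
        with path.open("rb") as handle:
            models[path.stem] = pickle.load(handle)
    return models


if __name__ == "__main__":
    # Print the name and type of each saved object found in ./data.
    for name, model in load_models("data").items():
        print(name, type(model).__name__)
```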

Related materials

Please refer to the sources directory for the sources used in our evaluation as well as our project proposal.


kimlindner/Type2DiabetesDetection_ML_Project
