In this model we predict the risk of heart disease using a real-time dataset obtained from a medical university in Chennai.
DATASET
The dataset we used contains more than 150 patient records together with their daily-routine data, which helps our model train on real-time data. It includes features such as age, dietary habits, and alcohol consumption.
Dataset: https://archive.ics.uci.edu/ml/datasets/Heart+Disease
Data Preprocessing
We preprocessed our dataset using scikit-learn.
scikit-learn
The sklearn.preprocessing package provides several common utility functions and transformer classes to change raw feature vectors into a representation more suitable for downstream estimators. In general, learning algorithms benefit from standardization of the dataset.
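As a minimal sketch of this standardization step, the snippet below applies scikit-learn's StandardScaler to a toy feature matrix; the columns (age, cholesterol, resting blood pressure) are illustrative stand-ins, not the exact columns of the Chennai dataset:

```python
# Sketch of standardization with sklearn.preprocessing.StandardScaler.
# The three columns are hypothetical features, not the real dataset.
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy feature matrix: [age, cholesterol, resting blood pressure]
X = np.array([
    [29, 204, 120],
    [54, 239, 140],
    [63, 268, 160],
], dtype=float)

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# After scaling, each column has zero mean and unit variance.
print(X_scaled.mean(axis=0))
print(X_scaled.std(axis=0))
```

Fitting the scaler on the training split only (and reusing it to transform the test split) avoids leaking test-set statistics into training.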
Models Used:
- SVM
- Naive Bayes
- Logistic Regression
- Decision Tree
- Random Forest
- LightGBM
- XGBoost
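The scikit-learn models in the list above can be trained and compared with a loop like the one below. This is a hedged sketch: it uses a synthetic binary-classification dataset of 150 samples as a stand-in for the real patient records, and default hyperparameters rather than the exact settings used for the reported results.

```python
# Sketch: train several scikit-learn classifiers and compare
# training vs. testing accuracy. Synthetic data stands in for
# the real Chennai dataset.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=150, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

models = {
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(random_state=0),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    print(f"{name}: train={model.score(X_train, y_train):.3f}, "
          f"test={model.score(X_test, y_test):.3f}")
```

Comparing the training score against the testing score in this way is exactly how the per-model results below were obtained: a large gap between the two indicates overfitting.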
Model Accuracy Results (fraction of correct predictions):
SVM:
Support Vector Machine
~Training Set Accuracy : 0.6694214876033058
~Testing Set Accuracy : 0.5737704918032787
Naive Bayes:
A probabilistic classifier based on Bayes' theorem with a strong (naive) assumption of independence between features
~Training Set Accuracy : 0.8677685950413223
~Testing Set Accuracy : 0.7868852459016393
Logistic Regression:
Logistic regression estimates the probability of a disease based on a given dataset of independent variables
~Training Set Accuracy : 0.8636363636363636
~Testing Set Accuracy : 0.8032786885245902
Decision Tree:
A decision tree is a non-parametric supervised learning algorithm, which is utilized for both classification and regression tasks.
~Training Set Accuracy : 1.0
~Testing Set Accuracy : 0.7704918032786885
Random Forest:
An ensemble method that fits many decision trees on random subsets of the data and features, and averages their predictions
~Training Set Accuracy : 1.0
~Testing Set Accuracy : 0.7704918032786885
LightGBM:
Implements the Gradient Boosting Decision Tree (GBDT) algorithm with two novel techniques: Gradient-based One-Side Sampling (GOSS) and Exclusive Feature Bundling (EFB)
~Training Set Accuracy : 0.9958677685950413
~Testing Set Accuracy : 0.7704918032786885
XGBoost:
A gradient-boosting library that builds one decision tree per boosting iteration; its GPU implementation builds each tree one level at a time, processing the entire dataset concurrently
~Training Set Accuracy : 0.987603305785124
~Testing Set Accuracy : 0.7540983606557377
CONCLUDED MODEL FROM PREDICTION:
From the prediction results, **Logistic Regression** generalizes best: it achieves the highest testing accuracy (0.803) with only a small gap from its training accuracy, whereas the tree-based models reach near-perfect training accuracy but lower testing accuracy, indicating overfitting.