
Performed analysis of the Heart Disease Diagnostic database with the help of data visualization libraries such as Seaborn and Matplotlib to draw meaningful insights.


pratyusha-garaye/Heart-Disease-Diagnostic-Analysis


Heart Disease Diagnostic Analysis

Software used

  • Python with ML Libraries (Pandas, Seaborn, Matplotlib)
  • VS Code development environment

Features in the dataset

There are 13 features in the dataset:

  • age: The person's age in years
  • sex: The person's sex (Male=1, Female=0)
  • cp: Chest pain experienced (typical angina=0, atypical angina=1, non-anginal=2, asymptomatic=3)
  • trestbps: The person's resting blood pressure (mm Hg)
  • chol: The person's cholesterol level (mg/dl)
  • fbs: Whether the person's fasting blood sugar is >120 mg/dl (true=1, false=0)
  • restecg: Resting electrocardiographic measurement (normal=0, ST-T wave abnormality=1, showing probable/definite left ventricular hypertrophy=2)
  • thalach: The person's maximum heart rate achieved
  • exang: Exercise-induced angina
  • oldpeak: ST depression induced by exercise relative to rest
  • slope: The slope of the peak exercise ST segment
  • ca: Number of major vessels colored by fluoroscopy
  • thal: 0 = normal; 1 = fixed defect; 2 = reversible defect

Description

  • Loaded the diagnosis data using Pandas and performed EDA.
[Figure: Headers]
  • Added columns that map the numerically coded categorical variables to descriptive string values.
  • Used the Seaborn and Matplotlib libraries to visualize the relationships between various attributes in the dataset.

[Figures: BP vs. age; Age distribution; Age-wise sex distribution]
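The step of adding string-valued columns for the coded categorical variables could look roughly like the sketch below. The mappings come from the feature list above; the helper name `add_label_columns`, the `*_label` column names, and the sample rows are illustrative assumptions, not the repository's actual code.

```python
import pandas as pd

# Code-to-label mappings taken from the feature list in this README.
SEX_LABELS = {1: "Male", 0: "Female"}
CP_LABELS = {
    0: "typical angina",
    1: "atypical angina",
    2: "non-anginal",
    3: "asymptomatic",
}
THAL_LABELS = {0: "normal", 1: "fixed defect", 2: "reversible defect"}


def add_label_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with string-labelled companions for coded columns."""
    out = df.copy()
    out["sex_label"] = out["sex"].map(SEX_LABELS)
    out["cp_label"] = out["cp"].map(CP_LABELS)
    out["thal_label"] = out["thal"].map(THAL_LABELS)
    return out


# Made-up rows for illustration only. The real data would be loaded with
# something like pd.read_csv("heart.csv"); that file name is an assumption.
sample = pd.DataFrame(
    {
        "age": [52, 61, 45],
        "sex": [1, 0, 1],
        "cp": [0, 2, 3],
        "thal": [2, 0, 1],
    }
)
labelled = add_label_columns(sample)
print(labelled[["sex_label", "cp_label", "thal_label"]])
```

Keeping the original numeric columns next to the label columns means the numeric codes remain available for correlation analysis, while the labels make plot legends and axis ticks readable.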

  • Plotted a heatmap of the correlation coefficients between variables to understand how each attribute relates to whether the disease is present or absent.

[Figure: Heatmap of correlations]
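A correlation heatmap of this kind can be produced with a few lines of Seaborn. The following is a minimal sketch rather than the repository's code: the function name `plot_correlation_heatmap`, the diagnosis column name `target`, and the demo rows are assumptions made for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


def plot_correlation_heatmap(df: pd.DataFrame):
    """Draw an annotated heatmap of pairwise Pearson correlations."""
    # Restrict to numeric columns so that string label columns are ignored.
    corr = df.select_dtypes("number").corr()
    fig, ax = plt.subplots(figsize=(10, 8))
    sns.heatmap(
        corr, annot=True, fmt=".2f", cmap="coolwarm", vmin=-1, vmax=1, ax=ax
    )
    ax.set_title("Correlation between features")
    fig.tight_layout()
    return corr, ax


# Made-up rows for illustration only; "target" is an assumed name for the
# diagnosis column (disease present or absent).
demo = pd.DataFrame(
    {
        "age": [63, 37, 41, 56, 57],
        "thalach": [150, 187, 172, 178, 163],
        "oldpeak": [2.3, 3.5, 1.4, 0.8, 0.6],
        "target": [1, 1, 1, 1, 0],
    }
)
corr, ax = plot_correlation_heatmap(demo)
# Call plt.show() or fig.savefig(...) to display or store the figure.
```

Reading along the row (or column) of the diagnosis variable shows which attributes move with it; note that a correlation coefficient describes association, not cause.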

Conclusion

In the given dataset, female patients were more likely to have the disease, and the disease was most common among patients aged 29 to 54.

The following factors were most strongly associated with a positive diagnosis:

  • Exercise-induced angina is negative: 69.61% of the patients in this category have the disease, and the factor is associated with a 3 times higher likelihood of the disease.
  • Chest pain type is 1: 89% of the patients in this category have the disease, and the factor is associated with a 2.32 times higher likelihood of the disease.
  • Number of major blood vessels is 0: 74.29% of the patients in this category have the disease, and the factor is associated with a 2.13 times higher likelihood of the disease.
  • Slope of the peak exercise ST segment is 2: 75.35% of the patients in this category have the disease, and the factor is associated with a 2.09 times higher likelihood of the disease.
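Per-category percentages like the ones above can be obtained by grouping on a feature and averaging a 0/1 diagnosis column. The sketch below assumes the diagnosis column is named `target` and coded 1 for disease present; both are assumptions, as are the helper name and the made-up rows. It does not reproduce the "times more likely" figures, because the README does not state how they were computed.

```python
import pandas as pd


def disease_rate_by_category(
    df: pd.DataFrame, feature: str, target: str = "target"
) -> pd.Series:
    """Percentage of patients with the disease within each category of feature.

    Assumes the target column is coded 1 = disease present, 0 = absent.
    """
    return df.groupby(feature)[target].mean().mul(100).round(2)


# Made-up rows for illustration only, not taken from the real dataset.
toy = pd.DataFrame(
    {
        "exang": [0, 0, 0, 0, 1, 1],
        "target": [1, 1, 1, 0, 0, 1],
    }
)
rates = disease_rate_by_category(toy, "exang")
print(rates)
```

On the real data, the same call with `"cp"`, `"ca"`, or `"slope"` as the feature would give the corresponding per-category percentages.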
