This code is for practicing k-nearest neighbors (kNN) and k-means with the heart disease dataset.
I used scikit-learn modules such as sklearn.neighbors.KNeighborsClassifier and sklearn.cluster.KMeans.
Heart Disease Data
- Features (13): ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal']
- Target: diagnosis of heart disease (angiographic disease status)
- Integer valued from 0 (no presence) to 4, then binarized: 0 means 'no heart disease', 1 means 'heart disease'
- For more information, visit a webpage
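The target binarization described above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the sample values and the column name `num` are made up for the example, and it assumes that every nonzero value (1 to 4) is mapped to 1.

```python
import pandas as pd

# Hypothetical raw target values in the original 0-4 coding
# (the column name "num" is an assumption for this example).
raw_target = pd.Series([0, 2, 1, 0, 4, 3], name="num")

# Assumption: any value greater than 0 counts as 'heart disease' (1),
# and 0 stays 'no heart disease' (0).
binary_target = (raw_target > 0).astype(int)

print(binary_target.tolist())  # [0, 1, 1, 0, 1, 1]
```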
The results evaluate the performance of the models.
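As a rough sketch of what such an evaluation might look like for kNN, the example below trains sklearn.neighbors.KNeighborsClassifier and reports accuracy and a confusion matrix. It uses synthetic data with 13 features as a stand-in for the heart disease data, and the split ratio, the scaling step, and `n_neighbors=5` are assumptions chosen for illustration, not the settings used in this repository.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 13-feature, binary-target dataset.
X, y = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)

# Hold out 30% of the samples for testing (ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# kNN is distance based, so the features are standardized first.
# The scaler is fitted on the training split only.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# n_neighbors=5 is the scikit-learn default, used here as a placeholder.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train_s, y_train)
y_pred = knn.predict(X_test_s)

acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])

print("accuracy:", acc)
print("confusion matrix (rows: true, columns: predicted):")
print(cm)
```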
To draw the clusters in 2D, I used sklearn.decomposition.PCA to reduce the dimensionality.
That is why the x and y axes of the plots are principal components.
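The idea of plotting k-means clusters on principal-component axes can be sketched as below. The data is synthetic, and the example assumes that clustering is run on the standardized 13-dimensional features and that PCA is used only for the 2D projection; the actual order of these steps in this repository may differ. `n_clusters=2` is also an assumption, chosen to match the binary target.

```python
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the 13-feature dataset.
X, _ = make_classification(
    n_samples=300, n_features=13, n_informative=6, random_state=0
)
X_scaled = StandardScaler().fit_transform(X)

# Cluster in the full feature space (assumption: two clusters).
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Project the samples onto the first two principal components for plotting.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
print("explained variance ratio:", pca.explained_variance_ratio_)

# Scatter plot colored by cluster label; axes are the principal components.
fig, ax = plt.subplots()
ax.scatter(X_2d[:, 0], X_2d[:, 1], c=labels, cmap="coolwarm", s=15)
ax.set_xlabel("Principal component 1")
ax.set_ylabel("Principal component 2")
ax.set_title("k-means clusters in PCA space")
```

Calling `plt.show()` or `fig.savefig(...)` afterwards displays or saves the figure.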
- python 3.7
- Used Modules - numpy, pandas, matplotlib, seaborn, scikit-learn
The detailed algorithm and experimental setup are explained here.