I use unsupervised learning methods (PCA, hierarchical clustering, and k-means clustering) to group 801 cancer samples into 5 clusters based on their gene expression profiles, each consisting of 20,531 gene expression measurements.
Code: cancerRNA_markdown.Rmd
Project Presentation: cancerRNA_clustering.pdf
Data Set
Source: UCI Machine Learning Repository.
There are 801 cancer samples spanning 5 types of cancer. Each sample has 20,531 gene expression measurements as its features.
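
As a rough illustration of how PCA, hierarchical clustering, and k-means fit together on data of this shape, here is a minimal sketch in base R. It is not the code from cancerRNA_markdown.Rmd. It runs on a small synthetic matrix rather than the real data, and the number of principal components kept (10), the linkage method ("complete"), and all variable names are assumptions made for the example. With the real data, genes that are constant across samples would likely need to be dropped before scaling, since `prcomp` cannot rescale a zero-variance column.

```r
# Synthetic stand-in for the expression matrix (rows = samples, columns = genes).
# The real data set has 801 samples x 20,531 genes; sizes here are kept small.
set.seed(1)
n_per_group <- 20
n_genes <- 50
n_groups <- 5

# Give each synthetic group its own mean expression profile, then add noise.
centers <- matrix(rnorm(n_groups * n_genes, sd = 3), nrow = n_groups)
labels <- rep(seq_len(n_groups), each = n_per_group)
expr <- centers[labels, ] + matrix(rnorm(length(labels) * n_genes), ncol = n_genes)

# PCA: center and scale each gene, then keep the leading components
# (10 is an arbitrary choice for this sketch).
pca <- prcomp(expr, center = TRUE, scale. = TRUE)
scores <- pca$x[, 1:10]

# Hierarchical clustering on the PC scores, cut into 5 clusters.
hc <- hclust(dist(scores), method = "complete")
hc_clusters <- cutree(hc, k = n_groups)

# k-means with 5 centers and several random starts.
km <- kmeans(scores, centers = n_groups, nstart = 20)

# Cross-tabulate cluster assignments against the known group labels.
print(table(hc_clusters, labels))
print(table(km$cluster, labels))
```

Clustering on the leading principal components instead of all genes keeps the distance computations cheap and reduces noise from uninformative genes. The cross-tabulations show how closely each clustering recovers the known groups.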