Unsupervised Learning in R

Unsupervised machine learning is a class of algorithms that identifies patterns in unlabeled data, i.e., without reference to an outcome or target variable. This workshop describes and demonstrates unsupervised learning algorithms for clustering (hdbscan, latent class analysis, hopach), dimensionality reduction (umap, generalized low-rank models), and anomaly detection (isolation forests). Participants will learn how to structure unsupervised learning analyses and will gain familiarity with example code that can be adapted to their own projects.
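To make the idea of "no outcome or target" concrete, here is a minimal sketch in base R. It is not taken from the workshop materials: it uses principal component analysis and k-means as simple stand-ins for the algorithms listed above, and the built-in iris data and the choice of three clusters are assumptions made purely for illustration.

```r
# Illustrative stand-ins only: PCA and k-means from base R,
# not the hdbscan or umap methods covered in the workshop.

# Drop the Species label so the algorithms see only unlabeled features,
# then standardize each column.
features <- scale(iris[, 1:4])

# Dimensionality reduction: keep the first two principal components.
pca <- prcomp(features)
embedding <- pca$x[, 1:2]

# Clustering: partition the rows into 3 groups (k = 3 is an assumption).
set.seed(1)
clusters <- kmeans(features, centers = 3, nstart = 10)

# Inspect how many observations landed in each cluster.
print(table(clusters$cluster))
```

Neither step ever sees the Species column, which is what distinguishes this from a supervised analysis.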

Author: Chris Kennedy

Prerequisites

This is an intermediate machine learning workshop. Participants should have significant prior experience with R and RStudio, including manipulation of data frames, installation of packages, and plotting.

Prerequisite workshops

R Fundamentals or similar training in R basics.

Recommended workshops

Machine Learning in R or other supervised learning experience.

Technology requirements

Participants should have access to a computer with the following software:

R version 3.6 or greater
RStudio
RTools (Windows only)
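The requirements above can be checked from an R console. The following is a rough sketch rather than part of the workshop materials; in particular, looking for `make` on the PATH is only an assumed heuristic for whether RTools is set up, not an official test.

```r
# Compare the running R version against the stated minimum of 3.6.
required <- "3.6.0"
r_ok <- getRversion() >= required
if (!r_ok) {
  warning("R ", required, " or greater is required; found ", getRversion())
}

# RTools only matters on Windows.
on_windows <- .Platform$OS.type == "windows"

# Heuristic (an assumption): RTools puts `make` on the PATH.
if (on_windows && !nzchar(Sys.which("make"))) {
  warning("`make` was not found on the PATH; RTools may not be installed.")
}
```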

Initial steps for participants

To prepare for the workshop, please download the materials and work through the package installation in 0-install.Rmd. Please report any errors to the GitHub issue queue.

An RStudio Cloud workspace is also available as an alternative to a local installation.

Reporting errors or giving feedback

Please create a GitHub issue to report any errors or give feedback on this workshop.

Resources

Books

Boehmke & Greenwell (2019). Hands-On Machine Learning with R - free online version
Hennig et al. (2015). Handbook of Cluster Analysis - thorough and highly recommended
Aggarwal & Reddy (2014). Data Clustering: Algorithms and Applications - great complement to Hennig et al.
Dolnicar et al. (2018). Market Segmentation Analysis - free, closely tied to R, and chapter 7 is especially helpful
Izenman (2013). Modern Multivariate Statistical Techniques
Everitt et al. (2011). Cluster Analysis

