mmp2/manifold-learning-examples


Welcome to the Manifold Learning Examples project!

Manifold Learning (ML) algorithms -- also called Embedding algorithms -- can help us interpret data with many dimensions (such as a cloud of word embeddings or the configurations of a molecule) by mapping it to 2D or 3D, where we can see it. But is what we are seeing the real shape of the data? ML algorithms almost always distort the shape. Sometimes the distortions are unimportant, but sometimes they make us see clusters, "arms", holes, and "horseshoes" (which we will call artefacts) that are not properties of the data, but merely effects of the algorithm and parameter choices.

This project illustrates some of the most common effects and artefacts you will encounter as you start using Embedding algorithms on your real data. The artefacts are not symptoms of "too little data" -- most of them persist even when the data size n goes to infinity! We chose simple artificial examples because the most common effects are present even with the simplest data.
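
To make the "horseshoe" artefact concrete, here is a minimal sketch in Python with NumPy. It is not taken from this repository's code; the function names `strip_grid` and `laplacian_eigenmap`, the grid sizes, and the neighbourhood radius are all illustrative assumptions. It builds a long, thin 2D strip, computes a basic spectral embedding from the unnormalized graph Laplacian, and checks whether the second embedding coordinate is just a quadratic function of the first. When it is, the 2D strip is drawn as a 1D curve, and the short side of the strip is lost.

```python
import numpy as np

def strip_grid(nx=40, ny=8):
    """Regular grid on a long, thin rectangle (nx by ny points, unit spacing)."""
    xs, ys = np.meshgrid(np.arange(nx), np.arange(ny), indexing="ij")
    return np.column_stack([xs.ravel(), ys.ravel()]).astype(float)

def laplacian_eigenmap(points, radius=1.1, n_components=2):
    """Basic spectral embedding (illustrative, not the repository's method).

    Connects points closer than `radius`, forms the unnormalized graph
    Laplacian L = D - W, and returns the eigenvectors belonging to the
    smallest nonzero eigenvalues.
    """
    diff = points[:, None, :] - points[None, :, :]
    dist2 = (diff ** 2).sum(axis=-1)
    W = ((dist2 <= radius ** 2) & (dist2 > 0)).astype(float)
    L = np.diag(W.sum(axis=1)) - W
    _, vecs = np.linalg.eigh(L)          # eigenvalues in ascending order
    return vecs[:, 1:n_components + 1]   # skip the constant eigenvector

points = strip_grid()
embedding = laplacian_eigenmap(points)

# Fit coordinate 2 as a quadratic in coordinate 1. A residual near zero
# means the embedding lies on a parabola: the "horseshoe".
coef = np.polyfit(embedding[:, 0], embedding[:, 1], 2)
residual = embedding[:, 1] - np.polyval(coef, embedding[:, 0])
print("max deviation from a parabola:", np.abs(residual).max())
```

On this strip, both embedding coordinates vary only along the long side, so the deviation from a parabola is expected to be at the level of floating-point error. Adding more grid points at the same aspect ratio does not change this, which illustrates why such artefacts are not a small-sample effect. Making the rectangle closer to a square (for example `strip_grid(nx=12, ny=8)`) should let the second coordinate pick up the short side instead.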

The good news is that once you are aware of their presence, the artefacts and distortions can be recognized, and methods exist to circumvent or correct them.

What you will find on this site

How to use this site

Feel free to use the code, articles and graphics, citing this repository (see About in the sidebar for the citation). This is currently a working repository; changes to the code or files are possible.

Contributors (in alphabetical order)

  • Haoqiang (Murray) Kang, original repository creator, non-uniform density, aspect ratio, t-SNE
  • Marina Meila, Professor, concept and scientific leadership
  • Hangliang Ren (Harry), spectral embedding, non-uniform density, plotting, aspect ratio
  • Qirui Wang, UMAP, aspect ratio
  • Yujia Wu, data generation, plotting, Local Linear Embedding, aspect ratio
  • Shuzhen Zhang, manifold learning explained, Riemannian metric, maps embeddings, site curator 2024
