
Add basis interpolation methods to tourr package

BocongZhao823 edited this page Apr 11, 2022 · 10 revisions

Background

The tour is a technique for visualizing high-dimensional data using a sequence of low-dimensional projections. It can be used to explore data for clusters, outliers and non-linear relationships. It is also useful in modeling, for example to check for multicollinearity, and to examine boundaries between classes in supervised classification problems.

Related work

This project will build on the existing tourr package. The current interpolation method for the grand tour, and for parts of the other tours, is geodesic. It interpolates between planes, rather than between particular bases, along the shortest path between them. This ensures that all of the rotation in a tour path is out of the viewing plane, with no rotation within the plane.
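The difference between a plane and a basis can be illustrated with a small base-R sketch. It is not code from tourr, and the helper names `random_basis` and `principal_angles` are hypothetical. Rotating a basis within its own plane gives a different basis for the same plane, so the principal angles between the two are numerically zero. This is the within-plane rotation that interpolating between planes leaves out.

```r
# Hypothetical helper: random orthonormal p x d basis via a QR decomposition.
random_basis <- function(p, d = 2) {
  qr.Q(qr(matrix(rnorm(p * d), nrow = p, ncol = d)))
}

# Hypothetical helper: principal angles (in radians) between the planes
# spanned by the columns of two orthonormal bases.
principal_angles <- function(Fa, Fz) {
  s <- svd(t(Fa) %*% Fz)$d
  acos(pmin(pmax(s, 0), 1))  # clamp to [0, 1] to guard against rounding error
}

set.seed(1)
Fa <- random_basis(4)
Fz <- random_basis(4)

# Rotate Fa by 30 degrees within its own plane.
theta <- pi / 6
R <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), nrow = 2)
Fa_rot <- Fa %*% R

principal_angles(Fa, Fz)      # two different planes: angles generally non-zero
principal_angles(Fa, Fa_rot)  # same plane, different basis: numerically zero
```

An interpolation method that works on bases would treat `Fa` and `Fa_rot` as distinct targets, whereas a method that works on planes treats them as the same one.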

Details of your coding project

This project is related to Issue 110 on the tourr GitHub repo. The task is to take existing R code for basis interpolation, integrate it as an alternative interpolation method for tours, and write a vignette explaining the interpolation methods and why and where to use each one.

Expected impact

This package is one of the few available for visualising high-dimensional data. It is used in many university courses on multivariate data and machine learning. It is also reasonably broadly used in industry.

Mentors

Contributors, please contact mentors below after completing at least one of the tests below.

  • EVALUATING MENTOR: Ursula Laa is the author of the slice tour and section pursuit code in the tourr package.
  • MENTOR: Di Cook is the maintainer of the tourr package, and developer of several methods provided in the package. Di has co-mentored approximately 10 successful GSoC projects.

Tests

Contributors, please do one or more of the following tests before contacting the mentors above.

  • Easy: Make an animated gif of a grand tour of the four physical measurements in the penguins data from the palmerpenguins package.
  • Medium: Write a function that takes the penguins data and a projection matrix as input, and returns the 2D projected data as output.
  • Hard: Create a small standalone package that has the functionality of the medium task, complete with documentation, dependencies, and a short vignette.
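
As a rough illustration of the input and output shapes in the medium test, the base-R sketch below projects an n x p numeric matrix onto 2D with a p x 2 projection matrix. The function name `project_2d` and the output column names are hypothetical, and a simulated matrix stands in for the four penguin measurements. With the real data, rows with missing values would need to be handled first.

```r
# Hypothetical helper: project an n x p numeric data set onto 2D
# using a p x 2 projection matrix.
project_2d <- function(data, proj) {
  X <- as.matrix(data)
  stopifnot(is.numeric(X), is.matrix(proj),
            ncol(X) == nrow(proj), ncol(proj) == 2)
  out <- X %*% proj
  colnames(out) <- c("P1", "P2")
  out
}

set.seed(2)
X <- scale(matrix(rnorm(20 * 4), nrow = 20))   # simulated stand-in, standardised
proj <- qr.Q(qr(matrix(rnorm(8), nrow = 4)))   # random orthonormal 4 x 2 matrix
Y <- project_2d(X, proj)
dim(Y)  # 20 2
```

Standardising the columns before projecting is a design choice made here so that no single variable dominates the projection. It is an assumption of this sketch, not a requirement stated in the test.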

Solutions of tests

Contributors, please post a link to your test results here.

  • EXAMPLE CONTRIBUTOR 1 NAME, LINK TO GITHUB PROFILE, LINK TO TEST RESULTS.