
Constrained changepoint GUI

Toby Dylan Hocking edited this page Mar 26, 2020 · 14 revisions

Background

Change point detection (CPD) is the problem of finding abrupt changes in a time series, that is, points at which some property of the data changes. In other words, a changepoint is a time point such that the statistical properties of the data before and after it differ. In machine learning, change point detection is applied to segmentation, edge detection, event detection, and anomaly detection. One example comes from neuroscience, where it is important to detect neural spikes in noisy calcium imaging measurements (Figure 1). In these data, the goal is to recognize abrupt “up” changes, which represent discrete spikes of neural activity.

Another application is in electrocardiogram (ECG) data analysis, which is important for arrhythmia diagnosis (Figure 2). In these data, an important sub-problem is recognizing the QRS complex, which is a sequence of “down/up/down” changes that occurs in the ECG data during each heartbeat.
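To make the definition of a changepoint concrete, here is a minimal sketch in Python (chosen only for illustration; it is not the gfpop algorithm or its R API). It assumes a single change in mean and a squared-error cost, and finds the split by brute force. The function name `best_single_change` and the toy `signal` are hypothetical.

```python
def best_single_change(values):
    """Return (index, cost) of the split that minimizes the total squared
    error when each side of the split is summarized by its own mean.
    The index is the position of the first sample after the change."""

    def squared_error(segment):
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    best_index, best_cost = None, float("inf")
    # Try every split that leaves at least one sample on each side.
    for i in range(1, len(values)):
        cost = squared_error(values[:i]) + squared_error(values[i:])
        if cost < best_cost:
            best_index, best_cost = i, cost
    return best_index, best_cost


# Toy data (made up): the mean jumps from about 0 to about 5 at index 4.
signal = [0.1, -0.2, 0.0, 0.1, 5.1, 4.9, 5.0, 5.2]
change_index, change_cost = best_single_change(signal)
print(change_index)  # 4
```

Real data sets contain many changes and an unknown number of them, which is why penalized, optimal algorithms such as the one in gfpop are needed rather than a brute-force search like this.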

We have proposed a graph-constrained changepoint modeling framework [1] along with an R package (gfpop) that provides efficient and optimal inference of the mean, changepoint, and hidden state parameters of such models [2]. In this framework, the user must specify a graph which represents the expected/desired sequence of states/changes in the model. Nodes in the graph represent hidden states, and edges represent possible changes between states. Two consecutive distinct states are separated by a changepoint. Different graphs result in models of different quality and changepoint detection accuracy for a given data set. Currently the user is required to specify the graph in R code.
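The role of the graph can be illustrated with a small sketch, again in Python and not using gfpop itself. It assumes a hypothetical representation in which each edge is a `(from_state, to_state, change_type)` tuple, and it only checks whether a given sequence of segment means is allowed by the graph. The state names `low` and `high` and the function `follows_graph` are made up for this example; gfpop additionally optimizes over all allowed paths, which this sketch does not attempt.

```python
# Hypothetical two-state graph for spike-like data:
# an "up" change must be followed by a "down" change, and vice versa.
UP_DOWN_GRAPH = [
    ("low", "high", "up"),
    ("high", "low", "down"),
]


def follows_graph(segment_means, edges, start_state):
    """Return the list of states visited if the sequence of segment means
    is allowed by the graph, or None if some change has no matching edge."""
    state = start_state
    path = [state]
    for previous, current in zip(segment_means, segment_means[1:]):
        if current == previous:
            return None  # equal means would not be a change at all
        direction = "up" if current > previous else "down"
        next_states = [to for (frm, to, kind) in edges
                       if frm == state and kind == direction]
        if not next_states:
            return None  # the graph forbids this change from this state
        state = next_states[0]
        path.append(state)
    return path


valid_path = follows_graph([0.0, 5.0, 0.2, 4.8], UP_DOWN_GRAPH, "low")
invalid_path = follows_graph([0.0, 2.0, 5.0], UP_DOWN_GRAPH, "low")
print(valid_path)    # ['low', 'high', 'low', 'high']
print(invalid_path)  # None, because two "up" changes in a row are not allowed
```

Swapping in a different edge list changes which segmentations are admissible, which is the kind of iteration the GUI is meant to make easy.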

Related work

gfpop is the only R package that implements constrained changepoint models, but it does not provide a GUI.

Details of your coding project

In this project, we propose to develop gfpopGUI, a graphical user interface (GUI) and web interface to the gfpop algorithm for graph-constrained changepoint detection. The GUI will allow users to specify the graph visually, instead of in R code, which will make it much easier for users to iterate and try different graphs.

In particular, the minimal feature set for the gfpopGUI system is three linked displays:

  1. Visualization of the current graph, with the possibility to add/edit nodes/edges. For me, the whole point of the GUI is to make it easy to create the graph using visual nodes/edges, rather than by editing a data table (which is already possible in R code). Can you think of a way to achieve that? I'm thinking of something like this: https://bl.ocks.org/cjrd/6863459
  2. Overview visualization of an entire data set along with the optimal model for a given graph.
  3. Detail visualization of a zoomed/scrolled subset of the data.

Additional features include:

  1. A public web server (e.g. gfpopGUI.shinyapps.io) where anyone can use gfpopGUI.
  2. Features for adding/editing labels for desired changes/states in the current data set, and computing the number of incorrectly predicted labels for the current graph/model.
  3. Linking between the plot of the data set and the plot of the graph: hovering/clicking a node/edge in the graph should highlight all segments/changes in the plot of the data (and vice versa).
     • when you hover the pointer over a node or edge in the graph, we should see a highlight of the corresponding segments/changepoints in the data.
     • when you hover the pointer over a segment or changepoint in the data, we should see a highlight of the corresponding nodes/edges in the graph.
  4. Features for saving user-specific models and sharing them with others.
  5. A web API for downloading data sets, labels, models.
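The two-way highlighting between the graph and the data plot could rest on a pair of lookup tables built from the fitted model. The following is a minimal sketch in Python, assuming a hypothetical model output in which each segment records its `start`, `end`, and hidden `state`. The field names and the function `build_links` are assumptions for illustration, not the gfpop output format.

```python
from collections import defaultdict


def build_links(segments):
    """segments: list of dicts with 'start', 'end', 'state', in time order.
    Returns (node_to_segments, edge_to_changes):
      node_to_segments maps a state name to the indices of its segments;
      edge_to_changes maps a (from_state, to_state) pair to the positions
      at which that change occurs."""
    node_to_segments = defaultdict(list)
    edge_to_changes = defaultdict(list)
    for i, segment in enumerate(segments):
        node_to_segments[segment["state"]].append(i)
        if i > 0:
            edge = (segments[i - 1]["state"], segment["state"])
            edge_to_changes[edge].append(segment["start"])
    return dict(node_to_segments), dict(edge_to_changes)


# Made-up model with three segments and two changepoints.
model_segments = [
    {"start": 0, "end": 99, "state": "low"},
    {"start": 100, "end": 119, "state": "high"},
    {"start": 120, "end": 299, "state": "low"},
]
node_links, edge_links = build_links(model_segments)
print(node_links["low"])             # [0, 2]
print(edge_links[("low", "high")])   # [100]
```

Hovering a node would then highlight the segments listed in `node_links`, and hovering an edge the changepoints in `edge_links`. The reverse direction needs no extra table, since each segment already carries its state and each changepoint is identified by the states on either side of it.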

Expected impact

This project will produce a public web application for use by researchers in several fields that involve sequence data analysis (e.g. neuroscience, medical monitoring, genomics).

Mentors

Contact mentors below after completing at least one of the tests below.

  • EVALUATING MENTOR: Guillem Rigaill [email protected] is an expert in optimal changepoint algorithms.
  • Toby Hocking [email protected] is an author of gfpop and other packages for optimal changepoint detection, and GSOC mentor since 2013.

Tests

Students, please do one or more of the following tests before contacting the mentors above.

  • Easy: download the gfpop package, run the code in the vignette, change the penalty parameter, and make a multi-panel ggplot that shows how the model changes as the penalty parameter is varied (one panel for each penalty parameter value).
  • Medium: make a Shiny app with an input that lets you select the penalty parameter for the data set from the Easy test, and shows a ggplot of the data and the model fit with that penalty parameter.
  • Hard: write a D3.js data visualization in which you can hover over one displayed item, and see it highlighted, along with other items.

Solutions of tests

Students, please post a link to your test results here.
