Jupyter interactive presentation "Visualizing Big Datasets: Tools, Pitfalls, Experimental Example"

alTeska/data_visualization_talk


Visualizing Big Datasets: Tools, Pitfalls, Experimental Example

A talk about data visualization in science. Based on my experience with the ratCAVE project and on approaches suggested for Python, I created this talk for my fellow MSNE students. It covers the main problems with using scatter plots for big, convolved data and explains how to address them.

Summary:

What should we keep in mind when working with big datasets? For scatter plots, 3 hyperparameters:

  • overplotting - avoid obscuring the data
  • saturation - check how many overlapping points it takes to saturate the intensity
  • undersampling - taking a subset might not be the answer
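
The saturation point can be estimated with a little arithmetic. This is a minimal sketch, assuming markers are blended with standard "over" alpha compositing, so that `n` stacked markers of opacity `alpha` have a combined opacity of `1 - (1 - alpha)**n`. The function names and the 0.99 threshold are hypothetical choices for illustration, not part of the talk.

```python
import math


def stacked_opacity(alpha: float, n_points: int) -> float:
    """Combined opacity of n_points markers of opacity alpha drawn on top of each other."""
    return 1.0 - (1.0 - alpha) ** n_points


def points_to_saturate(alpha: float, threshold: float = 0.99) -> int:
    """Smallest number of overlapping markers whose combined opacity reaches threshold.

    The 0.99 default is an assumed cut-off for "visually saturated".
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be strictly between 0 and 1")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be strictly between 0 and 1")
    # Solve 1 - (1 - alpha)**n >= threshold for n.
    return math.ceil(math.log(1.0 - threshold) / math.log(1.0 - alpha))


if __name__ == "__main__":
    for alpha in (0.5, 0.1, 0.01):
        n = points_to_saturate(alpha)
        print(f"alpha={alpha}: {n} overlapping points, opacity {stacked_opacity(alpha, n):.4f}")
```

Lowering `alpha` delays saturation, but no single value suits both the dense and the sparse regions of a plot, which is why it has to be tuned per dataset.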

Alternatively, you can work with heatmaps and keep the following problems in mind (1 hyperparameter):

  • undersaturation
  • pick the color map in accordance to the
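
The undersaturation problem listed above can be sketched with NumPy alone. This is an illustrative example, not the code from the talk: the synthetic data, the bin count, and the use of `log1p` scaling are all assumptions. It bins points into a 2D histogram and compares how visible a sparse cluster is under linear and logarithmic intensity scaling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumed for illustration): one dense and one sparse cluster.
dense = rng.normal(loc=(0.0, 0.0), scale=0.3, size=(100_000, 2))
sparse = rng.normal(loc=(3.0, 3.0), scale=0.3, size=(500, 2))
points = np.vstack([dense, sparse])

# Heatmap aggregation: count the points falling into each bin of a 100 x 100 grid.
counts, x_edges, y_edges = np.histogram2d(
    points[:, 0], points[:, 1], bins=100, range=[[-2, 5], [-2, 5]]
)


def normalize(values: np.ndarray) -> np.ndarray:
    """Scale values so that the largest one maps to full intensity (1.0)."""
    return values / values.max()


linear = normalize(counts)            # intensity proportional to count
logged = normalize(np.log1p(counts))  # intensity proportional to log(1 + count)

# Bins whose lower edge is at or above 2.0 on both axes contain only the sparse cluster.
ix = np.searchsorted(x_edges, 2.0)
iy = np.searchsorted(y_edges, 2.0)
linear_peak = linear[ix:, iy:].max()
logged_peak = logged[ix:, iy:].max()

print(f"sparse cluster peak, linear scaling: {linear_peak:.3f}")
print(f"sparse cluster peak, log scaling:    {logged_peak:.3f}")
```

With linear scaling the dense cluster sets the top of the intensity range and the sparse cluster is nearly invisible. A logarithmic mapping compresses the range so both clusters appear.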

The talk explains how to get from left to right: an interpretable visualization of datasets.


Presented on 01.06.2018 at the retreat for Master of Science in Neuroengineering students.

Installing

To run the Jupyter notebook as slides I used:

The talk was based on the use of:

  • pandas
  • seaborn
  • datashader

Acknowledgments
