Visualizing Big Datasets: Tools, Pitfalls, Experimental Example

A talk about data visualization in science. Drawing on my experience with the ratCAVE project and on approaches available in Python, I created this talk for my fellow MSNE students. It covers the main problems with using scatter plots for big, convolved data and explains how to address them.

Summary:

What should we keep in mind when working with big datasets? In the case of scatter plots, 3 hyperparameters:

  • overplotting - avoid obscuring the data
  • saturation - check how many overlapping points it takes to saturate the intensity
  • undersampling - taking a subset might not be the answer
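The saturation point can be estimated before plotting. The sketch below is not taken from the talk; it assumes standard "over" alpha compositing of identical points, where `n` stacked points with opacity `alpha` give a combined opacity of `1 - (1 - alpha) ** n`. The function names and the `threshold` parameter are hypothetical, chosen for illustration only.

```python
import math


def combined_opacity(alpha: float, n: int) -> float:
    """Opacity of n identical points stacked on top of each other.

    Assumption: standard "over" alpha compositing, so each extra point
    covers a fraction `alpha` of whatever background is still visible.
    """
    return 1.0 - (1.0 - alpha) ** n


def overlaps_to_saturate(alpha: float, threshold: float = 0.99) -> int:
    """Smallest number of stacked points whose opacity reaches `threshold`.

    `threshold` is a hypothetical cutoff for "visually saturated".
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must be in (0, 1)")
    if alpha == 1.0:
        return 1  # a fully opaque point saturates immediately
    # Solve 1 - (1 - alpha) ** n >= threshold for n.
    return math.ceil(math.log(1.0 - threshold) / math.log(1.0 - alpha))


if __name__ == "__main__":
    for alpha in (0.5, 0.1, 0.01):
        print(alpha, overlaps_to_saturate(alpha))
```

With `alpha = 0.1`, for example, 44 overlapping points are already enough to pass 99% opacity, so any region denser than that looks the same as one that is a hundred times denser.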

Alternatively, you can work with heatmaps and remember to address the following problems (1 hyperparameter):

  • undersaturation
  • pick the color map in accordance to the
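To illustrate why the mapping from counts to color matters, here is a minimal pure-Python sketch, not the code used in the talk. It bins points into a grid and compares a linear scaling of counts with a rank-based scaling, which is a simplified stand-in for histogram equalization. All function names are hypothetical, and the assumption is that sparse cells fade out when a few dense cells dominate a linear scale.

```python
from collections import Counter


def bin_points(points, n_bins, lo, hi):
    """Count points per cell of an n_bins x n_bins grid over [lo, hi) x [lo, hi)."""
    width = (hi - lo) / n_bins
    counts = Counter()
    for x, y in points:
        if lo <= x < hi and lo <= y < hi:
            # Clamp to guard against floating-point rounding at the upper edge.
            col = min(int((x - lo) / width), n_bins - 1)
            row = min(int((y - lo) / width), n_bins - 1)
            counts[(col, row)] += 1
    return counts


def linear_scale(counts):
    """Map counts to (0, 1] proportionally to the densest cell."""
    peak = max(counts.values())
    return {cell: c / peak for cell, c in counts.items()}


def rank_scale(counts):
    """Map counts to (0, 1] by the rank of each distinct count value."""
    distinct = sorted(set(counts.values()))
    rank = {c: (i + 1) / len(distinct) for i, c in enumerate(distinct)}
    return {cell: rank[c] for cell, c in counts.items()}


if __name__ == "__main__":
    # One very dense cell, one moderate cell, one cell with a single point.
    points = [(0.5, 0.5)] * 1000 + [(1.5, 0.5)] * 10 + [(2.5, 2.5)]
    counts = bin_points(points, n_bins=4, lo=0.0, hi=4.0)
    print("linear:", linear_scale(counts))
    print("rank:  ", rank_scale(counts))
```

On the linear scale the single-point cell gets an intensity of 0.001 and is effectively invisible; on the rank scale it gets one third of the range and stays readable.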

The talk explains how to get from left to right: an interpretable visualization of datasets.


Presented on 01.06.2018 at the retreat for Master of Science in Neuroengineering students.

Installing

To run the Jupyter notebook as slides I used:

The talk was based on the use of:

  • pandas
  • seaborn
  • datashader

Acknowledgments