This repository was archived by the owner on Jun 2, 2025. It is now read-only.

Add Visualization Options  #231

Closed as not planned
@jacobbieker

Detailed Description

We want to be able to easily see what our batches look like and have utilities that plot them to help with debugging and ensuring that our pipelines are doing what we expect.

We have had multiple one-off visualization scripts before. The goal here is to build them into datapipes, ideally keep them up to date, and possibly run them on PRs to give a quick, automatic view whenever any of the datapipes are changed or updated.

I think the steps would be:

  • Make visualization module in datapipes
  • Add visualizing a whole example of all modalities as an image
  • Add visualizing examples as little videos (to see the timeseries in the videos)
  • Add option to save out batches in more interpretable format (i.e. NetCDF or something that keeps coordinates and the like, vs PyTorch tensors)
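
As a rough illustration of the second step (one image covering all modalities), here is a minimal sketch. It assumes an example is available as a plain dict mapping a modality name to a NumPy array, with shape (time, y, x) for gridded data and (time,) for a time series. The function name `plot_example`, the dict layout, and the synthetic data are all hypothetical and are not part of datapipes.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend, so the same code can run in CI
import matplotlib.pyplot as plt
import numpy as np


def plot_example(example, time_index=0):
    """Plot one time step of every modality in `example`, one panel each.

    `example` is an assumed layout: {modality name: array}, where gridded
    modalities are (time, y, x) and time series are (time,).
    """
    n = len(example)
    fig, axes = plt.subplots(1, n, figsize=(4 * n, 4), squeeze=False)
    for ax, (name, data) in zip(axes[0], example.items()):
        data = np.asarray(data)
        if data.ndim == 3:
            # Gridded modality: show the selected time step as an image.
            ax.imshow(data[time_index], origin="lower")
        elif data.ndim == 1:
            # Time series: plot everything and mark the selected time step.
            ax.plot(data)
            ax.axvline(time_index, linestyle="--")
        else:
            raise ValueError(f"Unsupported shape for {name}: {data.shape}")
        ax.set_title(name)
    fig.tight_layout()
    return fig


# Synthetic stand-in for one example; real batches would come from a datapipe.
rng = np.random.default_rng(0)
example = {
    "satellite": rng.random((12, 24, 24)),
    "nwp": rng.random((12, 16, 16)),
    "gsp": rng.random(12),
}
fig = plot_example(example, time_index=3)
```

Calling `fig.savefig("example.png")` then writes the image to disk. Looping `time_index` over the time axis and saving one frame per step would be one simple route to the "little videos" step.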

Possible Implementation

Satip used to have a workflow step that ran visualization code on the outputs of some processing steps on PRs. It was quite helpful for knowing whether changes broke the end-to-end processing pipelines and whether the images coming out still looked correct.

Notes

Goal:

  • To show what is in the batches right before the model runs
  • To show in training what is going in at any timestep
  • User can step through periods
  • Time and space are aligned

Users:

  • ML team only
  • Prototype examples:
      • NWP data wasn’t aligned with GSP data; James found this when plotting these out
      • Early on, Jacob found the satellite data was 500 km off

Effort to build:

  • Make it so people don’t need to rebuild anything from scratch
  • Build something a bit less ad-hoc than before

Effort to run:

  • Hopefully takes someone <1 min to run this from Datapipes
  • It would be useful for training & production use cases
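
The two misalignment cases in the notes (NWP vs. GSP timestamps, satellite data 500 km off) were found by eye. A small programmatic check could complement the plots. The sketch below is only an illustration under stated assumptions: each modality is summarised by a list of timestamps already on a common time axis and a (lat, lon) centre point, and `check_alignment`, `haversine_km`, the dict layout, and the 50 km tolerance are all hypothetical.

```python
import math
from datetime import datetime, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def check_alignment(modalities, max_offset_km=50.0):
    """Compare every modality against the first one; return a list of problems.

    `modalities` is an assumed layout:
    {name: {"times": [datetime, ...], "centre": (lat, lon)}}.
    An empty list means no misalignment was detected.
    """
    names = list(modalities)
    ref_name, ref = names[0], modalities[names[0]]
    problems = []
    for name in names[1:]:
        other = modalities[name]
        if other["times"] != ref["times"]:
            problems.append(f"{name}: timestamps differ from {ref_name}")
        dist = haversine_km(*ref["centre"], *other["centre"])
        if dist > max_offset_km:
            problems.append(f"{name}: centre is {dist:.0f} km from {ref_name}")
    return problems


# Synthetic summaries: NWP is shifted by one hour, satellite is displaced north.
times = [datetime(2022, 1, 1, 12) + timedelta(minutes=30 * i) for i in range(4)]
modalities = {
    "gsp": {"times": times, "centre": (52.0, -1.0)},
    "nwp": {"times": [t + timedelta(hours=1) for t in times], "centre": (52.0, -1.0)},
    "satellite": {"times": times, "centre": (56.5, -1.0)},
}
problems = check_alignment(modalities)
for line in problems:
    print(line)
```

A check like this would not replace looking at the plots, but it could fail fast in a PR workflow before anyone needs to open an image.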
