Add AI4ArcticSeaIce dataset. #2528

Draft
wants to merge 2 commits into
base: main

@nilsleh (Collaborator) commented Jan 23, 2025

This PR adds the AI4ArcticSeaIce dataset. The data is rehosted on Hugging Face (HF) for faster downloads (a couple of minutes vs. ~8 hours). The original repo contains some additional useful information.

The Sea Ice Challenge Dataset contains Sentinel-1 SAR imagery, passive microwave radiometer observations from AMSR2, and numerical weather prediction data from the ECMWF Reanalysis v5 (ERA5) dataset, all gridded to match the Sentinel-1 SAR scenes geometrically. As label data, the dataset contains ice charts manually produced by the ice analysts at the Greenland Ice Service and the Canadian Ice Service.

Dataset features:

    * Dual-polarization SAR (HH, HV) imagery for each patch.
    * Sea Ice Concentration (SIC): the percentage ratio of sea ice to open water for an area,
        discretized into 11 classes in 10% steps from 0% to 100%.
    * Stage Of Development (SOD): the type of sea ice, a proxy for ice thickness and
        ease of traversal, with 6 classes.
    * Floe size (FLOE): classification or segmentation of distinct ice floes based on size, shape,
        or other geometric properties.
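
To make the SIC discretization concrete, here is a minimal sketch of mapping a concentration in percent to one of the 11 classes. The function names are hypothetical, and rounding to the nearest 10% step is an assumption; the label encoding actually stored in the dataset files may differ.

```python
def sic_percent_to_class(percent: float) -> int:
    """Map a sea ice concentration in percent to a class index 0..10.

    Assumption: values are rounded to the nearest 10% step.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError(f"SIC must be within [0, 100], got {percent}")
    # 0% -> 0, 10% -> 1, ..., 100% -> 10
    return int(percent / 10.0 + 0.5)


def class_to_sic_percent(index: int) -> int:
    """Inverse mapping: class index 0..10 back to its nominal percentage."""
    if not 0 <= index <= 10:
        raise ValueError(f"class index must be within [0, 10], got {index}")
    return index * 10
```

Covering 0% to 100% in 10% steps gives 11 classes rather than 10, because both endpoints are included.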

Dataset format:

    * each sample scene is stored in a separate .nc file
    * scene dimensions vary, up to ~5000 x 5000 px
    * 80 m resolution
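
For a rough sense of scale, the figures above can be turned into a ground footprint and an in-memory size per channel. This is back-of-the-envelope arithmetic only; the helper names are hypothetical, and 4 bytes per value (float32) is an assumption about how a scene would be held in memory, not something stated for the files.

```python
PIXEL_SPACING_M = 80   # from the dataset description
MAX_SIDE_PX = 5000     # approximate upper bound from the dataset description


def footprint_km(side_px: int, spacing_m: float = PIXEL_SPACING_M) -> float:
    """Ground extent in kilometres covered by `side_px` pixels."""
    return side_px * spacing_m / 1000


def channel_megabytes(height_px: int, width_px: int, bytes_per_value: int = 4) -> float:
    """Size of one channel in decimal megabytes (float32 assumed by default)."""
    return height_px * width_px * bytes_per_value / 1e6


print(footprint_km(MAX_SIDE_PX))                    # side length in km
print(channel_megabytes(MAX_SIDE_PX, MAX_SIDE_PX))  # MB per channel
```

Under these assumptions, the largest scenes span about 400 km per side and take about 100 MB per channel.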

TODOS:

  • check plotting again and improve category plotting across the different targets
  • think about resizing since individual tiles can be really large
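
One alternative to resizing is to cut each large scene into fixed-size windows, which avoids interpolating the categorical label maps. The sketch below is not part of this PR; the function names and the 512 px default are hypothetical, and it only computes index windows without reading any data.

```python
def tile_offsets(length: int, tile: int, stride: int) -> list[int]:
    """Start offsets so that windows of size `tile` cover [0, length).

    The last window is shifted back to end exactly at `length`.
    """
    if tile <= 0 or stride <= 0:
        raise ValueError("tile and stride must be positive")
    if length <= tile:
        return [0]
    offsets = list(range(0, length - tile + 1, stride))
    if offsets[-1] != length - tile:
        offsets.append(length - tile)
    return offsets


def tile_windows(height: int, width: int, tile: int = 512, stride: int = 512):
    """Return (row_slice, col_slice) pairs covering a height x width scene."""
    rows = tile_offsets(height, tile, stride)
    cols = tile_offsets(width, tile, stride)
    return [
        (slice(r, min(r + tile, height)), slice(c, min(c + tile, width)))
        for r in rows
        for c in cols
    ]


windows = tile_windows(5000, 5000)
print(len(windows))  # number of 512 x 512 windows for a 5000 x 5000 scene
```

Each pair can index an array as `scene[row_slice, col_slice]`. The final row and column of windows overlap their neighbours slightly so that every window keeps the full tile size.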

[Image: example_sea_ice]

Thanks @astokholm for creating and open-sourcing the dataset. The dataset has some complexities due to its many data modalities, so any comments or corrections would be much appreciated.

@nilsleh nilsleh marked this pull request as draft January 23, 2025 19:55
@github-actions bot added the documentation, datasets, and testing labels Jan 23, 2025
@nilsleh nilsleh added this to the 0.7.0 milestone Jan 23, 2025
@khdlr (Contributor) commented Jan 23, 2025

This is the v2 dataset, right? (v1 is still around: https://data.dtu.dk/collections/AI4Arctic_Sea_Ice_Challenge_Dataset/6244065/1). Might be helpful to document this in the dataset class.

@nilsleh (Collaborator, Author) commented Jan 24, 2025

> This is the v2 dataset, right? (v1 is still around: https://data.dtu.dk/collections/AI4Arctic_Sea_Ice_Challenge_Dataset/6244065/1). Might be helpful to document this in the dataset class.

Good catch. It is actually Version 3; I corrected the link above.

Labels: datasets, documentation, testing
2 participants