Train, dev, test split for the data #3

Open
BillyZhang24kobe opened this issue Oct 13, 2023 · 1 comment

Comments

@BillyZhang24kobe

Hello,

In the paper for this dataset, you mention that the data is divided into train, dev, and test splits. Is there any documentation on exactly how these splits are defined? I downloaded the dataset from the official website (https://geodiverse-data-collection.cs.princeton.edu/), but the only metadata file seems to be 'index.csv', which does not specify the train/val/test split. Any pointers are welcome! Thanks!

@hassony2

hassony2 commented Nov 22, 2023

Hi @BillyZhang24kobe,

I am also looking into this :)
It looks like the splits are defined in load_data.
If my understanding is correct, to report numbers comparable to Table 6, we need to use prep_geode_38 to generate the per-region files, passing 'index.csv' in place of the metadata file and using its 'object' and 'file_path' columns instead of the 'script_name' and 'file_name' fields.
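
To make the column substitution concrete, here is a minimal sketch of the remapping step only. It does not reproduce the train/val/test logic from the repository, and several things in it are assumptions: the 'region' column name, the helper names (`adapt_rows`, `group_by_region`, `load_index`), and the tiny inline sample that stands in for 'index.csv'.

```python
import csv
import io
from collections import defaultdict

# Assumed mapping: index.csv column -> field name the prep script expects.
FIELD_MAP = {"object": "script_name", "file_path": "file_name"}


def load_index(fileobj):
    """Read a CSV file object into a list of dicts, one per row."""
    return list(csv.DictReader(fileobj))


def adapt_rows(rows, field_map=FIELD_MAP):
    """Rename keys according to field_map; all other keys pass through unchanged."""
    return [{field_map.get(key, key): value for key, value in row.items()}
            for row in rows]


def group_by_region(rows, region_key="region"):
    """Group rows by region. The 'region' column name is an assumption."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[region_key]].append(row)
    return dict(groups)


# Made-up rows standing in for index.csv; not real dataset entries.
SAMPLE = """file_path,object,region
Africa/bag/1.jpg,bag,Africa
Europe/bag/2.jpg,bag,Europe
Africa/chair/3.jpg,chair,Africa
"""

rows = adapt_rows(load_index(io.StringIO(SAMPLE)))
per_region = group_by_region(rows)

for region, items in sorted(per_region.items()):
    print(region, len(items))
```

With the real file, `io.StringIO(SAMPLE)` would be replaced by `open("index.csv", newline="")`. The actual split would still have to come from the code in load_data and prep_geode_38, which is why the original pickle files would be the safer reference.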

@vramaswamy94, thank you for contributing such a nice dataset :)
Would you be able to confirm whether my understanding is correct? It would be great if you could provide the generated region-specific pickle files, to avoid any risk of using a train/val/test partition that differs from the one in your paper. Would you be able to share them?

Have a great day!
