Is the twentybn data really 773GB? #1

Open
ana-GT opened this issue Oct 18, 2023 · 2 comments

Comments

ana-GT commented Oct 18, 2023

Hi, I wanted to try this code after reading the Grounding Predicates to Actions paper. I was surprised to see that the 20BN data on Google Drive is ~773GB. The 20BN data on its original page amounts to < 30GB, so I am curious what additional data is in the Drive link you kindly mention in the package.

My hard drive does not have enough space for all this data 😅, so I am asking just to check whether all of it is really necessary. Thanks!

Ana

@Bailey-24

Yes, could you please release the checkpoint?

ana-GT commented Dec 7, 2023

Hi again. Just a comment: I tried several times to download the dataset, but its sheer size makes it impossible. The download eventually fails with a 403 error; I have gotten as far as ~750 GB only for the process to fail. If possible, releasing a subset of the full data (or splitting it into smaller files) would be very much appreciated.

Thanks a lot Toki! And thanks for making your work publicly available.
