Correct way for data split #4
Comments
@hsujh073 I believe episodes 21-22 should be the development set and episodes 23+ the test set, as written in the paper. @FrankLicm could you please verify this and fix any typos? Thanks.
Hi, sorry, I think I made a mistake when naming the generated split files, so I have actually forgotten which set I used to get the results in the paper. The split I originally proposed is indeed episodes 21-22 as the development set and episodes 23+ as the test set. Also, when uploading, I generated this data split from the full version 1.0 data to keep it consistent with version 1.0. Because of an issue with my previous laptop, I lost the original split files from my experiments, for which I had deleted some invalid questions, and the development environment is lost as well, so I am afraid I cannot do any further work on this repo. The only typo here, I think, is that the names of the dev and test files should be exchanged for both versions 1.0 and 2.0. Thanks.
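For anyone who wants to check their local files against the split described above, here is a minimal sketch in Python. It assumes that each scene carries an ID such as `s01_e21_c03` from which the episode number can be parsed; that ID format, the helper names, and the treatment of episodes 1-20 as the training set are assumptions for illustration and are not confirmed in this thread.

```python
import re

# Assumption: scene IDs look like "s01_e21_c03" (season, episode, scene).
# Adjust the pattern if the IDs in your JSON files are formatted differently.
EPISODE_RE = re.compile(r"_e(\d+)")


def split_for_episode(episode: int) -> str:
    """Map an episode number to its split as described in this thread.

    Episodes 21-22 are the development set and 23+ are the test set.
    Treating episodes 1-20 as training data is an assumption.
    """
    if episode < 1:
        raise ValueError(f"invalid episode number: {episode}")
    if episode <= 20:
        return "train"
    if episode <= 22:
        return "dev"
    return "test"


def split_for_id(scene_id: str) -> str:
    """Parse the episode number out of a scene ID and return its split."""
    match = EPISODE_RE.search(scene_id)
    if match is None:
        raise ValueError(f"no episode number found in {scene_id!r}")
    return split_for_episode(int(match.group(1)))


def mismatched_ids(scene_ids, expected_split):
    """Return the IDs whose episode does not belong to expected_split."""
    return [sid for sid in scene_ids if split_for_id(sid) != expected_split]
```

If `mismatched_ids` returns every ID when run over the scene IDs of the file named as the dev set with `expected_split="dev"`, that file most likely holds the test set and the two file names need to be exchanged.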
OK. Thank you.
@FrankLicm you still have access to this repo, so please fix the names when you have time. Thanks!
Hi.
I want to run some code on the FriendsQA dataset and found that in the JSON file episodes 21-22 are the test set and episodes 23 and later are the development set, which differs from what is written in the README.
So which is the correct way to split the dataset? Thanks.