Correct way for data split #4

hsujh073 · 2020-08-21T08:13:17Z

Hi.
I want to run some code with the FriendsQA dataset, and I found that in the JSON files episodes 21-22 are the test set and episodes from 23 onward are the development set, which differs from what is written in the README.

So which one is the correct way to split the dataset? Thanks.

jdchoi77 · 2020-08-21T16:22:29Z

@hsujh073 I believe episodes 21-22 should be the development set and 23+ the test set, as written in the paper:
https://www.aclweb.org/anthology/2020.acl-main.505.pdf

@FrankLicm could you please verify this and fix the typos if any? Thanks.

arianakc · 2020-08-22T08:15:23Z

Hi, I am sorry. I think I made a mistake when naming the generated split files, so I have actually forgotten which set I used to get the results in the paper. The split I originally proposed is indeed episodes 21-22 as the development set and 23+ as the test set.

Also, the data split in this repo was generated from the full version 1.0 data at upload time, to keep it consistent with version 1.0. Because of a problem with my previous laptop, I lost the original split files from my experiments (for which I had deleted some invalid questions), and the development environment is lost as well, so I am afraid I cannot do any further work on this repo.

I think the only error here is that the names of the dev and test files should be exchanged, for both versions 1.0 and 2.0. Thanks.
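For anyone who prefers not to rely on the file names, the split described above can be rebuilt from episode numbers. The following is only a minimal sketch: the scene-ID shape (`s01_e21_c03`), the `title` key, and the assumption that episodes up to 20 form the training set are not confirmed in this thread and should be checked against the actual JSON files and the paper.

```python
import re

# Hypothetical ID shape such as "s01_e21_c03"; verify against the real files.
EPISODE_RE = re.compile(r"_e(\d+)")


def split_for_episode(episode: int) -> str:
    """Map an episode number to a split name.

    21-22 -> dev and 23+ -> test follow this thread.
    Treating 1-20 as train is an assumption.
    """
    if episode <= 20:
        return "train"
    if episode <= 22:
        return "dev"
    return "test"


def split_by_episode(items, id_key="title"):
    """Group items (dicts) into train/dev/test by the episode in item[id_key]."""
    splits = {"train": [], "dev": [], "test": []}
    for item in items:
        match = EPISODE_RE.search(item[id_key])
        if match is None:
            raise ValueError(f"no episode number found in {item[id_key]!r}")
        splits[split_for_episode(int(match.group(1)))].append(item)
    return splits
```

Usage would look like `split_by_episode(scenes)`, where `scenes` is the list of scene entries loaded from the full dataset file; the result can then be compared with the shipped dev and test files to confirm which one is which.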

hsujh073 · 2020-08-23T09:15:14Z

OK. Thank you.

jdchoi77 · 2020-08-24T15:38:43Z

@FrankLicm you still have access to this repo, so please fix the file names when you have time. Thanks!

jdchoi77 assigned arianakc Aug 21, 2020
