Hi,
Thanks for your great dataset which definitely speeds up scientific research!
As a fan and user of your dataset, I'm really curious: how do you deal with copyright issues?
Do you have the right to distribute the submitted articles?
As a user of the dataset, do I have the right to redistribute it? For example, if I apply an additional processing step designed for some research task on top of your dataset, could I distribute the result to other people?
Thanks!
Hi @shizhediao, we already discussed this over email; just copying my response here for others:
Copyright is pretty tricky! We consulted with a lawyer about this for a long time, and ultimately decided that releasing this under CC BY-NC 2.0 (https://github.com/allenai/s2orc/blob/master/README.md#license) is safe. There are a variety of factors in our favor here. We're only releasing full-text data that's derived from open-access papers. We're only allowing S2ORC for non-commercial use. And the S2ORC text isn't really usable for direct consumption of the papers (i.e. reading the paper like a PDF) and doesn't contain a lot of the content necessary to read the paper (e.g. visual layout, figures, etc.), so we can likely argue that this falls under fair use for research.
Please take a look at the license, which should explain what you can and can't do with S2ORC and its derivations with respect to redistribution. In short, yes: what we're hoping is that researchers will use S2ORC as a "meta" corpus to derive further task-specific NLP datasets that they can distribute.