Get table content #27

Open
Sunnycheey opened this issue Oct 9, 2020 · 1 comment

Comments

@Sunnycheey

I want to know why you removed the table content during processing, since table content is structured and important in many situations.

@kyleclo
Collaborator

kyleclo commented Nov 13, 2020

Hey @Sunnycheey, we decided that the quality of the tables was too low for practical usage, so we did not include them in the release. We've since been working on improving table extraction so that we might include tables in future S2ORC releases. If you're looking for an S2ORC-like dataset that includes higher-quality tables, you can check out https://github.com/allenai/cord19, in which we used IBM Research's table parsing software.


2 participants