Issues: DavidNemeskey/cc_corpus
Issues list
Add language filter to filter_warc.py
Label: enhancement (New feature or request)
#26, opened Oct 14, 2022 by DavidNemeskey
Conversion to fastText format
Label: enhancement (New feature or request)
#20, opened Apr 21, 2021 by DavidNemeskey
Same-document paragraph removal too eager
Label: bug (Something isn't working)
#14, opened Apr 16, 2020 by DavidNemeskey
Corpus scripts to work on the CoNLL-U+ format
Label: enhancement (New feature or request)
#12, opened Mar 31, 2020 by DavidNemeskey
Get rid of bootstrapping
Label: enhancement (New feature or request)
#11, opened Mar 31, 2020 by DavidNemeskey