Input type #8

Open
xsway opened this issue Nov 2, 2017 · 2 comments

Comments

xsway commented Nov 2, 2017

Hi!

I was considering using your parser to parse some wiki corpora. A quick question: what type of input do the pre-trained models expect? Is it possible to give your parser raw text and have the whole pipeline (tokenization, tagging, parsing) run, or do you require pre-processed CoNLL-style input with POS tags?

Thanks!

msklvsk commented Nov 18, 2017

You have to pre-split the text into sentences and tokens with UDPipe. That's what Stanford did for this parser:

[Screenshot, 2017-11-18]

That is, the input should be CoNLL-U-formatted. This parser/tagger will fill the corresponding columns in CoNLL-U.
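
To make the expected shape concrete, here is a minimal sketch of a CoNLL-U skeleton built from text that has already been split into sentences and tokens. It is illustrative only: the helper `to_conllu` and the hand-tokenized sample are hypothetical, the splitting would normally come from UDPipe, and whether this parser accepts `_` in every unfilled column is an assumption rather than something confirmed in this thread.

```python
# Illustrative sketch: the tokenization here is supplied by hand and stands in
# for UDPipe output. `to_conllu` is a hypothetical helper, not part of the parser.

CONLLU_COLUMNS = ("ID", "FORM", "LEMMA", "UPOS", "XPOS",
                  "FEATS", "HEAD", "DEPREL", "DEPS", "MISC")


def to_conllu(sentences):
    """Render tokenized sentences as CoNLL-U with unfilled annotation columns."""
    blocks = []
    for tokens in sentences:
        lines = []
        for index, form in enumerate(tokens, start=1):
            # ID and FORM are known; the other eight columns get "_" placeholders
            # (assumption: the tagger/parser fills them in later).
            fields = [str(index), form] + ["_"] * (len(CONLLU_COLUMNS) - 2)
            lines.append("\t".join(fields))
        blocks.append("\n".join(lines))
    # CoNLL-U terminates every sentence, including the last, with a blank line.
    return "\n\n".join(blocks) + "\n\n"


if __name__ == "__main__":
    print(to_conllu([["Hello", "world", "!"], ["It", "works", "."]]), end="")
```

Each token line has ten tab-separated fields, and sentences are separated by a blank line.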

Vimos commented Mar 5, 2019

This should go in the README.
