Training/Building grammar for a language different than English #5

Open
kk00ss opened this issue Mar 25, 2015 · 5 comments
Comments

kk00ss commented Mar 25, 2015

Considering that:
"We have provided the cascade of grammars used in the Berkeley Parser for English."
Is there a way to obtain grammars for other languages for which a grammar has already been created?
I've downloaded the Berkeley Parser grammars. Is there a list of grammars available for Puck?
Thanks

Owner

dlwh commented Mar 29, 2015

There's only English right now.

JimSEOW commented Aug 24, 2016

Can we use, for example, the German grammar (ger_sm5.gr) provided by the Berkeley Parser with Puck, after converting it to text format?

It seems to work, except that these files are missing:

num.binary
num.unary
numstates
unary

Could you provide instructions on how to create these files from the converted text files?

Owner

dlwh commented Aug 24, 2016

I think it won't work all that well on unknown/rare words because of the way I did the lexicon, but otherwise it should. If it basically works, I can help with getting the lexicon patched in.

JimSEOW commented Aug 24, 2016

Hi David,
I have compared the text files extracted from the Berkeley grammar with Puck's files:
(a) ger_sm5.grammar -> same format as Puck's wsj_2.gr.binary
(b) ger_sm5.lexicon -> same format as Puck's wsj_2.gr.lexicon
(c) ger_sm5.splits -> same format as Puck's wsj_2.gr.hierarchy
(d) ger_sm5.words -> same format as Puck's wsj_2.gr.words

If you have time, please consider creating the missing files:
num.binary
num.unary
numstates
unary

If this process works, it will open new possibilities for the Berkeley Parser community through GPU parsing, which you pioneered.
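
For anyone retracing this, the file correspondence above can be written down as a small helper that reports which converted files are present and which Puck-side files are still outstanding. This is only a sketch under assumptions: the function name `plan_conversion`, the `ger_sm5.gr` target prefix, and the idea that the four missing files sit under the same prefix as `wsj_2.gr.binary` are all hypothetical, and the script does not generate the missing files or know their format.

```python
from pathlib import Path

# Berkeley text-export suffix -> Puck suffix, as reported in the comparison above.
SAME_FORMAT = {
    "grammar": "binary",
    "lexicon": "lexicon",
    "splits": "hierarchy",
    "words": "words",
}

# Files this thread reports as having no converted counterpart yet.
# ASSUMPTION: they are named "<puck_prefix>.<suffix>" like the other Puck files.
NOT_YET_COVERED = ["num.binary", "num.unary", "numstates", "unary"]


def plan_conversion(directory, berkeley_prefix="ger_sm5", puck_prefix="ger_sm5.gr"):
    """Describe which renames are possible and which Puck files are still needed.

    Nothing is modified on disk; the result is a plain dict for inspection.
    """
    directory = Path(directory)
    renames, absent_sources = [], []
    for src_suffix, dst_suffix in SAME_FORMAT.items():
        src = directory / f"{berkeley_prefix}.{src_suffix}"
        dst = directory / f"{puck_prefix}.{dst_suffix}"
        pair = (src.name, dst.name)
        # A rename is only planned when the converted source file exists.
        (renames if src.exists() else absent_sources).append(pair)
    still_needed = [
        f"{puck_prefix}.{suffix}"
        for suffix in NOT_YET_COVERED
        if not (directory / f"{puck_prefix}.{suffix}").exists()
    ]
    return {
        "renames": renames,
        "absent_sources": absent_sources,
        "still_needed": still_needed,
    }
```

Calling `plan_conversion("path/to/converted")` on the directory holding the text export would list the four outstanding files under `still_needed` until someone works out how to produce them.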

@philippzentner

Did anything happen here? Does it work for German now?
