fix incorrect vocabs when loading from utf-8 binary #38

thangntt2 · 2016-04-18T10:20:38Z

Hi. I'm very grateful for your work porting Word2vec to Java; I think it's a great job. However, I ran into trouble when parsing a UTF-8 binary file (trained on a Vietnamese corpus): the vocabulary was loaded incorrectly. I fixed it with a few ugly lines of code :D. Please review it and turn it into clean code.
P.S. If you want a UTF-8 binary example, email me at [email protected], and sorry for my English :D
I look forward to hearing from you.
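
For readers without the diff at hand, here is a minimal sketch of the kind of fix described above. It is not the code from this pull request. It assumes the usual word2vec binary layout, in which each vocabulary entry is stored as raw bytes terminated by a space, and it assumes the bad vocabulary came from converting each byte to a `char` on its own, which splits multi-byte UTF-8 characters such as those in Vietnamese. The class name `Utf8VocabReader` and the method `readWord` are hypothetical and do not come from the library.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the library's API.
public class Utf8VocabReader {

    /**
     * Reads one vocabulary entry from the stream.
     * Bytes are collected until the next space or newline and then decoded
     * together as UTF-8, so multi-byte characters stay intact.
     * Returns an empty string if the stream ends before any word byte is read.
     */
    public static String readWord(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        int b;
        while ((b = in.read()) != -1) {
            if (b == ' ' || b == '\n') {
                if (buffer.size() == 0) {
                    continue; // skip separators that precede the word
                }
                break; // separator after the word: the entry is complete
            }
            buffer.write(b);
        }
        // Decode the whole byte sequence at once instead of byte by byte.
        return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

In a real loader, a call to `readWord` would be followed by reading exactly the vector's bytes for that entry before the next call, since float data can itself contain the byte values for space or newline.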

woidda · 2016-04-28T12:36:09Z

This should also be fixed in #34.

fix incorrect vocabs when loading from utf-8 binary

1118c77
