Just released a Ruby module that builds an index of a binary word2vec vector
file, so your code can seek directly to the right position in the file for a
given word or term. For example, the word "/en/italy" in the English
"freebase-vectors-skipgram1000-en.bin" file is at byte position 116414.
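To illustrate the indexing idea, here is a minimal sketch, not the code from the linked module. It assumes the usual binary word2vec layout: a text header "<vocab_size> <dim>", then for each entry the word, a space, and dim little-endian float32 values, optionally followed by a newline. The names build_word2vec_index and read_vector are hypothetical, and the stored offset points at the first float of the vector, which may differ from how the module counts positions.

```ruby
require "stringio"

# Assumed layout: "<vocab_size> <dim>\n", then for each entry
# "<word> " + dim little-endian float32 values + an optional "\n".
def build_word2vec_index(io)
  vocab_size, dim = io.gets.split.map(&:to_i)
  index = {}
  vocab_size.times do
    # Read up to and including the space that ends the word; strip drops
    # that space and any newline left over from the previous entry.
    word = io.gets(" ").strip.force_encoding(Encoding::UTF_8)
    index[word] = io.pos            # offset of the first float of this vector
    io.seek(dim * 4, IO::SEEK_CUR)  # skip the vector without decoding it
  end
  [dim, index]
end

# Seek straight to one word's vector and decode it ("e" = float32, little-endian).
def read_vector(io, dim, index, word)
  io.seek(index.fetch(word))
  io.read(dim * 4).unpack("e*")
end

# Tiny in-memory "file" with two 3-dimensional vectors (made-up values).
data = "2 3\n".b
data << "/en/italy " << [1.0, 2.0, 3.0].pack("e*") << "\n"
data << "/en/france " << [0.5, -1.0, 4.0].pack("e*") << "\n"
io = StringIO.new(data)
dim, index = build_word2vec_index(io)
p read_vector(io, dim, index, "/en/france")
```

For a real file you would open it in binary mode, e.g. File.open(path, "rb"), build the index once, and save it so later lookups need only one seek and one read.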
The module also computes a locality-sensitive hash (LSH) for each vector in a
binary word2vec file, so you can do an approximate nearest-neighbor search (by
cosine distance) much faster. I get a couple of orders of magnitude better
performance on my machine with a 10-bit random projection LSH.
https://github.com/someben/treebank/blob/master/src/build_word2vec_index.rb
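For readers unfamiliar with random projection LSH, here is a rough sketch of the general technique, not the linked module's implementation. Each of the 10 bits records which side of a random hyperplane the vector falls on, so vectors with a small cosine distance tend to share a signature. The class RandomProjectionLSH and the helpers cosine_similarity, build_buckets and nearest are hypothetical names, and the seed is arbitrary.

```ruby
# Random-hyperplane LSH: one bit per hyperplane, set when the vector
# lies on the non-negative side (dot product >= 0).
class RandomProjectionLSH
  def initialize(dim, bits: 10, seed: 42)
    rng = Random.new(seed)
    @planes = Array.new(bits) { Array.new(dim) { gaussian(rng) } }
  end

  # Returns an integer in 0...(2**bits).
  def signature(vector)
    @planes.each_with_index.reduce(0) do |sig, (plane, i)|
      dot = plane.zip(vector).sum { |p, v| p * v }
      dot >= 0 ? sig | (1 << i) : sig
    end
  end

  private

  # Box-Muller transform: a standard normal sample from two uniform samples.
  def gaussian(rng)
    Math.sqrt(-2 * Math.log(1 - rng.rand)) * Math.cos(2 * Math::PI * rng.rand)
  end
end

def cosine_similarity(a, b)
  dot = a.zip(b).sum { |x, y| x * y }
  dot / (Math.sqrt(a.sum { |x| x * x }) * Math.sqrt(b.sum { |x| x * x }))
end

# vectors: Hash of word => Array of floats. Groups words by signature.
def build_buckets(lsh, vectors)
  vectors.each_with_object(Hash.new { |h, k| h[k] = [] }) do |(word, vec), buckets|
    buckets[lsh.signature(vec)] << word
  end
end

# Compare the query only against words in its own bucket.
def nearest(lsh, buckets, vectors, query, k = 5)
  candidates = buckets.fetch(lsh.signature(query), [])
  candidates.max_by(k) { |w| cosine_similarity(vectors[w], query) }
end
```

With 10 bits there are at most 1024 buckets, so a query is compared against roughly 1/1024 of the vocabulary if the vectors spread evenly. The trade-off is that a true neighbor sitting in an adjacent bucket is missed unless you also probe signatures that differ by a bit or two.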
Thanks for the project, Tomas.
Best,
Ben
Original issue reported on code.google.com by [email protected] on 23 Sep 2013 at 3:48
Sounds cool, thanks for sharing your code! By the way, there is a discussion
forum related to word2vec that might be more suitable for this type of post:
https://groups.google.com/forum/#!forum/word2vec-toolkit
It might be easier for the others to find your post there.
Best,
Tomas