Problems of generating Corpus file #23
Comments
What language are you trying? |
Hello, We are trying English Wikipedia. Thank you very much |
Hello, We ran the commands in prepare.sh manually and got the corpus file successfully. We are currently training a model using the corpus file. The message we got from the command is: ... and the program has stayed there for several hours, with CPU usage at full. We are wondering whether the program is running correctly and whether we should wait until we get the results. Thank you very much |
ZH, it depends on the corpus size, the number of dimensions, and the method (skipgram, cbow). Be aware that if you installed gensim manually, it might not be using all the cores. The first stage of word2vec (gathering the vocabulary) only uses a single core, though; the batches of matrix factorization are done in parallel using as many cores as possible. |
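To check the point about cores, here is a minimal sketch, not part of wiki2vec itself. It assumes that gensim exposes a `FAST_VERSION` value in `gensim.models.word2vec` (a negative value meaning the compiled routines are missing); the helper name `describe_parallelism` is made up for illustration.

```python
import multiprocessing


def describe_parallelism():
    """Report the core count and whether gensim's compiled word2vec code is present."""
    cores = multiprocessing.cpu_count()
    try:
        # Assumption: gensim is installed and exposes FAST_VERSION here.
        from gensim.models import word2vec
        fast_version = getattr(word2vec, "FAST_VERSION", None)
    except Exception:
        # gensim missing, or its compiled extension failed to import.
        fast_version = None
    compiled = fast_version is not None and fast_version >= 0
    return {"cores": cores, "fast_version": fast_version, "compiled": compiled}


if __name__ == "__main__":
    print(describe_parallelism())
```

If `compiled` comes back `False`, training is probably running on the slow path, and reinstalling gensim with a working C compiler may help.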
Hello, We used the command "wiki2vec.sh corpus output/model.w2c 50 500 10" to generate the model file. After the program ran for 20 hours, we got the error message "IOError: [Errno 2] No such file or directory: '/home/_/_/wiki2vec/wiki2vec-master/results/model.w2c.syn1neg.npy'". Could you please give us some suggestions on how to solve the problem? Thank you very much. |
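The thread does not say how this error was fixed. One possible cause (an assumption, not confirmed here) is that the directory in the error path did not exist when the model was saved. A minimal sketch of a check to run before a long training job follows; `ensure_parent_dir` is a hypothetical helper, not part of wiki2vec or gensim.

```python
import os


def ensure_parent_dir(output_path):
    """Create the directory that will hold output_path if it is missing.

    Returns the absolute path of that directory.
    """
    parent = os.path.dirname(os.path.abspath(output_path))
    os.makedirs(parent, exist_ok=True)  # no error if it already exists
    return parent


if __name__ == "__main__":
    # Hypothetical output location, used only for illustration.
    print(ensure_parent_dir("results/model.w2c"))
```

Running this before the training command costs nothing and avoids losing hours of work to a missing directory, if that is indeed the cause.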
Hi, @zhq2009 was this issue ever resolved? |
Hello,
Yes, the problem was solved.
Thank you very much.
On Sat, Dec 31, 2016 at 4:14 AM, Rishab Gargeya wrote:
Hi, @zhq2009 was this issue ever resolved?
|
Hi, I'm having the same problem when I try to generate the Corpus file: the file keeps coming up empty. I'm running the following command: "sudo sh prepare.sh en_US ~/data". Do you know why this might be? Thank you! |
Hi, I am also facing the same issue. When I ran a snippet starting with "from gensim.models import Word2Vec", the output ended at the line "array.shape = shape". Next, when I run "sudo sh prepare.sh en_US ~/data", the corpus file is empty. |
Hello,
We are using prepare.sh to generate the Corpus file, but the Corpus file we generate is empty. Could you please give us some suggestions on how to solve the problem?
Thank you very much
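Before starting a long training run, it may be worth confirming that the corpus file actually has content. The following is a minimal sketch using only the Python standard library; `corpus_summary` is a hypothetical helper, and the file name in the usage line is an assumption.

```python
import os


def corpus_summary(path, sample_lines=3):
    """Return the file size in bytes and the first few non-empty lines."""
    size = os.path.getsize(path)
    sample = []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            if line.strip():
                sample.append(line.rstrip("\n"))
            if len(sample) >= sample_lines:
                break
    return size, sample


if __name__ == "__main__":
    # "corpus" is an assumed file name; use whatever prepare.sh wrote.
    size, sample = corpus_summary("corpus")
    print("bytes:", size)
    for line in sample:
        print(line[:80])
```

A size of 0 means that prepare.sh failed somewhere upstream, so its console output is the next place to look.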