How to use GPU? #1

Open
kinchi22 opened this issue Apr 12, 2016 · 2 comments

Comments

@kinchi22

I modified your 'test-word2veckeras.py' script to train word2vec on a GPU.
But nvidia-smi showed 0% GPU utilization for the whole training run.
Here is my training code.

import sys
import numpy as np
import gensim
from word2veckeras.word2veckeras import Word2VecKeras

input_file = 'test.txt'
sents=gensim.models.word2vec.LineSentence(input_file)

v_size=200
window=8
sg_v=1
topn=4

word2vec = Word2VecKeras(sents,hs=1,negative=0,sg=sg_v,size=v_size,window=window,iter=15)
print(word2vec.most_similar('the', topn=topn))
@niitsuma
Owner

'test.txt' is a very small dataset. Please try a larger one.
Also, please check that ~/.theanorc is configured correctly.
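
For reference, here is a minimal sketch of one way to see what ~/.theanorc currently sets. It assumes the usual INI layout with a [global] section holding keys such as device and floatX. The helper name read_theanorc_globals is hypothetical and is not part of Theano or word2veckeras.

```python
import configparser
import os


def read_theanorc_globals(path="~/.theanorc"):
    """Return the [global] section of a Theano config file as a dict.

    Hypothetical helper for illustration; not a Theano API.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case, e.g. floatX
    parser.read(os.path.expanduser(path))  # a missing file is silently skipped
    if not parser.has_section("global"):
        return {}
    return dict(parser.items("global"))


if __name__ == "__main__":
    settings = read_theanorc_globals()
    print("device:", settings.get("device", "<not set>"))
    print("floatX:", settings.get("floatX", "<not set>"))
```

If device is reported as not set, Theano is probably falling back to the CPU unless THEANO_FLAGS overrides it on the command line.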

@kinchi22
Author

Thank you for your answer. I didn't know I had to set the device, because I'm not a Theano user.
I used the command below to train on the GPU.

THEANO_FLAGS=mode=FAST_RUN,device=gpu,floatX=float32 python test-word2veckeras.py
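
As a quick sanity check before a long run, the flags in the command above can be inspected from Python. This is only a sketch, assuming the comma-separated key=value form shown in that command. parse_theano_flags is a hypothetical helper, not a Theano function.

```python
import os


def parse_theano_flags(flags):
    """Split a THEANO_FLAGS-style string into a dict.

    Hypothetical helper for illustration; not a Theano API.
    """
    settings = {}
    for item in flags.split(","):
        item = item.strip()
        if not item:
            continue  # ignore empty entries such as a trailing comma
        key, _, value = item.partition("=")
        settings[key] = value
    return settings


if __name__ == "__main__":
    flags = parse_theano_flags(os.environ.get("THEANO_FLAGS", ""))
    print("device:", flags.get("device", "<not set in THEANO_FLAGS>"))
    print("floatX:", flags.get("floatX", "<not set in THEANO_FLAGS>"))
```

Run with the same THEANO_FLAGS prefix as the training command, it prints the device and floatX values the training process would see in its environment.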

But it is still too slow compared with Google's C version, and the result was also a little strange.
The training data has 100k sentences (about 19 words per sentence on average). I used the CBOW model with 15 iterations, a window size of 8, and a vector size of 200. My machine has an i7-2600 and a Titan X.
Here is the result of my test.

Using gpu device 0: GeForce GTX TITAN X (CNMeM is disabled, CuDNN 4004)
Using Theano backend.
train_batch_cbow
train_batch_cbow
Elapsed time 1540.953537 seconds
[(u'against', 0.9976166486740112), (u'other', 0.9975610375404358), (u'another', 0.9974160194396973), (u'most', 0.9969826936721802)]

The last line shows the words most similar to 'China'.
Google's C version took only 60 seconds, and its most similar words were United States, export market, Chinese.

Do you think something is wrong with my test? Or do you have any measured benchmarks for reference?
