How to allocate GPU with E2E method #86

Open
ghost opened this issue Mar 4, 2020 · 6 comments

ghost commented Mar 4, 2020

Hi, I'm trying to run the E2E method on a GPU, but I noticed that the code requires TensorFlow 1.0.0. How can I run it on a GPU? I have already set the environment variable GPU=0, but it seems there is no tensorflow-gpu installed for it to use.

Damcy commented Apr 21, 2020

@Byron309 I can run the code with tensorflow-gpu 1.14. Before running the train script, I set export GPU=0 and export CUDA_VISIBLE_DEVICES=0.
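
A rough sketch of that setup, for anyone following along. The two exports come from the comment above; the TensorFlow check after them is an addition for illustration only, and it is skipped when python or TensorFlow is not available. tf.test.is_gpu_available() is the TF 1.x call used for the check.

```shell
# Expose only the first GPU to CUDA and set GPU=0, as described above
# (assumption: the training code reads the GPU variable itself).
export GPU=0
export CUDA_VISIBLE_DEVICES=0

# Optional sanity check (not part of the original comment): ask TensorFlow
# whether it can see a GPU. Skipped if python or tensorflow is unavailable.
if command -v python >/dev/null 2>&1 && python -c "import tensorflow" >/dev/null 2>&1; then
  python -c "import tensorflow as tf; print('GPU available: %s' % tf.test.is_gpu_available())" || true
else
  echo "TensorFlow is not importable here; skipping the GPU check."
fi
```

The train script would then be launched from the same shell so that it inherits both variables.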

ghost commented May 11, 2020

Hi, @Damcy
Thank you for your reply.

I tried tensorflow-gpu==1.14 and ran setup_all.sh, but I get the error "tensorflow.python.framework.errors_impl.NotFoundError: ./coref_kernels.so: undefined symbol: _ZTIN10tensorflow8OpKernelE". Have you run into this problem?

I'm trying the E2E method, not the high-order method.
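
One way to read an undefined-symbol error like this is to demangle the name. A small sketch, assuming c++filt (part of GNU binutils) is installed; the fallback text is only illustrative.

```shell
# The mangled name from the error message above.
symbol="_ZTIN10tensorflow8OpKernelE"

# c++filt turns a mangled C++ name back into a readable one.
if command -v c++filt >/dev/null 2>&1; then
  demangled="$(c++filt "$symbol")"
else
  demangled="(c++filt not installed)"
fi
echo "$symbol -> $demangled"
```

The name refers to the type information for tensorflow::OpKernel, which suggests that coref_kernels.so was built in a way that does not match the TensorFlow library loaded at run time (for example, different compile or link flags).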

Damcy commented May 11, 2020

The command in setup_all.sh may be out of date.
You can change
g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2 -D_GLIBCXX_USE_CXX11_ABI=0
to
g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2
if your gcc version is higher than 4.8.3 (I guess).
It should then generate a correct .so file.
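
To compare the two variants side by side, here is a minimal sketch that only prints the commands instead of running them. The helper name compile_cmd and the flag values are hypothetical placeholders; TF_CFLAGS and TF_LFLAGS are treated as plain strings for portability, whereas the original line expands them as bash arrays.

```shell
# Hypothetical helper: print the compile command from this thread.
# $1 = "keep" to include -D_GLIBCXX_USE_CXX11_ABI=0, anything else drops it.
compile_cmd() {
  cmd="g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC"
  cmd="$cmd $TF_CFLAGS $TF_LFLAGS -O2"
  if [ "$1" = "keep" ]; then
    cmd="$cmd -D_GLIBCXX_USE_CXX11_ABI=0"
  fi
  echo "$cmd"
}

# Placeholder flags for illustration only; real values come from TensorFlow.
TF_CFLAGS="-I/path/to/tensorflow/include"
TF_LFLAGS="-L/path/to/tensorflow -ltensorflow_framework"

compile_cmd keep   # the line as quoted from setup_all.sh
compile_cmd drop   # the suggested replacement
```

Whether the define is needed depends on how the installed TensorFlow binary was built, so trying both variants is a reasonable first step.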

ghost commented May 11, 2020

Hi, @Damcy
Thanks for your help, but I couldn't find the command g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2 -D_GLIBCXX_USE_CXX11_ABI=0 in setup_all.sh.

I also tried the command g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2, but got another error:

coref_kernels.cc:4:10: fatal error: tensorflow/core/framework/op.h: No such file or directory
 #include "tensorflow/core/framework/op.h"
          ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.

Have you seen this error before?

Damcy commented May 11, 2020

g++ -std=c++11 -shared coref_kernels.cc -o coref_kernels.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2 -D_GLIBCXX_USE_CXX11_ABI=0 is line 13 of setup_all.sh.
I haven't seen this error before. I think you can adapt the setup_all.sh from the high-order repo.
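
The missing-header error above is consistent with running the g++ line in a shell where TF_CFLAGS is empty, since the include path for tensorflow/core/framework/op.h is passed through that variable. A hedged sketch of filling both variables first, using tf.sysconfig.get_compile_flags() and tf.sysconfig.get_link_flags(); whether either setup_all.sh does exactly this is an assumption, and flags_status is a name made up for this example.

```shell
# Fill the flag variables from the installed TensorFlow, if there is one.
TF_CFLAGS=""
TF_LFLAGS=""
if command -v python >/dev/null 2>&1 && python -c "import tensorflow" >/dev/null 2>&1; then
  TF_CFLAGS="$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))')"
  TF_LFLAGS="$(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))')"
fi

# An empty TF_CFLAGS means g++ receives no -I path and cannot find op.h.
if [ -z "$TF_CFLAGS" ]; then
  flags_status="missing"
  echo "TF_CFLAGS is empty: the compile would fail on the TensorFlow headers."
else
  flags_status="ok"
  echo "TF_CFLAGS: $TF_CFLAGS"
  echo "TF_LFLAGS: $TF_LFLAGS"
fi
```

If the status is "ok", the g++ command can be run in the same shell so that it picks up both variables.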

ghost commented May 11, 2020

It works! Thank you!

Somehow the setup_all.sh files we were talking about are different. I tried the file you shared and it works.
