openCL branch of caffe reports much higher speeds #12

Open
Motherboard opened this issue Oct 7, 2016 · 3 comments

Comments

@Motherboard

On OpenCL-caffe, there are performance metrics claiming speeds of about 4 ms per image for training AlexNet on a Radeon R290X. Considering this GPU is much weaker than a GTX 1080, these figures seem very weird compared with the 20 ms in your tests.

What's your take on this?

@jcjohnson
Owner

My speeds are forward / backward times for an entire minibatch of 16 images; the OpenCL-caffe numbers are divided by the minibatch size to give a per-image time. You need to divide my times by 16 to be comparable to theirs, at which point the GTX 1080 is significantly faster than their R290X times.
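
To make the normalization concrete, here is a minimal sketch of the arithmetic, using the roughly 20 ms (per minibatch of 16) and roughly 4 ms (per image) figures quoted in this thread. The function name `per_image_ms` is purely illustrative and is not part of either benchmark.

```python
def per_image_ms(minibatch_ms, batch_size):
    """Convert a whole-minibatch time into a per-image time."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return minibatch_ms / batch_size

# Figures quoted in this thread (approximate):
# ~20 ms per minibatch of 16 images here, ~4 ms per image for OpenCL-caffe.
gtx1080_per_image = per_image_ms(20.0, 16)
opencl_caffe_per_image = 4.0  # already reported per image

print("GTX 1080 per image (ms):", gtx1080_per_image)
print("OpenCL-caffe per image (ms):", opencl_caffe_per_image)
```

Once both numbers are on the same per-image basis, they can be compared directly.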

Another subtle issue is that they use a minibatch size of 128, while I used a minibatch size of 16 for a fair comparison across all models. Since AlexNet is a small model and GPUs are massively parallel, I'd expect the per-image time to decrease as the batch size increases, which gives their benchmark a slight advantage.
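
As a rough illustration of why a larger batch tends to help, here is a toy cost model. It assumes each minibatch pays a fixed overhead (kernel launches, synchronization, and so on) plus a cost proportional to the number of images. The constants are made up for illustration and are not measurements; real GPU behavior is more complicated than this.

```python
def toy_per_image_ms(batch_size, fixed_overhead_ms=2.0, per_image_cost_ms=1.0):
    """Per-image time under a toy model: fixed overhead + linear per-image cost.

    Both constants are hypothetical placeholders, not benchmark results.
    """
    total_ms = fixed_overhead_ms + per_image_cost_ms * batch_size
    return total_ms / batch_size

# The fixed overhead is spread over more images as the batch grows,
# so the per-image time shrinks toward per_image_cost_ms.
for n in (16, 128):
    print(n, toy_per_image_ms(n))
```

Under this model, the batch-128 run always reports a lower per-image time than the batch-16 run, even on identical hardware.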

@gbrand-salesforce

Thanks, this clears it up :) On an unrelated note, I'd really love to see benchmarks of SqueezeNet 1.1, which should be much faster than all of these networks.

@shashikale

Hello,
First of all, this is great work; I am now able to understand the Torch framework and the benchmarks to some extent.
I'm trying to see the difference between running on the GPU and on the CPU.
I built Torch, and my desktop has an Intel i7 with a Pascal Titan X.
I was able to run on the GPU, but when I try to run on the CPU with the command
python run_cnn_benchmarks.py --gpus -1 --models Torch_Ref/distro/cnn_bench/cnn-benchmarks/models/alexnet/alexnet.t7 --batch_sizes 1 --use_cudnn 1

I'm getting the following error:
"re/lua/5.2/torch/File.lua:343: unknown Torch class <cudnn.SpatialConvolution>"

Am I missing something when running on the CPU only?
