A list of available pre-trained neural networks, in particular caffe and chainer models. It is similar to the caffe model zoo, but this repository aims to offer framework-free collections of implementations of the latest papers.
Many deep-learning frameworks exist today, such as caffe, theano, chainer, tensorflow, keras and so on. However, pre-trained models are often not compatible with each other (i.e. they cannot be loaded with other frameworks). Tying models to specific frameworks in this way might hinder open improvement across the deep learning community, including students, academia, and industry.
For users who haven't installed caffe, I have made some files in which the weights and biases are saved as pickles. You can reconstruct the networks in your own framework if you know the model structure. See also Loading Caffe model without Caffe.
- Please check the LICENSE of each model before using it.
- This is in progress and we need your help! Thank you.
If you know of a newly available pre-trained model and want to add it, feel free to open Issues and PRs. Also, if you have caffe installed and want to contribute, please see Caffe model weight and bias export for non-caffe users.
- Add more models and implementations
- List the LICENSE for each model
- Scripts for chainer2pkl
Models are sorted by task or competition.
- ResNet
- Original paper "Deep Residual Learning for Image Recognition" by Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- Caffe model and prototxt uploaded by @KaimingHe. MIT LICENSE. For the details see https://github.com/KaimingHe/deep-residual-networks.
- Exported pkl. You can learn the model structure by reading the prototxt above. Only ResNet50 is available for now. MIT LICENSE. Exported by @shiba24
- Fully Convolutional Network
- Original paper "Fully Convolutional Networks for Semantic Segmentation" by Jonathan Long, Evan Shelhamer, Trevor Darrell
- Caffe model from https://github.com/shelhamer/fcn.berkeleyvision.org/tree/master/voc-fcn8s. However, deploy.prototxt is incorrect. To get the prototxt, please run
wget https://raw.githubusercontent.com/shelhamer/fcn.berkeleyvision.org/master/voc-fcn8s/deploy.prototxt
- Exported pkl. You can learn the model structure by reading the prototxt above. Only ResNet50 is available for now. MIT LICENSE. Exported by @shiba24
- Fully Convolutional Network
- now in prep
- AlexNet
- Original paper
- Chainer implementation by @mitmul. GPL v2 LICENSE.
- Chainer model trained by @shiba24. GPL v2 LICENSE. Trained for 90 epochs; validation loss = 0.0187835. This file is in .npy format, so you may be able to read it without chainer (not tested yet).
- ResNet50
- Chainer model trained by @shiba24. GPL v2 LICENSE. Trained for 40 epochs; validation loss = 0.00374945877117. This file is in .npy format, so you may be able to read it without chainer (not tested yet).
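Since the chainer models above are described as .npy files, reading them with NumPy alone might look roughly like the sketch below. This has not been checked against the actual files: the helper name `describe_npy`, the file name `toy_weights.npy`, and the array shape are all made up for illustration, and a toy file is written first so that the snippet runs on its own. If a file turns out to hold pickled Python objects instead of a plain array, `np.load` would additionally need `allow_pickle=True`.

```python
import os
import tempfile

import numpy as np


def describe_npy(path):
    """Load a .npy file with NumPy only and report its shape and dtype."""
    arr = np.load(path)  # plain numeric arrays load without pickle support
    return arr.shape, str(arr.dtype)


# Toy stand-in for a downloaded file (hypothetical name and shape).
with tempfile.TemporaryDirectory() as tmp:
    toy_path = os.path.join(tmp, "toy_weights.npy")
    np.save(toy_path, np.zeros((64, 3, 7, 7), dtype=np.float32))
    shape, dtype = describe_npy(toy_path)
    print(shape, dtype)
```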
See the Caffe documentation for loading caffe models with caffe.
There are a few ways to load a caffe model without caffe. Currently, my two recommended approaches are the following:
Chainer is compatible with Caffe models to some extent. Please see Caffe function in Chainer.
Chainer does not yet support some functions and layers (e.g. the Deconvolution layer) and might output error messages. In that case, it might be possible to [load pkls](#Loading pkl) exported from caffe. The pkl file has to be exported by someone who DOES have caffe! (To caffe users: please see Caffe model weight and bias export for non-caffe users)
See the Chainer documentation for loading chainer models with chainer. See also how to copy a model to another model.
Currently the best practice might be to install chainer in your own environment, since it is easy to install.
pip install chainer
For more details, please see the chainer installation guide.
I am now implementing chainer2pkl.py for exporting W and b of a chainer model. Please wait for a while.
See the Theano documentation for loading theano models with theano.
Now in prep
If you want to use caffe pre-trained models in chainer or keras, please see Loading pkl. You need a .pkl file, exported or downloaded beforehand.
Caffe currently offers plenty of pre-trained models, particularly from academic papers.
However, installing caffe is a bit complicated. Those who use other frameworks might not want to install caffe only for the pre-trained models. Tying models to specific frameworks in this way prevents open improvement across the deep learning community, including students, academia, and industry.
So I wrote caffe2pkl.py. This script works only in an environment where caffe is installed, and it pickles the weights and biases.
python caffe2pkl.py --prototxt PATH_TO_PROTOTXT --model PATH_TO_MODEL
In Python:
import six
with open(PATH_TO_PKL, 'rb') as d_pickle:
    data = six.moves.cPickle.load(d_pickle)
Here data is a dict with *_W and *_b keys for each layer. W and b are np.ndarray.
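To make the key layout concrete, here is a minimal sketch that groups the `*_W` and `*_b` entries by layer name. It uses only the standard `pickle` module. The helper `group_by_layer` and the toy keys (`conv1`, `fc_6`) are hypothetical, and nested lists stand in for the np.ndarray values so that the snippet runs without NumPy or a real exported file. For a pickle written under Python 2 and read under Python 3, `pickle.load(f, encoding="latin1")` is usually needed.

```python
import os
import pickle
import tempfile


def group_by_layer(data):
    """Turn {'conv1_W': ..., 'conv1_b': ...} into {'conv1': {'W': ..., 'b': ...}}."""
    layers = {}
    for key, value in data.items():
        # Split on the LAST underscore so layer names may contain underscores.
        name, sep, kind = key.rpartition("_")
        if sep and kind in ("W", "b"):
            layers.setdefault(name, {})[kind] = value
    return layers


# Toy stand-in for an exported file; nested lists replace np.ndarray here.
toy = {
    "conv1_W": [[0.1, 0.2]],
    "conv1_b": [0.0],
    "fc_6_W": [[1.0]],
    "fc_6_b": [0.5],
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy.pkl")
    with open(path, "wb") as f:
        pickle.dump(toy, f)
    with open(path, "rb") as f:
        data = pickle.load(f)

layers = group_by_layer(data)
print(sorted(layers))
```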
See the function copy_model on this web site. Although the page is in Japanese, you will find the function very useful. (There is no need to understand the Japanese explanation exactly!)
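As a framework-free illustration of the general idea (this is not the copy_model function from the page above), the sketch below copies arrays from a loaded pkl dict into a dict of target parameters whenever the key exists in both and the shapes agree. The helper name `copy_matching`, the keys, and the shapes are all hypothetical; in a real framework the target arrays would come from your own model's parameters.

```python
import numpy as np


def copy_matching(src, dst):
    """Copy arrays from src into dst where the key is shared and shapes agree.

    Returns the list of keys that were copied.
    """
    copied = []
    for key, value in src.items():
        if key in dst and dst[key].shape == value.shape:
            dst[key][...] = value  # in-place copy, keeps dst's own dtype
            copied.append(key)
    return copied


# Toy stand-ins: `exported` plays the role of the loaded pkl dict,
# `target` the parameters of a model defined in your own framework.
exported = {
    "conv1_W": np.ones((2, 3, 3, 3), dtype=np.float32),
    "conv1_b": np.ones(2, dtype=np.float32),
    "fc_W": np.ones((10, 8), dtype=np.float32),
}
target = {
    "conv1_W": np.zeros((2, 3, 3, 3), dtype=np.float32),
    "conv1_b": np.zeros(2, dtype=np.float32),
    "fc_W": np.zeros((5, 8), dtype=np.float32),  # different shape: skipped
}

copied = copy_matching(exported, target)
print(copied)
```

Skipping mismatched shapes silently is a choice made here for brevity. When porting a real model it is probably safer to log or raise on a mismatch, so that a wrongly defined layer does not go unnoticed.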