
The amount of parameters #14

Open · shamangary opened this issue Apr 26, 2017 · 6 comments

Comments

@shamangary commented Apr 26, 2017

I used the following settings, as suggested in this GitHub repo:
L=40, k=12, no bottleneck
However, the parameter count is not 1M, it's 0.6M.
The same problem happens when I turn the bottleneck on: I get a different parameter count than the reported one.
Please tell me what I am missing. Thank you.

Calling the model:

dn_opt = {}
dn_opt.depth = 40
dn_opt.dataset = 'cifar10'
model = paths.dofile('densenet.lua')(dn_opt)
model:cuda()
print(model:getParameters():size())

In densenet.lua

    local growthRate = 12

    --dropout rate, set it to 0 to disable dropout, non-zero number to enable dropout and set drop rate
    local dropRate = 0

    --#channels before entering the first denseblock
    local nChannels = 2 * growthRate

    --compression rate at transition layers
    local reduction = 0.5

    --whether to use bottleneck structures
    local bottleneck = false

Output of the parameter size

599050
[torch.LongStorage of size 1]
@liuzhuang13 (Owner)

Hi! "BC" stands for bottleneck(B) and compression(C). This is explained at the "compression" paragraph at section 3 of the paper. To use a original DenseNet, you need to also set the variable "reduction" to 1 in the code.

@shamangary (Author)

Thank you very much. It matches now.

@shamangary (Author) commented Apr 26, 2017

On the other hand, while the parameter count of DenseNet is indeed small, GPU memory is still consumed by the complex structure rather than by the parameters.

With an 8GB GPU, I was able to train an 11M-parameter WRN. However, I cannot run the 0.8M-parameter DenseNet-BC (L=100, k=12) because of an out-of-memory problem. This might be because a lot of feature maps are stored during training, as the rough sketch below suggests.
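As a rough back-of-envelope sketch (my own estimate, not something from the repo), counting only the concatenated inputs that a naive implementation keeps for backprop already gives millions of values per image for DenseNet-BC (L=100, k=12) on 32x32 CIFAR input:

--rough estimate only: activation values stored per image, counting just the
--concatenated input of every layer in the three dense blocks
--assumes 16 layers per block, initial channels = 2*k, and 0.5 compression
local k = 12
local layersPerBlock = 16
local sizes = {32, 16, 8}                   --spatial size in each dense block
local nChannels = 2 * k
local total = 0
for b = 1, 3 do
  for i = 1, layersPerBlock do
    total = total + nChannels * sizes[b] * sizes[b]
    nChannels = nChannels + k               --each layer adds k feature maps
  end
  nChannels = math.floor(nChannels * 0.5)   --compression at the transition layer
end
print(total)                                --about 3e6 values per image here

Multiplied by a batch of 64, and ignoring the extra copies made by BatchNorm, ReLU, and the bottleneck layers, this is already several hundred MB in float precision, so it is the activations rather than the parameters that run out of memory.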

@liuzhuang13 (Owner)

Thanks for pointing this out. I've just found other people discussing this, and wrote a comment on Reddit here: https://www.reddit.com/r/MachineLearning/comments/67fds7/d_how_does_densenet_compare_to_resnet_and/?utm_content=title&utm_medium=hot&utm_source=reddit&utm_name=MachineLearning

My suggestion is to try a shallow and wide DenseNet, by setting the depth smaller and the growthRate larger.
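For example (illustrative values only, not a tested configuration), one could lower the depth passed in your calling snippet and raise the growth rate inside densenet.lua:

--illustrative only: a shallower, wider DenseNet to reduce stored feature maps
dn_opt = {}
dn_opt.depth = 40              --fewer layers per block -> fewer concatenated maps
dn_opt.dataset = 'cifar10'
model = paths.dofile('densenet.lua')(dn_opt)

--and inside densenet.lua, raise the growth rate to keep model capacity, e.g.
--local growthRate = 24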

@Tongcheng

Hello @shamangary, regarding the memory cost of feature maps: we currently have a Caffe implementation that tries to address the memory-hungry problem (listed under the much more space-efficient Caffe implementation). DenseNet-BC (L=100, k=12) should take no more than 2.5 GB when running with testing on, and about 1.7 GB when running without test mode (Caffe seems to allocate separate space for testing). Hope that helps!

@shamangary (Author)

OK, thanks! Though I wish the Torch implementation could also have this property. (QAQ)
