
Trainable variables and learning rate for training #1

Open
xie9187 opened this issue Apr 21, 2017 · 2 comments
xie9187 commented Apr 21, 2017

According to the paper, we need to train the coarse layers first, then fix them and train the refine part. I think that in this code we only need to set the flag REFINE_TRAIN to False for the first step and then REFINE_TRAIN=True for the second step, right?

However, after I set REFINE_TRAIN=True, I found that all variables from the coarse network were still trainable. I think this is because the trainable flag passed to the function _variable_on_gpu in model_part.py is ignored.
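For reference, a rough sketch of what the fix might look like, assuming _variable_on_gpu wraps tf.get_variable in a tf.device block and accepts a trainable argument (the exact original signature is a guess based on the description above):

```python
import tensorflow as tf

def _variable_on_gpu(name, shape, initializer, trainable=True):
    """Create a variable on the GPU, honoring the trainable flag."""
    with tf.device('/gpu:0'):
        # Forwarding the flag is the fix: if it is dropped here, every
        # variable is added to the TRAINABLE_VARIABLES collection, so the
        # coarse network cannot be frozen when REFINE_TRAIN=True.
        var = tf.get_variable(name, shape, initializer=initializer,
                              trainable=trainable)
    return var
```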

Another problem is the learning rate. According to the paper, the learning rates are 0.001 for coarse convolutional layers 1-5, 0.1 for coarse fully connected layers 6 and 7, 0.001 for fine layers 1 and 3, and 0.01 for fine layer 2. But in the code the initial learning rate is set to 0.0001 for all layers. With this learning rate I cannot get a good result even after training for more than two days, compared with the performance reported in the paper. So I'm wondering: has anyone obtained a good result with this code, and how should the network be trained to reproduce it?
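For what it's worth, one way to get per-layer learning rates in TF 1.x is to run a separate optimizer over each group of variables. A minimal sketch, where the scope names 'coarse' and 'fine' and the loss tensor are assumptions, not taken from this repo:

```python
import tensorflow as tf

# Collect variables by scope; 'coarse' and 'fine' are assumed scope names.
coarse_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='coarse')
fine_vars = tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope='fine')

# One optimizer per learning-rate group.
opt_coarse = tf.train.GradientDescentOptimizer(learning_rate=0.001)
opt_fine = tf.train.GradientDescentOptimizer(learning_rate=0.01)

# `loss` is assumed to be the scalar training loss defined elsewhere.
# Compute all gradients in one pass, then split them by group.
grads = tf.gradients(loss, coarse_vars + fine_vars)
grads_coarse = grads[:len(coarse_vars)]
grads_fine = grads[len(coarse_vars):]

# Apply each group's gradients with its own learning rate.
train_op = tf.group(
    opt_coarse.apply_gradients(list(zip(grads_coarse, coarse_vars))),
    opt_fine.apply_gradients(list(zip(grads_fine, fine_vars))),
)
```

The same pattern extends to more than two groups (e.g. one per layer), at the cost of one optimizer per distinct learning rate.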

Finally, thanks for providing such a clean and readable implementation :)

MasazI (Owner) commented Apr 30, 2017

Thank you for your comment.
First, you are right about _variable_on_gpu; it is a bug introduced while migrating the code for the public release.
Second, about the learning rates: I'm not sure how to set a different learning rate for each layer, so if possible, could you tell me how? As far as I remember, though, the paper does not use the Adam optimizer, and Adam adjusts the learning rate for each weight on its own.
Thanks.

josslynZcn commented

@xie9187 I'm sorry to interrupt; I wonder if you have managed to train this network with good results.

I have faced the same problem as you, and I would greatly appreciate your help.
Thanks a lot.
[email protected]
