Fix bug in config_parse.py when batch_norm layer is used in RecurrentLayerGroup #966
Conversation
@@ -498,9 +498,12 @@ def __init__(
         is_static=None,
         is_shared=None,
         update_hooks=None,
-        input_layer_argument=None, ):
+        input_layer_argument=None,
+        not_make_layer_name_in_submodel=None, ):
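The diff above adds a `not_make_layer_name_in_submodel` flag. A minimal sketch of how such a flag could be consulted when an input layer name is resolved (the function name and structure here are assumptions for illustration, not the actual Paddle code):

```python
# Hypothetical sketch: an escape hatch that skips the submodel name
# encoding when not_make_layer_name_in_submodel is set.
def resolve_input_name(name, submodel_name, not_make_layer_name_in_submodel=None):
    if submodel_name and not not_make_layer_name_in_submodel:
        # normal case: encode the name for the RecurrentLayerGroup
        return name + "@" + submodel_name
    # escape hatch: keep the raw name unchanged
    return name
```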
Is this variable a bool or a pointer?
Python default arguments are often set to None for a reason; see http://effbot.org/zone/default-values.htm
For simple types such as integers and bools, the default value can be passed directly, since Python handles such simple immutable values by value rather than by reference.
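The comment above refers to a well-known Python pitfall: default argument values are evaluated once, at function definition time, so a mutable default is shared across calls. A small self-contained illustration (function names are made up for this example):

```python
# Why None is the usual sentinel for default arguments in Python.
def append_bad(item, items=[]):
    # BUG: the same list object persists across every call that omits `items`
    items.append(item)
    return items

def append_good(item, items=None):
    # a fresh list is created on each call that omits `items`
    if items is None:
        items = []
    items.append(item)
    return items
```

Immutable defaults such as `0`, `False`, or a string are safe to pass directly, which is the reviewer's point: the None sentinel is only needed when the default would otherwise be mutable.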
LGTM
Fix #961
Solution:
When constructing the other two Input()s for BatchNormLayer inside config_parse.py, bypass the function https://github.com/PaddlePaddle/Paddle/blob/develop/python/paddle/trainer/config_parser.py#L503
Cause of the bug:
The BatchNorm layer automatically adds Input()s for its moving-average parameters. Because of the call order, the name referenced there, inputs[0].input_layer_name, has already been encoded by MakeLayerNameInSubmodel() when the layer sits inside a RecurrentLayerGroup, i.e. it has become input_layer_name + "@" + submodel_name, which causes the error.
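A minimal illustration, assembled from the description above (the layer and submodel names are made up), of why the lookup fails: inside a RecurrentLayerGroup, layers are registered under their encoded names, but the auto-generated Input()s reference the raw name.

```python
# Mirrors MakeLayerNameInSubmodel(): input_layer_name + "@" + submodel_name
def make_layer_name_in_submodel(name, submodel_name):
    return name + "@" + submodel_name

# Layers inside a RecurrentLayerGroup are registered under encoded names.
registered = {make_layer_name_in_submodel("img", "rnn_group"): "layer_config"}

raw_name = "img"  # the name an auto-generated moving-average Input() would use
assert raw_name not in registered                                 # lookup misses
assert make_layer_name_in_submodel(raw_name, "rnn_group") in registered
```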
Alternative workaround:
Have the user add the two extra Input()s manually outside config_parse.py, e.g. in the way shown below. The names of Input(1) and Input(2) are then encoded inside config_parse.py consistently with Input(0), so no error occurs. This could be wrapped up in trainer_config_helpers, but users who still write old-style configurations would have to add the inputs by hand, which is cumbersome.
Another option:
Could config_parse.py add only a Parameter(), without an Input()? But the init function in Layer.cpp (https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/gserver/layers/Layer.cpp#L57) expects exactly one parameter per input during parsing, so this does not appear feasible.
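A hypothetical sketch of the parsing constraint just described (the names are illustrative, not taken from the Paddle config): Layer::init pairs each Input() with exactly one Parameter(), so registering a Parameter() without a matching Input() would break the pairing.

```python
# Illustrative only: one parameter is consumed per input during parsing.
inputs = ["input", "moving_mean_in", "moving_var_in"]   # made-up input names
parameters = ["bn.w0", "bn.w1", "bn.w2"]                # made-up parameter names

# Dropping an Input() while keeping its Parameter() breaks this invariant.
assert len(inputs) == len(parameters), "one parameter per input"
pairs = list(zip(inputs, parameters))
```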
Could the three be merged into a single parameter?
No: some of the operations on .w0, .w1, and .w2 differ, so they cannot share a single block of memory constructed via offsets in Layer.cpp.