Loading weights into custom LSTM layer fails: Layer 'lstm_cell' expected 3 variables, but received 0 variables during loading. Expected: ['kernel', 'recurrent_kernel', 'bias'] #20322
Comments
Hi @lbortolotti - Thanks for reporting the issue. As per your code, the weights were saved from the model containing the custom layer. Making the changes below will resolve the error.
I've attached a gist for your reference as well.
Hi. I'm perfectly aware that the model I'm loading the weights into has a "different" LSTM layer. However, the models have 1) identical weight structure/dimensions and 2) identically named layers. In this situation, Keras has always supported loading weights, even if the layer class has changed. This is something I've used all the time, and it continues to work with tf-keras. My example is particularly extreme in that MyCustomLSTM is absolutely identical to a vanilla LSTM layer - normally I'd have some custom logic in there (but nothing that affects the weight structure). The official documentation confirms that my example should work, I think, as it says "Weights are loaded based on the network's topology.": https://keras.io/api/models/model_saving_apis/weights_saving_and_loading/#loadweights-method
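For what it's worth, the class-swap transfer described above can also be done manually with `get_weights`/`set_weights`, which sidesteps the saved-file check entirely. This is a hypothetical sketch, not code from the thread; the model shapes and layer names are assumptions:

```python
# Hypothetical sketch: manual layer-by-layer weight transfer between a model
# built with a custom LSTM subclass and one built with a vanilla LSTM.
import numpy as np
import keras
from keras import layers

class MyCustomLSTM(layers.LSTM):
    """Identical to keras.layers.LSTM; only the class name differs."""
    pass

def build(lstm_cls):
    # Same topology both times; only the LSTM class varies.
    return keras.Sequential([
        keras.Input(shape=(10, 4)),
        lstm_cls(8, name="lstm"),
        layers.Dense(1, name="out"),
    ])

src = build(MyCustomLSTM)
dst = build(layers.LSTM)

# Copy weights layer by layer; this works as long as the per-layer weight
# structures match, regardless of the layer classes involved.
for s, d in zip(src.layers, dst.layers):
    d.set_weights(s.get_weights())
```

This relies only on layer order and weight shapes matching, so it is unaffected by the file-format topology check that the issue hits.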
Hi @lbortolotti - As per the load_weights method, weights are loaded based on the network's topology, but the architecture should be the same as when the weights were saved (per the load_weights documentation). So at save time, the model architecture had InputLayer + MyCustomLSTM (based on the model summary). I'm attaching a gist running with Keras 2 (tf-keras), and it gives the same error.
Interesting. I've dug a bit deeper and found that to restore the weight-loading functionality I have to change the file name from .weights.h5 to .h5. Literally just removing the ".weights." suffix (which is now a strict requirement in TF 2.17, as far as I can tell) resolves it, and I can transfer weights as I always have. To replicate, just take your last gist and save the weights as .h5 instead. Which behaviour is the expected one? I was definitely relying on this functionality...
I'm using the official TF 2.17 container (tensorflow/tensorflow:2.17.0-gpu-jupyter) + keras==3.5.0.
The following code saves a model which contains a (dummy) custom LSTM layer, then initializes a new copy of the model (with a vanilla LSTM) and tries to load the weights from the first model into the second.
Code:
Output:
Considering that the custom layer in this case is doing absolutely nothing of interest, I assume this is a bug. If not, please let me know how one is meant to wrap an LSTM layer to avoid this issue.
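The repro described above can be sketched roughly as follows. This is a hypothetical reconstruction, not the reporter's exact code; the input shape, unit count, and layer names are assumptions:

```python
# Hypothetical reconstruction of the repro: save weights from a model with a
# no-op custom LSTM subclass, then load them into the same architecture built
# with a vanilla LSTM.
import os
import tempfile

import keras
from keras import layers

class MyCustomLSTM(layers.LSTM):
    """Does nothing beyond the stock LSTM; only the class name differs."""
    pass

def build(lstm_cls):
    # Identical topology and layer names; only the LSTM class varies.
    return keras.Sequential([
        keras.Input(shape=(10, 4)),
        lstm_cls(8, name="lstm"),
    ])

path = os.path.join(tempfile.mkdtemp(), "model.weights.h5")
build(MyCustomLSTM).save_weights(path)

try:
    build(layers.LSTM).load_weights(path)
    print("weights loaded")
except ValueError as err:
    # In the reporter's environment (Keras 3.5 / TF 2.17) this raises:
    # "Layer 'lstm_cell' expected 3 variables, but received 0 variables ..."
    print("load failed:", err)
```

Per the thread, whether the `load_weights` call succeeds depends on the Keras version and on whether the file is saved as `.weights.h5` or legacy `.h5`.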
Thanks!