ValueError: The passed save_path is not a valid checkpoint: model/model.ckpt-43000 #15

Open
JianfengNing opened this issue Dec 6, 2021 · 8 comments

Comments

@JianfengNing

How to fix this problem?

Traceback (most recent call last):
File "D:/Desktop/Codes/deeponet-master/deeponet-master/src/deeponet_pde.py", line 285, in <module>
main()
File "D:/Desktop/Codes/deeponet-master/deeponet-master/src/deeponet_pde.py", line 281, in main
run(problem, system, space, T, m, nn, net, lr, epochs, num_train, num_test)
File "D:/Desktop/Codes/deeponet-master/deeponet-master/src/deeponet_pde.py", line 176, in run
model.restore("model/model.ckpt-" + str(train_state.best_step), verbose=1)
File "D:\Users\FIVE\miniconda3\lib\site-packages\deepxde\model.py", line 666, in restore
self.saver.restore(self.sess, save_path)
File "D:\Users\FIVE\miniconda3\lib\site-packages\tensorflow\python\training\saver.py", line 1290, in restore
raise ValueError("The passed save_path is not a valid checkpoint: " +
ValueError: The passed save_path is not a valid checkpoint: model/model.ckpt-43000

@lululxvi
Owner

lululxvi commented Dec 7, 2021

What is your backend?

@JianfengNing
Author

JianfengNing commented Dec 12, 2021 via email

@MinZhu123

You could replace the error line with model.restore(f"model/model-{train_state.best_step}.ckpt", verbose=1)
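
Whether the step number goes before or after `.ckpt` appears to depend on how the checkpoint was saved, so it can help to test several spellings before calling `model.restore`. The following is a minimal sketch, assuming TensorFlow wrote its usual `<prefix>.index` file next to each checkpoint; `pick_checkpoint` is a hypothetical helper and not part of DeepXDE.

```python
import os


def pick_checkpoint(model_dir, step):
    """Return the first candidate prefix whose .index file exists, else None."""
    candidates = [
        # Path used by the original script.
        os.path.join(model_dir, f"model.ckpt-{step}"),
        # Path suggested in the comment above.
        os.path.join(model_dir, f"model-{step}.ckpt"),
        # Third layout reported elsewhere in this thread.
        os.path.join(model_dir, f"model.ckpt-{step}.ckpt"),
    ]
    for prefix in candidates:
        # Assumption: a valid checkpoint always has a "<prefix>.index" file.
        if os.path.exists(prefix + ".index"):
            return prefix
    return None
```

If this returns a path, that string could be passed to `model.restore(path, verbose=1)`; if it returns `None`, no checkpoint for that step exists under any of the three names.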

@cfd-ai

cfd-ai commented Jan 29, 2022

@minzhu-penn - May I know which version of TensorFlow you use and which backend?
I had to do this:
model.restore("model/model.ckpt-" + str(train_state.best_step) + ".ckpt", verbose=1)
because my saved model files are named like this:
model.ckpt-500.ckpt.meta
model.ckpt-500.ckpt.index
model.ckpt-500.ckpt.data-00000-of-00001

Backend: tensorflow.compat.v1
TensorFlow version: 2.6.2
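
Since the file names differ between setups, one way to avoid guessing is to read the prefixes straight from the model directory. This is a sketch under the assumption that every checkpoint has a matching `.index` file, as in the listing above; `list_checkpoint_prefixes` is a hypothetical helper name.

```python
import glob
import os


def list_checkpoint_prefixes(model_dir):
    """Return every checkpoint prefix in model_dir, found via its .index file."""
    index_files = glob.glob(os.path.join(model_dir, "*.index"))
    # Strip the ".index" suffix to get the prefix string used when restoring.
    return sorted(path[: -len(".index")] for path in index_files)
```

For the files listed above, the result would contain a prefix ending in `model.ckpt-500.ckpt`, which is the form that worked with `model.restore` in this comment.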

@DaJiang7

@JianfengNing
I have the same problem as you. It may help to check how much memory (RAM) is available on the computer.

@anshumansinha16

I am getting this error as well! My backend is tensorflow.compat.v1.

code:

# Restore the best test loss model
model.restore(save_dir + save_str + "/model.ckpt-" + str(np.argmin(model.losshistory.loss_test) * 100), verbose=0)

Error:

Traceback (most recent call last):
  File "/Users/anshumansinha/Desktop/ML_project/./main.py", line 311, in <module>
    NN_MSEs_test, NN_MSEs_train = DeepONet(samples, split, y/np.max(np.abs(y)) , I, inds, neurons, epochs, b_layers)
  File "/Users/anshumansinha/Desktop/ML_project/./main.py", line 289, in DeepONet
    model.restore( save_dir +save_str+"/model.ckpt-" + str(np.argmin(model.losshistory.loss_test)*100), verbose=0)
  File "/Users/anshumansinha/venv/lib/python3.10/site-packages/deepxde/model.py", line 914, in restore
    self.saver.restore(self.sess, save_path)
  File "/Users/anshumansinha/venv/lib/python3.10/site-packages/tensorflow/python/training/saver.py", line 1409, in restore
    raise ValueError("The passed save_path is not a valid checkpoint: " +
ValueError: The passed save_path is not a valid checkpoint: /Users/anshumansinha/Desktop/ML_project/model/Levin1_Seed_1_Samples_100_X_4_5_epochs_10_blayers_3_neurons_125/model.ckpt-100
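
The path above builds the step as the argmin index times 100, which only lines up with a saved file if losses were recorded every 100 steps starting from step 0. A sketch that looks the step up in the recorded history instead, assuming `model.losshistory.steps` is a list parallel to `loss_test` (worth verifying against the installed DeepXDE version); `best_recorded_step` is a hypothetical helper.

```python
def _total(terms):
    """Sum a sequence of loss terms, or return a bare number unchanged."""
    try:
        return sum(terms)
    except TypeError:  # not iterable: already a single number
        return terms


def best_recorded_step(steps, loss_test):
    """Return the recorded step whose summed test loss is smallest."""
    totals = [_total(terms) for terms in loss_test]
    best_index = min(range(len(totals)), key=totals.__getitem__)
    return steps[best_index]


# Example with made-up numbers: the smallest test loss is at step 100.
print(best_recorded_step([0, 100, 200], [[0.9], [0.2], [0.5]]))  # prints 100
```

Even with the correct step, restoring only works if a checkpoint was actually written at that step, so the resulting path still needs to be checked against the files on disk.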

@anshumansinha16

What is your backend?

I am using the tensorflow.compat.v1 backend and getting a similar error: link

@lululxvi
Owner

See the FAQ entry "Q: More details about DeepXDE source code, and want to modify DeepXDE" at https://deepxde.readthedocs.io/en/latest/user/faq.html
