Replies: 3 comments 2 replies
-
You may need to check your dataset. For reference, a similar error is discussed here: https://datascience.stackexchange.com/questions/55777/valueerror-cannot-feed-value-of-shape-3-for-tensor-x0-which-has-shape
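In case it helps, here is a rough sketch of such a check. It rests on a few assumptions that are not confirmed by the post: that the `natoms` vector fed to the descriptor has length `ntypes + 2` (local atom count, total atom count, then one count per type), that each system directory has a `type.raw` and optionally a `type_map.raw`, and that the training input keeps its selection list under `model.descriptor.sel`. The helper names and the `sel` values are hypothetical. If those assumptions hold, a fed shape of (3,) would correspond to data with one atom type, and an expected shape of (4,) to a model configured for two.

```python
from pathlib import Path
import tempfile


def data_ntypes(system_dir):
    """Guess how many atom types a system directory declares.

    Heuristic only: use type_map.raw if it exists, otherwise fall back to
    max(type index) + 1 from type.raw. The real loader may also apply the
    type_map given in the training input.
    """
    system = Path(system_dir)
    type_map = system / "type_map.raw"
    if type_map.exists():
        return len(type_map.read_text().split())
    indices = [int(tok) for tok in (system / "type.raw").read_text().split()]
    return max(indices) + 1


def model_ntypes(config):
    """Number of types implied by the descriptor's `sel` list (assumed layout)."""
    return len(config["model"]["descriptor"]["sel"])


def natoms_len(ntypes):
    """Length of the natoms vector: [nloc, nall, count per type...]."""
    return ntypes + 2


if __name__ == "__main__":
    # Hypothetical single-element system with 16 atoms, all of type 0.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "type.raw").write_text("0\n" * 16)
        n_data = data_ntypes(tmp)
    # Hypothetical input with two entries in `sel`, i.e. two atom types.
    n_model = model_ntypes({"model": {"descriptor": {"sel": [46, 92]}}})
    print("data:", natoms_len(n_data), "model:", natoms_len(n_model))
```

To run it against real files, point `data_ntypes` at each system directory listed in train.log and pass the parsed `input.json` (for example via `json.load`) to `model_ntypes`. If the two numbers differ, the type list in the data and the `sel` / `type_map` settings in the training parameters are worth comparing.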
-
Has the problem been resolved?
-
Duplicate of deepmodeling/deepmd-kit#1399
-
Hi, I am running into this problem at runtime (see train.log) and am looking for help. The error file and root file are attached below.
nohup: ignoring input
/home/yhpu/.local/lib/python3.10/site-packages/gromacs/__init__.py:286: GromacsImportWarning: Some Gromacs commands were NOT found; maybe source GMXRC first? The following are missing:
['release']
warnings.warn("Some Gromacs commands were NOT found; "
INFO:dpgen:-------------------------iter.000000 task 01--------------------------
Traceback (most recent call last):
File "/home/yhpu/.local/lib/python3.10/site-packages/dpdispatcher/submission.py", line 245, in handle_unexpected_submission_state
job.handle_unexpected_job_state()
File "/home/yhpu/.local/lib/python3.10/site-packages/dpdispatcher/submission.py", line 609, in handle_unexpected_job_state
raise RuntimeError(f"job:{self.job_hash} {self.job_id} failed {self.fail_count} times.job_detail:{self}")
RuntimeError: job:327fdb73938a19765b0de10800ddaf859a3a0b9a 91597 failed 3 times.job_detail:{'327fdb73938a19765b0de10800ddaf859a3a0b9a': {'job_task_list': [{'command': "/bin/sh -c '{ if [ ! -f model.ckpt.index ]; then dp train input.json; else dp train input.json --restart model.ckpt; fi }'&&dp freeze", 'task_work_path': '000', 'forward_files': ['input.json'], 'backward_files': ['frozen_model.pb', 'lcurve.out', 'train.log', 'model.ckpt.meta', 'model.ckpt.index', 'model.ckpt.data-00000-of-00001', 'checkpoint'], 'outlog': 'train.log', 'errlog': 'train.log'}], 'resources': {'number_node': 1, 'cpu_per_node': 36, 'gpu_per_node': 0, 'queue_name': 'CD', 'group_size': 3, 'custom_flags': ['#SBATCH -J 1'], 'strategy': {'if_cuda_multi_devices': False, 'ratio_unfinished': 0.0}, 'para_deg': 1, 'module_purge': False, 'module_unload_list': [], 'module_list': [], 'source_list': [], 'envs': {}, 'wait_time': 0, 'kwargs': {}}, 'job_state': <JobStatus.terminated: 4>, 'job_id': '91597', 'fail_count': 3}}
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/yhpu/.local/bin/dpgen", line 8, in <module>
sys.exit(main())
File "/home/yhpu/.local/lib/python3.10/site-packages/dpgen/main.py", line 185, in main
args.func(args)
File "/home/yhpu/.local/lib/python3.10/site-packages/dpgen/generator/run.py", line 3642, in gen_run
run_iter (args.PARAM, args.MACHINE)
File "/home/yhpu/.local/lib/python3.10/site-packages/dpgen/generator/run.py", line 3607, in run_iter
run_train (ii, jdata, mdata)
File "/home/yhpu/.local/lib/python3.10/site-packages/dpgen/generator/run.py", line 610, in run_train
submission.run_submission()
File "/home/yhpu/.local/lib/python3.10/site-packages/dpdispatcher/submission.py", line 212, in run_submission
self.handle_unexpected_submission_state()
File "/home/yhpu/.local/lib/python3.10/site-packages/dpdispatcher/submission.py", line 248, in handle_unexpected_submission_state
raise RuntimeError(
RuntimeError: Meet errors will handle unexpected submission state.
Debug information: remote_root==/home/yhpu/tmp/94b8b7cdfc040b6b4c9e162e4b12e16ff0baf34b.
Debug information: submission_hash==94b8b7cdfc040b6b4c9e162e4b12e16ff0baf34b.
Please check the dirs and scripts in remote_root. The job information mentioned above may help.
train.log
WARNING:tensorflow:From /home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS.
/home/software/deepmd/lib/python3.10/site-packages/deepmd/utils/compat.py:316: UserWarning: It seems that you are using a deepmd-kit input of version 1.x.x, which is deprecated. we have converted the input to >2.0.0 compatible
warnings.warn(msg)
DEEPMD INFO Calculate neighbor statistics... (add --skip-neighbor-stat to skip this step)
DEEPMD INFO training data with min nbor dist: 2.287857268309131
DEEPMD INFO training data with max nbor size: [58]
DEEPMD INFO     _____               _____   __  __  _____           _     _  _
DEEPMD INFO    |  __ \             |  __ \ |  \/  ||  __ \         | |   (_)| |
DEEPMD INFO    | |  | |  ___   ___ | |__) || \  / || |  | | ______ | | __ _ | |_
DEEPMD INFO    | |  | | / _ \ / _ \|  ___/ | |\/| || |  | ||______|| |/ /| || __|
DEEPMD INFO    | |__| ||  __/|  __/| |     | |  | || |__| |        |   < | || |_
DEEPMD INFO    |_____/  \___| \___||_|     |_|  |_||_____/         |_|\_\|_| \__|
DEEPMD INFO Please read and cite:
DEEPMD INFO Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
DEEPMD INFO installed to: /home/conda/feedstock_root/build_artifacts/deepmd-kit_1660885537494/work/_skbuild/linux-x86_64-3.10/cmake-install
DEEPMD INFO source : v2.1.4
DEEPMD INFO source brach: HEAD
DEEPMD INFO source commit: c3933924
DEEPMD INFO source commit at: 2022-08-19 12:09:24 +0800
DEEPMD INFO build float prec: double
DEEPMD INFO build variant: cpu
DEEPMD INFO build with tf inc: /home/software/deepmd/lib/python3.10/site-packages/tensorflow/include
DEEPMD INFO build with tf lib:
DEEPMD INFO ---Summary of the training---------------------------------------
DEEPMD INFO running on: d3
DEEPMD INFO computing device: cpu:0
DEEPMD INFO CUDA_VISIBLE_DEVICES: unset
DEEPMD INFO Count of visible GPU: 0
DEEPMD INFO num_intra_threads: 0
DEEPMD INFO num_inter_threads: 0
DEEPMD INFO -----------------------------------------------------------------
DEEPMD INFO ---Summary of DataSystem: training -----------------------------------------------
DEEPMD INFO found 3 system(s):
DEEPMD INFO system natoms bch_sz n_bch prob pbc
DEEPMD INFO -- ccPOSCAR.01x01x01/02.md/sys-0016/deepmd 16 2 672 0.268 T
DEEPMD INFO -- ccPOSCAR.01x01x01/02.md/sys-0032/deepmd 32 1 1200 0.479 T
DEEPMD INFO -- cpPOSCAR.01x01x01/02.md/sys-0016/deepmd 16 2 632 0.252 T
DEEPMD INFO --------------------------------------------------------------------------------------
DEEPMD INFO training without frame parameter
DEEPMD INFO data stating... (this step may take long time)
Traceback (most recent call last):
File "/home/software/deepmd/bin/dp", line 10, in <module>
sys.exit(main())
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/main.py", line 516, in main
train_dp(**dict_args)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 106, in train
_do_work(jdata, run_opt, is_compress)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 162, in _do_work
model.build(train_data, stop_batch)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/train/trainer.py", line 300, in build
self.model.data_stat(data)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/model/ener.py", line 95, in data_stat
self._compute_input_stat(m_all_stat, protection = self.data_stat_protect)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/model/ener.py", line 100, in _compute_input_stat
self.descrpt.compute_input_stats(all_stat['coord'],
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/descriptor/se_a.py", line 274, in compute_input_stats
= self._compute_dstats_sys_smth(cc,bb,tt,nn,mm)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/descriptor/se_a.py", line 603, in _compute_dstats_sys_smth
= run_sess(self.sub_sess, self.stat_descrpt,
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/utils/sess.py", line 21, in run_sess
return sess.run(*args, **kwargs)
File "/home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 967, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1164, in _run
raise ValueError(
ValueError: Cannot feed value of shape (3,) for Tensor d_sea_t_natoms:0, which has shape (4,)
WARNING:tensorflow:From /home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS.
/home/software/deepmd/lib/python3.10/site-packages/deepmd/utils/compat.py:316: UserWarning: It seems that you are using a deepmd-kit input of version 1.x.x, which is deprecated. we have converted the input to >2.0.0 compatible
warnings.warn(msg)
DEEPMD INFO Calculate neighbor statistics... (add --skip-neighbor-stat to skip this step)
DEEPMD INFO training data with min nbor dist: 2.287857268309131
DEEPMD INFO training data with max nbor size: [58]
DEEPMD INFO     _____               _____   __  __  _____           _     _  _
DEEPMD INFO    |  __ \             |  __ \ |  \/  ||  __ \         | |   (_)| |
DEEPMD INFO    | |  | |  ___   ___ | |__) || \  / || |  | | ______ | | __ _ | |_
DEEPMD INFO    | |  | | / _ \ / _ \|  ___/ | |\/| || |  | ||______|| |/ /| || __|
DEEPMD INFO    | |__| ||  __/|  __/| |     | |  | || |__| |        |   < | || |_
DEEPMD INFO    |_____/  \___| \___||_|     |_|  |_||_____/         |_|\_\|_| \__|
DEEPMD INFO Please read and cite:
DEEPMD INFO Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
DEEPMD INFO installed to: /home/conda/feedstock_root/build_artifacts/deepmd-kit_1660885537494/work/_skbuild/linux-x86_64-3.10/cmake-install
DEEPMD INFO source : v2.1.4
DEEPMD INFO source brach: HEAD
DEEPMD INFO source commit: c3933924
DEEPMD INFO source commit at: 2022-08-19 12:09:24 +0800
DEEPMD INFO build float prec: double
DEEPMD INFO build variant: cpu
DEEPMD INFO build with tf inc: /home/software/deepmd/lib/python3.10/site-packages/tensorflow/include
DEEPMD INFO build with tf lib:
DEEPMD INFO ---Summary of the training---------------------------------------
DEEPMD INFO running on: d3
DEEPMD INFO computing device: cpu:0
DEEPMD INFO CUDA_VISIBLE_DEVICES: unset
DEEPMD INFO Count of visible GPU: 0
DEEPMD INFO num_intra_threads: 0
DEEPMD INFO num_inter_threads: 0
DEEPMD INFO -----------------------------------------------------------------
DEEPMD INFO ---Summary of DataSystem: training -----------------------------------------------
DEEPMD INFO found 3 system(s):
DEEPMD INFO system natoms bch_sz n_bch prob pbc
DEEPMD INFO -- ccPOSCAR.01x01x01/02.md/sys-0016/deepmd 16 2 672 0.268 T
DEEPMD INFO -- ccPOSCAR.01x01x01/02.md/sys-0032/deepmd 32 1 1200 0.479 T
DEEPMD INFO -- cpPOSCAR.01x01x01/02.md/sys-0016/deepmd 16 2 632 0.252 T
DEEPMD INFO --------------------------------------------------------------------------------------
DEEPMD INFO training without frame parameter
DEEPMD INFO data stating... (this step may take long time)
Traceback (most recent call last):
File "/home/software/deepmd/bin/dp", line 10, in <module>
sys.exit(main())
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/main.py", line 516, in main
train_dp(**dict_args)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 106, in train
_do_work(jdata, run_opt, is_compress)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 162, in _do_work
model.build(train_data, stop_batch)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/train/trainer.py", line 300, in build
self.model.data_stat(data)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/model/ener.py", line 95, in data_stat
self._compute_input_stat(m_all_stat, protection = self.data_stat_protect)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/model/ener.py", line 100, in _compute_input_stat
self.descrpt.compute_input_stats(all_stat['coord'],
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/descriptor/se_a.py", line 274, in compute_input_stats
= self._compute_dstats_sys_smth(cc,bb,tt,nn,mm)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/descriptor/se_a.py", line 603, in _compute_dstats_sys_smth
= run_sess(self.sub_sess, self.stat_descrpt,
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/utils/sess.py", line 21, in run_sess
return sess.run(*args, **kwargs)
File "/home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 967, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1164, in _run
raise ValueError(
ValueError: Cannot feed value of shape (3,) for Tensor d_sea_t_natoms:0, which has shape (4,)
WARNING:tensorflow:From /home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS.
/home/software/deepmd/lib/python3.10/site-packages/deepmd/utils/compat.py:316: UserWarning: It seems that you are using a deepmd-kit input of version 1.x.x, which is deprecated. we have converted the input to >2.0.0 compatible
warnings.warn(msg)
DEEPMD INFO Calculate neighbor statistics... (add --skip-neighbor-stat to skip this step)
DEEPMD INFO training data with min nbor dist: 2.287857268309131
DEEPMD INFO training data with max nbor size: [58]
DEEPMD INFO     _____               _____   __  __  _____           _     _  _
DEEPMD INFO    |  __ \             |  __ \ |  \/  ||  __ \         | |   (_)| |
DEEPMD INFO    | |  | |  ___   ___ | |__) || \  / || |  | | ______ | | __ _ | |_
DEEPMD INFO    | |  | | / _ \ / _ \|  ___/ | |\/| || |  | ||______|| |/ /| || __|
DEEPMD INFO    | |__| ||  __/|  __/| |     | |  | || |__| |        |   < | || |_
DEEPMD INFO    |_____/  \___| \___||_|     |_|  |_||_____/         |_|\_\|_| \__|
DEEPMD INFO Please read and cite:
DEEPMD INFO Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
DEEPMD INFO installed to: /home/conda/feedstock_root/build_artifacts/deepmd-kit_1660885537494/work/_skbuild/linux-x86_64-3.10/cmake-install
DEEPMD INFO source : v2.1.4
DEEPMD INFO source brach: HEAD
DEEPMD INFO source commit: c3933924
DEEPMD INFO source commit at: 2022-08-19 12:09:24 +0800
DEEPMD INFO build float prec: double
DEEPMD INFO build variant: cpu
DEEPMD INFO build with tf inc: /home/software/deepmd/lib/python3.10/site-packages/tensorflow/include
DEEPMD INFO build with tf lib:
DEEPMD INFO ---Summary of the training---------------------------------------
DEEPMD INFO running on: d3
DEEPMD INFO computing device: cpu:0
DEEPMD INFO CUDA_VISIBLE_DEVICES: unset
DEEPMD INFO Count of visible GPU: 0
DEEPMD INFO num_intra_threads: 0
DEEPMD INFO num_inter_threads: 0
DEEPMD INFO -----------------------------------------------------------------
DEEPMD INFO ---Summary of DataSystem: training -----------------------------------------------
DEEPMD INFO found 3 system(s):
DEEPMD INFO system natoms bch_sz n_bch prob pbc
DEEPMD INFO -- ccPOSCAR.01x01x01/02.md/sys-0016/deepmd 16 2 672 0.268 T
DEEPMD INFO -- ccPOSCAR.01x01x01/02.md/sys-0032/deepmd 32 1 1200 0.479 T
DEEPMD INFO -- cpPOSCAR.01x01x01/02.md/sys-0016/deepmd 16 2 632 0.252 T
DEEPMD INFO --------------------------------------------------------------------------------------
DEEPMD INFO training without frame parameter
DEEPMD INFO data stating... (this step may take long time)
Traceback (most recent call last):
File "/home/software/deepmd/bin/dp", line 10, in <module>
sys.exit(main())
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/main.py", line 516, in main
train_dp(**dict_args)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 106, in train
_do_work(jdata, run_opt, is_compress)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 162, in _do_work
model.build(train_data, stop_batch)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/train/trainer.py", line 300, in build
self.model.data_stat(data)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/model/ener.py", line 95, in data_stat
self._compute_input_stat(m_all_stat, protection = self.data_stat_protect)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/model/ener.py", line 100, in _compute_input_stat
self.descrpt.compute_input_stats(all_stat['coord'],
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/descriptor/se_a.py", line 274, in compute_input_stats
= self._compute_dstats_sys_smth(cc,bb,tt,nn,mm)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/descriptor/se_a.py", line 603, in _compute_dstats_sys_smth
= run_sess(self.sub_sess, self.stat_descrpt,
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/utils/sess.py", line 21, in run_sess
return sess.run(*args, **kwargs)
File "/home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 967, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1164, in _run
raise ValueError(
ValueError: Cannot feed value of shape (3,) for Tensor d_sea_t_natoms:0, which has shape (4,)