-
You may need to check your dataset. FYI, a similar error is discussed here: https://datascience.stackexchange.com/questions/55777/valueerror-cannot-feed-value-of-shape-3-for-tensor-x0-which-has-shape
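One way to act on this is to compare the type information stored in each training system with the `type_map` in `input.json`. The sketch below is only an illustration: it assumes the usual DeePMD-kit data layout (a `type.raw` file with one integer type index per atom and an optional `type_map.raw` with the type names) and assumes that `input.json` keeps its `type_map` under the `model` key. The function names are hypothetical helpers, not part of DeePMD-kit or dpgen.

```python
import json
from pathlib import Path


def read_type_indices(system_dir):
    """Return the per-atom type indices stored in <system_dir>/type.raw."""
    text = (Path(system_dir) / "type.raw").read_text()
    return [int(tok) for tok in text.split()]


def read_type_names(system_dir):
    """Return the names in <system_dir>/type_map.raw, or None if the file is absent."""
    path = Path(system_dir) / "type_map.raw"
    return path.read_text().split() if path.exists() else None


def load_model_type_map(input_json):
    """Read model.type_map from input.json (assumed location); None if missing."""
    with open(input_json) as fh:
        return json.load(fh).get("model", {}).get("type_map")


def compare_with_model(system_dir, model_type_map):
    """Summarize what one system says about atom types next to the model's type_map."""
    indices = read_type_indices(system_dir)
    return {
        "natoms": len(indices),
        # Highest type index + 1: the number of types the indices imply.
        "types_implied_by_indices": max(indices) + 1 if indices else 0,
        "system_type_names": read_type_names(system_dir),
        "model_type_count": len(model_type_map),
    }
```

Running `compare_with_model` on each system listed in the training summary and looking for rows where the type counts or names differ from the model's `type_map` may help narrow down which system is inconsistent.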
-
Has the problem been resolved?
-
Duplicate of deepmodeling/deepmd-kit#1399
-
Hi, I ran into this problem at runtime (see train.log) and am looking for help. The error output and train.log are pasted below.
nohup: ignoring input
/home/yhpu/.local/lib/python3.10/site-packages/gromacs/__init__.py:286: GromacsImportWarning: Some Gromacs commands were NOT found; maybe source GMXRC first? The following are missing:
['release']
warnings.warn("Some Gromacs commands were NOT found; "
INFO:dpgen:-------------------------iter.000000 task 01--------------------------
Traceback (most recent call last):
File "/home/yhpu/.local/lib/python3.10/site-packages/dpdispatcher/submission.py", line 245, in handle_unexpected_submission_state
job.handle_unexpected_job_state()
File "/home/yhpu/.local/lib/python3.10/site-packages/dpdispatcher/submission.py", line 609, in handle_unexpected_job_state
raise RuntimeError(f"job:{self.job_hash} {self.job_id} failed {self.fail_count} times.job_detail:{self}")
RuntimeError: job:327fdb73938a19765b0de10800ddaf859a3a0b9a 91597 failed 3 times.job_detail:{'327fdb73938a19765b0de10800ddaf859a3a0b9a': {'job_task_list': [{'command': "/bin/sh -c '{ if [ ! -f model.ckpt.index ]; then dp train input.json; else dp train input.json --restart model.ckpt; fi }'&&dp freeze", 'task_work_path': '000', 'forward_files': ['input.json'], 'backward_files': ['frozen_model.pb', 'lcurve.out', 'train.log', 'model.ckpt.meta', 'model.ckpt.index', 'model.ckpt.data-00000-of-00001', 'checkpoint'], 'outlog': 'train.log', 'errlog': 'train.log'}], 'resources': {'number_node': 1, 'cpu_per_node': 36, 'gpu_per_node': 0, 'queue_name': 'CD', 'group_size': 3, 'custom_flags': ['#SBATCH -J 1'], 'strategy': {'if_cuda_multi_devices': False, 'ratio_unfinished': 0.0}, 'para_deg': 1, 'module_purge': False, 'module_unload_list': [], 'module_list': [], 'source_list': [], 'envs': {}, 'wait_time': 0, 'kwargs': {}}, 'job_state': <JobStatus.terminated: 4>, 'job_id': '91597', 'fail_count': 3}}
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/yhpu/.local/bin/dpgen", line 8, in <module>
sys.exit(main())
File "/home/yhpu/.local/lib/python3.10/site-packages/dpgen/main.py", line 185, in main
args.func(args)
File "/home/yhpu/.local/lib/python3.10/site-packages/dpgen/generator/run.py", line 3642, in gen_run
run_iter (args.PARAM, args.MACHINE)
File "/home/yhpu/.local/lib/python3.10/site-packages/dpgen/generator/run.py", line 3607, in run_iter
run_train (ii, jdata, mdata)
File "/home/yhpu/.local/lib/python3.10/site-packages/dpgen/generator/run.py", line 610, in run_train
submission.run_submission()
File "/home/yhpu/.local/lib/python3.10/site-packages/dpdispatcher/submission.py", line 212, in run_submission
self.handle_unexpected_submission_state()
File "/home/yhpu/.local/lib/python3.10/site-packages/dpdispatcher/submission.py", line 248, in handle_unexpected_submission_state
raise RuntimeError(
RuntimeError: Meet errors will handle unexpected submission state.
Debug information: remote_root==/home/yhpu/tmp/94b8b7cdfc040b6b4c9e162e4b12e16ff0baf34b.
Debug information: submission_hash==94b8b7cdfc040b6b4c9e162e4b12e16ff0baf34b.
Please check the dirs and scripts in remote_root. The job information mentioned above may help.
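To follow the hint about checking `remote_root`, a small script can collect the tail of every log under that directory. This is a minimal sketch, not a dpgen or dpdispatcher feature: `tail_logs` is a hypothetical helper, and the file name `train.log` is taken from the `outlog`/`errlog` entries in the job detail above.

```python
from pathlib import Path


def tail_logs(remote_root, log_name="train.log", n_lines=5):
    """Map each log file found under remote_root to its last n_lines lines (n_lines >= 1)."""
    tails = {}
    for log_path in sorted(Path(remote_root).rglob(log_name)):
        # errors="replace" keeps the scan going if a log holds undecodable bytes.
        lines = log_path.read_text(errors="replace").splitlines()
        tails[str(log_path)] = lines[-n_lines:]
    return tails


if __name__ == "__main__":
    # Path copied from the "Debug information" lines above.
    root = "/home/yhpu/tmp/94b8b7cdfc040b6b4c9e162e4b12e16ff0baf34b"
    for path, last_lines in tail_logs(root).items():
        print(path)
        for line in last_lines:
            print("   ", line)
```

The last few lines of each task's log are usually where the Python exception from the failed command appears.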
train.log
WARNING:tensorflow:From /home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/compat/v2_compat.py:107: disable_resource_variables (from tensorflow.python.ops.variable_scope) is deprecated and will be removed in a future version.
Instructions for updating:
non-resource variables are not supported in the long term
WARNING:root:To get the best performance, it is recommended to adjust the number of threads by setting the environment variables OMP_NUM_THREADS, TF_INTRA_OP_PARALLELISM_THREADS, and TF_INTER_OP_PARALLELISM_THREADS.
/home/software/deepmd/lib/python3.10/site-packages/deepmd/utils/compat.py:316: UserWarning: It seems that you are using a deepmd-kit input of version 1.x.x, which is deprecated. we have converted the input to >2.0.0 compatible
warnings.warn(msg)
DEEPMD INFO Calculate neighbor statistics... (add --skip-neighbor-stat to skip this step)
DEEPMD INFO training data with min nbor dist: 2.287857268309131
DEEPMD INFO training data with max nbor size: [58]
DEEPMD INFO [DeePMD-kit ASCII-art banner]
DEEPMD INFO Please read and cite:
DEEPMD INFO Wang, Zhang, Han and E, Comput.Phys.Comm. 228, 178-184 (2018)
DEEPMD INFO installed to: /home/conda/feedstock_root/build_artifacts/deepmd-kit_1660885537494/work/_skbuild/linux-x86_64-3.10/cmake-install
DEEPMD INFO source : v2.1.4
DEEPMD INFO source brach: HEAD
DEEPMD INFO source commit: c3933924
DEEPMD INFO source commit at: 2022-08-19 12:09:24 +0800
DEEPMD INFO build float prec: double
DEEPMD INFO build variant: cpu
DEEPMD INFO build with tf inc: /home/software/deepmd/lib/python3.10/site-packages/tensorflow/include
DEEPMD INFO build with tf lib:
DEEPMD INFO ---Summary of the training---------------------------------------
DEEPMD INFO running on: d3
DEEPMD INFO computing device: cpu:0
DEEPMD INFO CUDA_VISIBLE_DEVICES: unset
DEEPMD INFO Count of visible GPU: 0
DEEPMD INFO num_intra_threads: 0
DEEPMD INFO num_inter_threads: 0
DEEPMD INFO -----------------------------------------------------------------
DEEPMD INFO ---Summary of DataSystem: training -----------------------------------------------
DEEPMD INFO found 3 system(s):
DEEPMD INFO system natoms bch_sz n_bch prob pbc
DEEPMD INFO -- ccPOSCAR.01x01x01/02.md/sys-0016/deepmd 16 2 672 0.268 T
DEEPMD INFO -- ccPOSCAR.01x01x01/02.md/sys-0032/deepmd 32 1 1200 0.479 T
DEEPMD INFO -- cpPOSCAR.01x01x01/02.md/sys-0016/deepmd 16 2 632 0.252 T
DEEPMD INFO --------------------------------------------------------------------------------------
DEEPMD INFO training without frame parameter
DEEPMD INFO data stating... (this step may take long time)
Traceback (most recent call last):
File "/home/software/deepmd/bin/dp", line 10, in <module>
sys.exit(main())
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/main.py", line 516, in main
train_dp(**dict_args)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 106, in train
_do_work(jdata, run_opt, is_compress)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/entrypoints/train.py", line 162, in _do_work
model.build(train_data, stop_batch)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/train/trainer.py", line 300, in build
self.model.data_stat(data)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/model/ener.py", line 95, in data_stat
self._compute_input_stat(m_all_stat, protection = self.data_stat_protect)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/model/ener.py", line 100, in _compute_input_stat
self.descrpt.compute_input_stats(all_stat['coord'],
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/descriptor/se_a.py", line 274, in compute_input_stats
= self._compute_dstats_sys_smth(cc,bb,tt,nn,mm)
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/descriptor/se_a.py", line 603, in _compute_dstats_sys_smth
= run_sess(self.sub_sess, self.stat_descrpt,
File "/home/software/deepmd/lib/python3.10/site-packages/deepmd/utils/sess.py", line 21, in run_sess
return sess.run(*args, **kwargs)
File "/home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 967, in run
result = self._run(None, fetches, feed_dict, options_ptr,
File "/home/software/deepmd/lib/python3.10/site-packages/tensorflow/python/client/session.py", line 1164, in _run
raise ValueError(
ValueError: Cannot feed value of shape (3,) for Tensor d_sea_t_natoms:0, which has shape (4,)
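The two shapes in this message can be read as type counts, under one assumption that is worth verifying against the DeePMD-kit documentation: that the `natoms` vector is laid out as `[local atoms, total atoms, count of type 0, count of type 1, ...]`, so its length is the number of atom types plus 2. If that holds, a fed shape of `(3,)` corresponds to data with one atom type, while the graph's `(4,)` corresponds to a model built for two. The helper below is a hypothetical sketch that only parses the message text.

```python
import re

# Matches TensorFlow's 1-D feed-shape error as it appears in the log above.
_SHAPE_ERROR = re.compile(
    r"Cannot feed value of shape \((\d+),\) for Tensor (\S+), which has shape \((\d+),\)"
)


def decode_natoms_mismatch(message):
    """Translate a natoms shape mismatch into type counts (assumes length = ntypes + 2)."""
    match = _SHAPE_ERROR.search(message)
    if match is None:
        return None
    fed, tensor, expected = int(match.group(1)), match.group(2), int(match.group(3))
    return {
        "tensor": tensor,
        "types_in_fed_data": fed - 2,
        "types_in_graph": expected - 2,
    }


msg = "ValueError: Cannot feed value of shape (3,) for Tensor d_sea_t_natoms:0, which has shape (4,)"
print(decode_natoms_mismatch(msg))
# {'tensor': 'd_sea_t_natoms:0', 'types_in_fed_data': 1, 'types_in_graph': 2}
```

Under that reading, the place to look is any mismatch between the number of types in the model's `type_map` and the types present in the training systems.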