[BUG] overtime error #1718

Open
Liweiq1 opened this issue Feb 19, 2025 · 0 comments
Labels
bug Something isn't working

Comments

Liweiq1 commented Feb 19, 2025

Bug summary

I upgraded dpgen from 0.12.0v to 0.13.0v without changing param.json or machine.json. When I run 'dpgen init_bulk ...', the task is killed after the first step and fails with the error shown in the error log below. My param.json and machine.json are attached:

machine.json
param.json

DP-GEN Version

0.13.0v

Platform, Python Version, Remote Platform, etc

No response
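Since this field was left empty, the missing details could be collected with a short script. The following is only a sketch: it assumes the packages are installed under the distribution names `dpgen` and `dpdispatcher`, and the helper names `installed_version` and `environment_report` are made up for illustration.

```python
import platform
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or a placeholder."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return "not installed"


def environment_report(dist_names=("dpgen", "dpdispatcher")):
    """Build a short text report of the Python, platform, and package versions."""
    lines = [
        f"python {platform.python_version()}",
        f"platform {platform.platform()}",
    ]
    # Distribution names are an assumption; adjust them if the packages
    # were installed under different names.
    lines += [f"{name} {installed_version(name)}" for name in dist_names]
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Running it inside the same conda environment that provides the `dpgen` executable would show whether dpdispatcher changed together with dpgen during the upgrade.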

Input Files, Running Commands, Error Log, etc.

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/HOME/scz0rq0/.conda/envs/deepmd/bin/dpgen", line 8, in <module>
    sys.exit(main())
  File "/HOME/scz0rq0/.conda/envs/deepmd/lib/python3.9/site-packages/dpgen/main.py", line 255, in main
    args.func(args)
  File "/HOME/scz0rq0/.conda/envs/deepmd/lib/python3.9/site-packages/dpgen/data/gen.py", line 1532, in gen_init_bulk
    run_vasp_relax(jdata, mdata)
  File "/HOME/scz0rq0/.conda/envs/deepmd/lib/python3.9/site-packages/dpgen/data/gen.py", line 1181, in run_vasp_relax
    submission.run_submission()
  File "/HOME/scz0rq0/.conda/envs/deepmd/lib/python3.9/site-packages/dpdispatcher/submission.py", line 260, in run_submission
    self.update_submission_state()
  File "/HOME/scz0rq0/.conda/envs/deepmd/lib/python3.9/site-packages/dpdispatcher/submission.py", line 345, in update_submission_state
    job.get_job_state()
  File "/HOME/scz0rq0/.conda/envs/deepmd/lib/python3.9/site-packages/dpdispatcher/submission.py", line 831, in get_job_state
    job_state = self.machine.check_status(self)
  File "/HOME/scz0rq0/.conda/envs/deepmd/lib/python3.9/site-packages/dpdispatcher/utils/utils.py", line 193, in wrapper
    raise RuntimeError(
RuntimeError: Failed to run check_status for 3 times
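The last frame suggests that the message comes from a retry wrapper around check_status. The following is a minimal sketch of that general pattern, not dpdispatcher's actual code: the names `retry`, `RetrySignal`, `max_retry`, and `sleep` are assumptions for illustration. It shows how a wrapper that re-raises with `raise ... from` produces both the final RuntimeError and the line "The above exception was the direct cause of the following exception".

```python
import functools
import time


class RetrySignal(Exception):
    """Hypothetical exception meaning 'this call failed but may be retried'."""


def retry(max_retry=3, sleep=0.0, catch_exception=RetrySignal):
    """Retry the decorated function up to max_retry times (illustrative only)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            last_error = None
            for _ in range(max_retry):
                try:
                    return func(*args, **kwargs)
                except catch_exception as err:
                    last_error = err
                    time.sleep(sleep)
            # Chain the last failure so it is printed as the "direct cause".
            raise RuntimeError(
                f"Failed to run {func.__name__} for {max_retry} times"
            ) from last_error
        return wrapper
    return decorator


@retry(max_retry=3)
def check_status():
    # Simulated failure standing in for a scheduler query that keeps failing.
    raise RetrySignal("scheduler query failed")
```

If the real wrapper works this way, the underlying failure would be in the earlier traceback printed before "The above exception was the direct cause of the following exception", which is not included in the log above.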

Steps to Reproduce

nohup dpgen init_bulk param.json machine.json 1>log 2>err &

Further Information, Files, and Links

No response

@Liweiq1 Liweiq1 added the bug Something isn't working label Feb 19, 2025