I ran into a problem while testing the pre-training part of your code. Following the Start Training section of the README, I ran: bash script/run_pt.sh, and the process hangs at Epoch 1/100. Eventually the process is forcibly killed. I also tried interrupting the process and found that it was stuck reading the length of the dataloader. I wonder whether my hardware (an RTX 3090) does not meet the requirements for pre-training. I look forward to your reply.
The output is as follows:
Hi, Ruiwen,
Sorry for the late reply.
I just tested the pre-training code on an RTX 3090 and did not run into any problems. Could you provide more information?
Thx
Hi, Yang,
Some of the packages I use are not the same versions as those listed in environment.yaml, which may be what causes this issue. When I ran: conda env create -f environment.yaml, the process shut down partway through. I fixed the problem by rewriting the BatchSampler function, because I found that dataset loading got stuck while loading the first item of the dataset into the sampler, although I don't know why this happens. Finally, thank you for testing!
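For anyone who hits the same hang, here is a minimal sketch of what a rewritten batch sampler could look like. This is not the actual fix from this thread, which was not posted; the class name SimpleBatchSampler and its parameters are hypothetical. The idea is that the sampler only works with index numbers, so computing its length or iterating over it never loads an item from the dataset.

```python
import math
import random


class SimpleBatchSampler:
    """Hypothetical batch sampler: yields lists of indices and never reads dataset items."""

    def __init__(self, num_samples, batch_size, shuffle=True, drop_last=False, seed=0):
        if batch_size <= 0:
            raise ValueError("batch_size must be positive")
        self.num_samples = num_samples
        self.batch_size = batch_size
        self.shuffle = shuffle
        self.drop_last = drop_last
        self.rng = random.Random(seed)

    def __iter__(self):
        # Build the index order up front; only integers are touched here.
        indices = list(range(self.num_samples))
        if self.shuffle:
            self.rng.shuffle(indices)
        for start in range(0, self.num_samples, self.batch_size):
            batch = indices[start:start + self.batch_size]
            # Only the final batch can be short; skip it when drop_last is set.
            if self.drop_last and len(batch) < self.batch_size:
                return
            yield batch

    def __len__(self):
        # Pure arithmetic, so asking for the length cannot block on data loading.
        if self.drop_last:
            return self.num_samples // self.batch_size
        return math.ceil(self.num_samples / self.batch_size)
```

Assuming PyTorch is in use, an object like this can be passed through the batch_sampler argument of torch.utils.data.DataLoader, for example DataLoader(dataset, batch_sampler=SimpleBatchSampler(len(dataset), 32)). Whether that avoids the hang in this repository depends on where the original sampler was actually blocking.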