NaN loss when running the TFA algorithm #130

Open
kokoronokasumi opened this issue May 16, 2023 · 4 comments

Comments

@kokoronokasumi

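I set up mmfewshot and the VOC dataset following the README. When I run TFA base training with the provided config file, the losses become NaN once training passes 100 iterations. What could be causing this?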

2023-05-16 15:59:36,050 - mmfewshot - INFO - Iter [50/18000] lr: 9.810e-03, eta: 1:13:05, time: 0.244, data_time: 0.007, memory: 7041, loss_rpn_cls: 0.2213, loss_rpn_bbox: 0.0396, loss_cls: 0.5162, acc: 92.1133, loss_bbox: 0.0955, loss: 0.8727
2023-05-16 15:59:48,591 - mmfewshot - INFO - Iter [100/18000] lr: 1.980e-02, eta: 1:13:52, time: 0.251, data_time: 0.007, memory: 7041, loss_rpn_cls: 0.1074, loss_rpn_bbox: 0.0503, loss_cls: 0.2786, acc: 96.0000, loss_bbox: 0.1581, loss: 0.5944
2023-05-16 16:00:00,323 - mmfewshot - INFO - Iter [150/18000] lr: 2.000e-02, eta: 1:12:24, time: 0.235, data_time: 0.006, memory: 7041, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 79.5828, loss_bbox: nan, loss: nan
2023-05-16 16:00:10,732 - mmfewshot - INFO - Iter [200/18000] lr: 2.000e-02, eta: 1:09:35, time: 0.208, data_time: 0.006, memory: 7041, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 2.6863, loss_bbox: nan, loss: nan
2023-05-16 16:00:21,427 - mmfewshot - INFO - Iter [250/18000] lr: 2.000e-02, eta: 1:08:09, time: 0.214, data_time: 0.009, memory: 7041, loss_rpn_cls: nan, loss_rpn_bbox: nan, loss_cls: nan, acc: 1.0000, loss_bbox: nan, loss: nan

@mm-assistant

mm-assistant bot commented May 16, 2023

We recommend using English or English & Chinese for issues so that we could have broader discussion.

@zprzlcr

zprzlcr commented May 30, 2023

Try reducing batch_size.

@WenchuLiu

Try lowering the learning rate.

@zsh-zsh-chlid

The default learning rate is set for eight GPUs, so just reduce it to lr/8.
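
For anyone landing here later, the lr/8 suggestion is an instance of the linear scaling rule: if the GPU count (and so the total batch size) shrinks, the learning rate shrinks by the same factor. Below is a minimal sketch, assuming the per-GPU batch size stays unchanged and that the post-warmup value of 0.02 seen in the log is the 8-GPU default. The function name `scale_lr` is hypothetical and not part of mmfewshot.

```python
def scale_lr(base_lr: float, base_gpus: int, gpus: int) -> float:
    """Linearly rescale a learning rate tuned for `base_gpus` GPUs to `gpus` GPUs.

    Assumes the per-GPU batch size is the same in both setups, so the
    total batch size changes by the factor gpus / base_gpus.
    """
    if base_gpus <= 0 or gpus <= 0:
        raise ValueError("GPU counts must be positive")
    return base_lr * gpus / base_gpus


# 0.02 is the learning rate the log reaches after warmup (lr: 2.000e-02);
# treating it as the 8-GPU default is an assumption based on the comment above.
default_lr = 0.02
single_gpu_lr = scale_lr(default_lr, base_gpus=8, gpus=1)
print(f"learning rate for 1 GPU: {single_gpu_lr:.4g}")
```

If the config follows the usual MMDetection-style layout, the resulting value would go into the `lr` field of the `optimizer` dict; check the config you are actually running before changing it.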
