Hi, I modified train.sh as follows:
python train.py --name resnet_radvani_32000_20190415 --model resnet --netD conv-up --batch_size 4 --max_dataset_size 32000 --niter 20 --niter_decay 50 --save_result_freq 250 --save_epoch_freq 2 --ndown 6 --data_root /home/liwensh/data
I ran it for about 16 hours (44 epochs), and the end of log.txt shows:
epoch 43 iter 6499: l1: 11.288669 tv: 1.522304 total: 11.288669
epoch 43 iter 6749: l1: 11.599895 tv: 0.667862 total: 11.599895
epoch 43 iter 6999: l1: 11.125267 tv: 1.277602 total: 11.125267
epoch 43 iter 7249: l1: 11.893361 tv: 1.366742 total: 11.893361
epoch 43 iter 7499: l1: 11.343329 tv: 1.228081 total: 11.343329
epoch 43 iter 7749: l1: 11.397069 tv: 1.426213 total: 11.397069
epoch 43 iter 7999: l1: 11.519998 tv: 0.664876 total: 11.519998
epoch 44 iter 249: l1: 11.183926 tv: 1.258252 total: 11.183926
epoch 44 iter 499: l1: 11.555054 tv: 1.201256 total: 11.555054
epoch 44 iter 749: l1: 12.041154 tv: 1.312884 total: 12.041154
epoch 44 iter 999: l1: 11.605458 tv: 0.706056 total: 11.605458
epoch 44 iter 1249: l1: 11.589639 tv: 1.093558 total: 11.589639
epoch 44 iter 1499: l1: 11.533211 tv: 1.338729 total: 11.533211
epoch 44 iter 1749: l1: 11.822362 tv: 1.297630 total: 11.822362
epoch 44 iter 1999: l1: 12.410873 tv: 1.159959 total: 12.410873
epoch 44 iter 2249: l1: 11.855060 tv: 1.531642 total: 11.855060
The total loss has not changed much since the 5th epoch.
Also, the intermediate output is strange (epoch 44):
I wonder if this is because the batch size is too small, since I don't have enough GPU memory. Or maybe some other option I set is wrong?
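If limited GPU memory is what forces the small --batch_size, one common workaround is gradient accumulation: average gradients over several small micro-batches before taking one optimizer step, which approximates training with a larger batch. This repo's train.py may not support it out of the box, so the sketch below is only a toy illustration of the idea (all names here are made up, not from the repo): for a linear least-squares model, accumulating averaged gradients over two micro-batches and stepping once matches a single step on the full batch.

```python
# Toy illustration of gradient accumulation (not code from this repo).
# Shows that averaging gradients over micro-batches, then taking one
# update step, equals one step on the full batch.

def grad_mse(w, xs, ys):
    """Gradient of the mean squared error 0.5*(w*x - y)^2 over a batch."""
    return sum((w * x - y) * x for x, y in zip(xs, ys)) / len(xs)

xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.1, 5.9, 8.2]
lr = 0.01
w0 = 0.0

# One step on the full batch of 4 samples.
w_full = w0 - lr * grad_mse(w0, xs, ys)

# Same step via two micro-batches of 2, accumulating gradients.
accum = 0.0
for i in range(0, 4, 2):
    accum += grad_mse(w0, xs[i:i+2], ys[i:i+2])
w_accum = w0 - lr * (accum / 2)  # average over the 2 micro-batches

assert abs(w_full - w_accum) < 1e-12
```

In PyTorch the same pattern is usually written by calling loss.backward() on each micro-batch (gradients accumulate in .grad by default), scaling the loss by 1/accum_steps, and calling optimizer.step() plus optimizer.zero_grad() only every accum_steps iterations.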