memory not freed after backward #13
Comments
Which example were you running for the experiment above?
From the results you posted, it is difficult to tell when the GPU memory consumption increases. But the memory after the backward run (16.38 MB) is pretty much the same as the memory before scheduling (16.37 MB), which is expected. In addition, this is not a good example for a memory study because it is way too small.
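To illustrate why that is expected, here is a minimal stand-alone sketch (a plain Linear layer and torch.cuda.memory_allocated, not this repository's code): backward releases the activations saved for the autograd graph, so allocated memory drops back to roughly the pre-forward level, apart from the .grad buffers that are created on the first backward pass and reused afterwards.

import torch
import torch.nn as nn

def mb():
    # currently allocated CUDA memory in MB
    return torch.cuda.memory_allocated() / 2**20

model = nn.Linear(512, 512).cuda()
x = torch.randn(64, 512, device="cuda")

before = mb()                        # before the forward pass
loss = model(x).pow(2).mean()        # forward: activations are saved for backward
after_fwd = mb()
loss.backward()                      # backward: the saved activations are released
after_bwd = mb()                     # roughly `before` plus the size of the .grad buffers
print(f"before {before:.2f} MB, after forward {after_fwd:.2f} MB, after backward {after_bwd:.2f} MB")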
Thanks for your help. But the GPU memory at the start of each training epoch, before the forward call (e.g. pred_y = ode.odeint_adjoint(batch_y0, batch_t)), should be the same every epoch, always around 16 or 17 MB in this case. However, if I run this code for 100 epochs or more, the GPU memory keeps increasing, by about 0.1-0.2 MB per epoch here. I also see the same behaviour in more complex cases. So is this normal?
You can run the code line by line in a debugger and monitor the memory consumption to find out when the memory used increases by 0.1 MB.
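If a debugger is inconvenient, the same information can be collected by logging the allocated memory around each stage of the training step and printing the per-iteration deltas; the stage responsible for the ~0.1 MB growth then shows up directly. A rough self-contained sketch (stand-in Linear model and Adam optimizer, not the repository's code; the forward line marks where the odeint_adjoint call would go):

import torch
import torch.nn as nn

def mb():
    # currently allocated CUDA memory in MB
    return torch.cuda.memory_allocated() / 2**20

model = nn.Linear(128, 128).cuda()                 # stand-in for the ODE network
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(64, 128, device="cuda")
target = torch.zeros(64, 128, device="cuda")

prev = mb()
for it in range(100):
    m0 = mb()                                      # before the forward call
    pred_y = model(x)                              # here: pred_y = ode.odeint_adjoint(batch_y0, batch_t)
    loss = torch.mean(torch.abs(pred_y - target))
    m1 = mb()                                      # after forward + loss
    loss.backward()
    m2 = mb()                                      # after backward
    optimizer.step()
    optimizer.zero_grad(set_to_none=True)
    m3 = mb()                                      # after the optimizer update
    print(f"iter {it:03d}: fwd {m1 - m0:+.3f} MB, bwd {m2 - m1:+.3f} MB, "
          f"step {m3 - m2:+.3f} MB, net since last iter {m3 - prev:+.3f} MB")
    prev = m3

Note that one-time allocations (e.g. optimizer state on the first step) show up only in the first iteration, while a genuine leak shows up as a non-zero net delta every iteration.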
OK, thanks for your help. |
Hi, I tried using the discrete adjoint method and printed the GPU memory usage during training:
memory before scheduling: 16.37 MB
Memory after scheduling: 16.38 MB
Memory after backward pass: 16.38 MB
Iter 0020 | Total Loss 0.609421
memory before scheduling: 16.48 MB
Memory after scheduling: 16.49 MB
Memory after backward pass: 16.49 MB
Iter 0040 | Total Loss 0.599637
memory before scheduling: 16.59 MB
Memory after scheduling: 16.59 MB
Memory after backward pass: 16.59 MB
Iter 0060 | Total Loss 0.530792
memory before scheduling: 16.70 MB
Memory after scheduling: 16.70 MB
Memory after backward pass: 16.70 MB
Iter 0080 | Total Loss 0.893818
memory before scheduling: 16.80 MB
Memory after scheduling: 16.81 MB
I added these lines to the code:
memory_before_scheduling = show_net_dyn_memory_usage()  # memory before the forward (adjoint) call
pred_y = ode.odeint_adjoint(batch_y0, batch_t)
# memory_before_scheduling = show_net_dyn_memory_usage()
# pred_y = scheduling(epsilon=0)
# memory_after_scheduling = show_net_dyn_memory_usage()
loss = torch.mean(torch.abs(pred_y - batch_y))
memory_after_scheduling = show_net_dyn_memory_usage()   # memory after the forward call and loss
loss.backward()
memory_after_backward = show_net_dyn_memory_usage()     # memory after the backward pass
# memory_after_backward = show_net_dyn_memory_usage()
optimizer.step()
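(show_net_dyn_memory_usage is the memory-reporting helper; for anyone reproducing this, a minimal version of such a helper could look like the following, assuming torch.cuda.memory_allocated is the quantity being reported.)

import torch

def show_net_dyn_memory_usage():
    # hypothetical minimal version: report the currently allocated CUDA memory in MB
    mem_mb = torch.cuda.memory_allocated() / 2**20
    print(f"{mem_mb:.2f} MB")
    return mem_mb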
The GPU memory increases, and the memory after the backward pass is not the same as the memory before scheduling (16.37 MB). I think it would always stay around 16.37 MB if the memory used during the backward pass were freed, and I believe the standard PyTorch backward does free that memory, so could you help me understand this? Please also let me know if I am making a mistake somewhere, or if this implementation is simply expected to use more memory.
Thanks!