Memory is not freed when process is cancelled #115

Open
vectris-dev opened this issue Aug 11, 2021 · 0 comments
If I interrupt the process on Windows using CTRL+C and then attempt to run it again, I get the following error:

Imagining "Galaxy_of_ghosts" ...
c:\programdata\anaconda3\lib\site-packages\torch\nn\functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at ..\c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
loss: -13.45: 0%| | 1/1050 [00:01<17:40, 1.01s/it]
epochs: 0%| | 0/20 [00:01<?, ?it/s]
Traceback (most recent call last): | 0/420.0 [00:00<?, ?it/s]
  File "c:\programdata\anaconda3\lib\runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "c:\programdata\anaconda3\lib\runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "C:\ProgramData\Anaconda3\Scripts\dream.exe\__main__.py", line 7, in <module>
  File "c:\programdata\anaconda3\lib\site-packages\big_sleep\cli.py", line 74, in main
    fire.Fire(train)
  File "c:\programdata\anaconda3\lib\site-packages\fire\core.py", line 141, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "c:\programdata\anaconda3\lib\site-packages\fire\core.py", line 466, in _Fire
    component, remaining_args = _CallAndUpdateTrace(
  File "c:\programdata\anaconda3\lib\site-packages\fire\core.py", line 681, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "c:\programdata\anaconda3\lib\site-packages\big_sleep\cli.py", line 71, in train
    imagine()
  File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "c:\programdata\anaconda3\lib\site-packages\big_sleep\big_sleep.py", line 499, in forward
    out, loss = self.train_step(epoch, i, image_pbar)
  File "c:\programdata\anaconda3\lib\site-packages\big_sleep\big_sleep.py", line 447, in train_step
    out, losses = self.model(self.encoded_texts["max"], self.encoded_texts["min"])
  File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "c:\programdata\anaconda3\lib\site-packages\big_sleep\big_sleep.py", line 260, in forward
    image_embed = perceptor.encode_image(into)
  File "c:\programdata\anaconda3\lib\site-packages\big_sleep\clip.py", line 519, in encode_image
    return self.visual(image.type(self.dtype))
  File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "c:\programdata\anaconda3\lib\site-packages\big_sleep\clip.py", line 410, in forward
    x = self.transformer(x)
  File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "c:\programdata\anaconda3\lib\site-packages\big_sleep\clip.py", line 381, in forward
    return self.resblocks(x)
  File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\container.py", line 139, in forward
    input = module(input)
  File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "c:\programdata\anaconda3\lib\site-packages\big_sleep\clip.py", line 368, in forward
    x = x + self.attention(self.ln_1(x))
  File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "c:\programdata\anaconda3\lib\site-packages\big_sleep\clip.py", line 340, in forward
    ret = super().forward(x.type(torch.float32))
  File "c:\programdata\anaconda3\lib\site-packages\torch\nn\modules\normalization.py", line 173, in forward
    return F.layer_norm(
  File "c:\programdata\anaconda3\lib\site-packages\torch\nn\functional.py", line 2346, in layer_norm
    return torch.layer_norm(input, normalized_shape, weight, bias, eps, torch.backends.cudnn.enabled)
RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB (GPU 0; 8.00 GiB total capacity; 5.18 GiB already allocated; 3.44 MiB free; 5.36 GiB reserved in total by PyTorch)
image update: 0%| | 0/420.0 [00:01<?, ?it/s]

Shouldn't the memory have been freed up when I cancelled the previous process? This is after a fresh restart with nothing else running.
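
For reference, the figures in the `RuntimeError` line can be broken down with a little arithmetic. The sketch below is plain Python with no GPU access; `parse_oom` and `unaccounted_mib` are hypothetical helper names, not part of big_sleep or PyTorch. It assumes that the "total capacity" and "free" values describe the whole device while "reserved" covers only the failing PyTorch process, so the remainder would be memory held outside that process. It does not say what holds that memory.

```python
import re

# Final line of the traceback above, copied verbatim.
OOM_MESSAGE = (
    "RuntimeError: CUDA out of memory. Tried to allocate 20.00 MiB "
    "(GPU 0; 8.00 GiB total capacity; 5.18 GiB already allocated; "
    "3.44 MiB free; 5.36 GiB reserved in total by PyTorch)"
)

_UNIT_TO_MIB = {"KiB": 1 / 1024, "MiB": 1.0, "GiB": 1024.0}
_SIZE = r"([\d.]+) (KiB|MiB|GiB)"


def parse_oom(message):
    """Return the sizes quoted in the OOM message, converted to MiB."""
    patterns = {
        "requested": r"Tried to allocate " + _SIZE,
        "total": _SIZE + r" total capacity",
        "allocated": _SIZE + r" already allocated",
        "free": _SIZE + r" free",
        "reserved": _SIZE + r" reserved",
    }
    sizes = {}
    for name, pattern in patterns.items():
        match = re.search(pattern, message)
        if match is None:
            raise ValueError(f"could not find {name!r} in message")
        value, unit = match.groups()
        sizes[name] = float(value) * _UNIT_TO_MIB[unit]
    return sizes


def unaccounted_mib(sizes):
    """Device memory that is neither free nor reserved by this process.

    Assumption: 'total' and 'free' are device-wide, 'reserved' is
    per-process, as described in the lead-in.
    """
    return sizes["total"] - sizes["reserved"] - sizes["free"]


if __name__ == "__main__":
    sizes = parse_oom(OOM_MESSAGE)
    print(f"unaccounted: {unaccounted_mib(sizes):.2f} MiB")
```

Under that assumption, about 2.6 GiB of the 8.00 GiB is neither free nor reserved by the process that raised the error, which would fit the suspicion that something outside the new run still holds GPU memory.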
