-
I use MetaAdam for the inner training loop. Everything looked right: all tensors and models are on CUDA before `self.qf1_optimizer.step(qf1_loss)`, and printing the device of `qf1_loss` confirms it is on CUDA, so I don't know why MetaAdam found a tensor on the CPU. Here is my code using MetaAdam:

```python
meta_optimizer_class = TorchOpt.MetaAdam
self.policy_optimizer = meta_optimizer_class(
    self.policy, lr=policy_lr, betas=(beta_1, 0.999), moment_requires_grad=False
)
```

And here is the error information:

```
  File "/NAS2020/Workspaces/DRLGroup/lymao/DLproject/h_divergence_meta_learning/ILSwiss/rlkit/torch/algorithms/bmg/bmg.py", line 153, in train_step
    self.qf1_optimizer.step(qf1_loss)
  File "/home/lymao/anaconda3/envs/meta/lib/python3.8/site-packages/torchopt/_src/optimizer/meta/base.py", line 69, in step
    updates, new_state = self.impl.update(
  File "/home/lymao/anaconda3/envs/meta/lib/python3.8/site-packages/torchopt/_src/transform.py", line 66, in update_fn
    flattened_updates, state = inner.update(
  File "/home/lymao/anaconda3/envs/meta/lib/python3.8/site-packages/torchopt/_src/base.py", line 183, in update_fn
    updates, new_s = fn(updates, s, params=params, inplace=inplace)
  File "/home/lymao/anaconda3/envs/meta/lib/python3.8/site-packages/torchopt/_src/transform.py", line 318, in update_fn
    mu = _update_moment(
  File "/home/lymao/anaconda3/envs/meta/lib/python3.8/site-packages/torchopt/_src/transform.py", line 221, in _update_moment
    return map_flattened(f, updates, moments)
  File "/home/lymao/anaconda3/envs/meta/lib/python3.8/site-packages/torchopt/_src/transform.py", line 51, in map_flattened
    return list(map(func, *args))
  File "/home/lymao/anaconda3/envs/meta/lib/python3.8/site-packages/torchopt/_src/transform.py", line 218, in f
    return t.mul(decay).add_(g, alpha=1 - decay) if g is not None else t
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu!
```
-
Hi @maoliyuan, I was wondering which version of torchopt produced this error report? You can run the following snippet to find out:

```python
import torchopt, numpy, sys

print(torchopt.__version__, numpy.__version__, sys.version, sys.platform)
```
-
@maoliyuan Hi, could you try your code with our latest dev version? We have built new wheels via GitHub Actions; the artifacts can be found here: Build #393. Download the `wheels` artifact (py38 / py39 / py310).

Please follow the instructions on https://pytorch.org/ to upgrade your `torch` installation to 1.13.0:

```bash
pip3 install torch torchvision torchaudio
```

Then install the wheel:

```bash
pip3 install torchopt-0.5.1.dev49+ga89bd4e-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
```

Thanks.
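Once installed, a quick way to confirm the dev wheel is picked up, reusing the version check from above (the expected string is taken from the wheel's filename):

```python
import torchopt

print(torchopt.__version__)  # expect 0.5.1.dev49+ga89bd4e
```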
-
Thanks for the reply! That problem is solved; I'm sorry I didn't pay attention to the order of defining the optimizer and moving the model to CUDA (a minimal sketch of the fix is just below).
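For reference, here is a minimal sketch of what I got wrong (with a toy `nn.Linear` standing in for my actual policy network):

```python
import torch
import torch.nn as nn
import torchopt

net = nn.Linear(4, 2)

# What I had: the meta-optimizer was defined before the model was moved
# to CUDA, so part of its state stayed on the CPU.
# opt = torchopt.MetaAdam(net, lr=1e-4, moment_requires_grad=False)
# net.cuda()

# The fix: move the model to CUDA first, then define the meta-optimizer.
net = net.cuda()
opt = torchopt.MetaAdam(net, lr=1e-4, moment_requires_grad=False)

x = torch.randn(8, 4, device="cuda")
loss = net(x).pow(2).mean()
opt.step(loss)  # every tensor involved now lives on cuda:0
```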
But I get another problem when using MetaAdam. My code doesn't use any sqrt function to compute the loss, yet in the outer loop, when I call backward and use torch.optim.Adam, I get "RuntimeError: Function 'SqrtBackward0' returned nan values in its 0th output." When I switch MetaAdam to MetaSGD, nothing goes wrong. Is there anything wrong with how I use MetaAdam? Here are the full error information and my script.

Below is the full error information:

```
/home/lymao/anaconda3/envs/meta/lib/python3.8/site-packages/torch/autograd/__init__.py:197: UserWarning: Error detected in SqrtBackward0. No forward pass information available. Enable detect anomaly during forward pass for more information. (Triggered internally at ../torch/csrc/autograd/python_anomaly_mode.cpp:92.)
  Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
Traceback (most recent call last):
  File "run_scripts/bmg_exp_script.py", line 176, in <module>
    experiment(exp_specs)
  File "run_scripts/bmg_exp_script.py", line 134, in experiment
    algorithm.train(start_epoch=epoch)
  File "/NAS2020/Workspaces/DRLGroup/lymao/DLproject/h_divergence_meta_learning/ILSwiss/rlkit/core/base_algorithm.py", line 162, in train
    self.start_training(start_epoch=start_epoch)
  File "/NAS2020/Workspaces/DRLGroup/lymao/DLproject/h_divergence_meta_learning/ILSwiss/rlkit/core/base_algorithm.py", line 290, in start_training
    self._try_to_train(epoch)
  File "/NAS2020/Workspaces/DRLGroup/lymao/DLproject/h_divergence_meta_learning/ILSwiss/rlkit/core/base_algorithm.py", line 302, in _try_to_train
    self._do_training(epoch)
  File "/NAS2020/Workspaces/DRLGroup/lymao/DLproject/h_divergence_meta_learning/ILSwiss/rlkit/torch/algorithms/torch_meta_rl_algorithm.py", line 49, in _do_training
    self.trainer.train_step(self.get_batch(), self.inner_train_steps_total, avg_reward_per_iter)
  File "/NAS2020/Workspaces/DRLGroup/lymao/DLproject/h_divergence_meta_learning/ILSwiss/rlkit/torch/algorithms/bmg/bmg.py", line 210, in train_step
    matching_loss.backward()
  File "/home/lymao/anaconda3/envs/meta/lib/python3.8/site-packages/torch/_tensor.py", line 487, in backward
    torch.autograd.backward(
  File "/home/lymao/anaconda3/envs/meta/lib/python3.8/site-packages/torch/autograd/__init__.py", line 197, in backward
    Variable._execution_engine.run_backward( # Calls into the C++ engine to run the backward pass
RuntimeError: Function 'SqrtBackward0' returned nan values in its 0th output.
```
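As the warning at the top of the trace suggests, enabling anomaly detection during the forward pass names the op that produced the NaN. A tiny standalone snippet (not my training code, just the same error class) that reproduces it:

```python
import torch

x = torch.zeros(1, requires_grad=True)

# The forward pass is fine: 0 * sqrt(0) = 0. In the backward pass,
# SqrtBackward computes grad_out / (2 * sqrt(x)) = 0 / 0 = nan at x = 0,
# so anomaly mode raises the same RuntimeError shown above.
with torch.autograd.detect_anomaly():
    loss = (x * torch.sqrt(x)).sum()
    loss.backward()
```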
Below is my training script. I define the optimizer outside of the trainer:

```python
meta_optimizer_class = torchopt.MetaAdam
algorithm.trainer.policy_optimizer = meta_optimizer_class(
    policy, lr=0.0001, moment_requires_grad=False
)
```

The train loop corresponds to the Bootstrapped Meta-Gradients update: the policy is updated (using MetaAdam) with a loss that doesn't contain any sqrt:
```python
if n_train_step_total % self.num_steps_per_loop == self.inner_loop_steps - 1:
    self.k_state_dict = TorchOpt.extract_state_dict(self.policy)
if n_train_step_total % self.num_steps_per_loop == self.total_steps_per_loop - 2:
    self.k_l_m1_state_dict = TorchOpt.extract_state_dict(self.policy)
if n_train_step_total % self.num_steps_per_loop == self.total_steps_per_loop - 1:
    matching_loss = self.matching_function(
        self.policy_k, self.policy, obs, self.k_state_dict
    )
```

Here the policy is just an MLP without any sqrt function:

```python
def matching_function(self, policy_k, tb, meta_observations, policy_k_state_dict):
    with torch.no_grad():  # I don't want to compute grads of self.policy here
        policy_outputs_tb = tb(meta_observations)
        policy_mean_tb, policy_log_std_tb = policy_outputs_tb[1], policy_outputs_tb[2]
    TorchOpt.recover_state_dict(policy_k, policy_k_state_dict)
    policy_outputs_k = policy_k(meta_observations)
    policy_mean_k, policy_log_std_k = policy_outputs_k[1], policy_outputs_k[2]
    div = (
        self.matching_mean_coef * self.matching_loss(policy_mean_tb, policy_mean_k)
        + self.matching_std_coef * self.matching_loss(policy_log_std_tb, policy_log_std_k)
    )
    return div
```

I'm sorry that I can't give you prettier code and error formatting because I'm not familiar with GitHub Discussions, but I still hope you can help me!
-
You can refer to issue #26 for the NaN bug in MetaAdam and the reasons for getting NaN. You can either set `use_accelerated_op` to `True`, or register a hook to filter out the NaNs. Here is an example:

```python
impl = torchopt.chain(
    torchopt.hook.register_hook(torchopt.hook.zero_nan_hook),
    torchopt.adam(1e-1),
)
inner_opt = torchopt.MetaOptimizer(net, impl)
```
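For completeness, a sketch of the first remedy as well (assuming the dev wheel from above; `net` stands in for the policy network):

```python
import torch.nn as nn
import torchopt

net = nn.Linear(4, 2).cuda()

# use_accelerated_op=True swaps in the fused Adam op, whose hand-written
# backward avoids the NaN produced by autograd's SqrtBackward0 (see #26).
opt = torchopt.MetaAdam(net, lr=1e-4, use_accelerated_op=True)
```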