About deblur training #117

Open
Davidcoach opened this issue Jul 5, 2022 · 12 comments

Comments

@Davidcoach

Hello, I degraded images from FFHQ and want to use the deblurring pipeline in MPRNet to restore them.
However, when I train the model, I first hit this problem:
Traceback (most recent call last):
  File "/home/ma-user/work/MPRnet/train.py", line 120, in <module>
    loss_char = np.sum([criterion_char(restored[j],target) for j in range(len(restored))])
  File "<__array_function__ internals>", line 6, in sum
  File "/home/ma-user/anaconda3/envs/PyTorch-1.8/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 2260, in sum
    initial=initial, where=where)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.8/lib/python3.7/site-packages/numpy/core/fromnumeric.py", line 86, in _wrapreduction
    return ufunc.reduce(obj, axis, dtype, out, **passkwargs)
  File "/home/ma-user/anaconda3/envs/PyTorch-1.8/lib/python3.7/site-packages/torch/tensor.py", line 621, in __array__
    return self.numpy()
TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
So I changed the loss computation to this:
loss_char = torch.tensor([criterion_char(restored[j],target) for j in range(len(restored))]).sum()
loss_edge = torch.tensor([criterion_edge(restored[j],target) for j in range(len(restored))]).sum()
loss = ((loss_char) + (0.05*loss_edge)).requires_grad_(True)
This seems to work, but after training I found that the PSNR is always 12.4697 and never changes; the model learned nothing from the data.
What can I do?
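
A likely explanation for the frozen PSNR, offered as a sketch rather than a confirmed diagnosis: torch.tensor(...) builds a new tensor from the loss values, so the summed loss is no longer connected to the model parameters, and calling requires_grad_(True) on it only makes that new tensor a leaf. The snippet below uses a hypothetical nn.Linear model and nn.L1Loss as stand-ins for MPRNet and criterion_char, and makes the value copy explicit with .item(). It compares that variant with torch.stack(...).sum(), which keeps the autograd graph.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the multi-stage network and loss in the issue.
model = nn.Linear(4, 4)
criterion = nn.L1Loss()
inp = torch.randn(8, 4)
target = torch.randn(8, 4)

# Pretend the network returns three stage outputs, like `restored` in train.py.
restored = [model(inp) * scale for scale in (1.0, 0.5, 0.25)]
per_stage = [criterion(r, target) for r in restored]

# Variant from the issue: a new tensor is created from the loss values only
# (written here with .item() to make the copy explicit), so it has no grad_fn.
detached_loss = torch.tensor([l.item() for l in per_stage]).sum().requires_grad_(True)
detached_loss.backward()
detached_has_grad = model.weight.grad is not None

# Graph-preserving variant: stack the scalar losses, then sum them.
connected_loss = torch.stack(per_stage).sum()
connected_loss.backward()
connected_has_grad = model.weight.grad is not None

print("detached variant reaches the weights:", detached_has_grad)
print("stacked variant reaches the weights:", connected_has_grad)
```

If the first flag is False, the optimizer never receives gradients for the model, which would be consistent with a metric that does not move.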

@KKKLeouee

I have run into the same problem. How did you solve it in the end?


@Davidcoach
Author

Davidcoach commented Jul 20, 2022 via email

@adityac8
Collaborator

This might help #91 (comment)

@userHLN

userHLN commented Oct 23, 2022

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first.
How can this problem be solved? I have the same issue as the original poster, and I am also using Python 3.7.

@Makohhh

Makohhh commented Oct 31, 2022

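TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. How can this problem be solved? I have the same issue as the original poster, and I am also using Python 3.7.

Did you manage to solve it? I ran into this problem too.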

@wpc0086

wpc0086 commented Nov 25, 2022

TypeError: can't convert cuda:0 device type tensor to numpy. Use Tensor.cpu() to copy the tensor to host memory first. How can this problem be solved? I have the same issue as the original poster, and I am also using Python 3.7.

Changing np.sum to the built-in sum fixes it:
loss_char = sum([criterion_char(restored[j],target) for j in range(len(restored))])
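
The suggestion to use the built-in sum works because it adds the loss tensors with the + operator, so nothing is converted to NumPy and the result stays on its device and in the autograd graph. Here is a minimal sketch of one training step written that way. The nn.Conv2d model, the nn.L1Loss and nn.MSELoss criteria, and the fake multi-stage outputs are hypothetical stand-ins for MPRNet, criterion_char and criterion_edge; only the 0.05 weight comes from the thread.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins for the network and the two criteria in train.py.
model = nn.Conv2d(3, 3, kernel_size=3, padding=1)
criterion_char = nn.L1Loss()
criterion_edge = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

inp = torch.rand(2, 3, 16, 16)
target = torch.rand(2, 3, 16, 16)

weights_before = model.weight.detach().clone()

optimizer.zero_grad()
out = model(inp)
restored = [out, 0.5 * out, 0.25 * out]  # pretend multi-stage outputs

# Built-in sum keeps every term a tensor attached to the graph.
loss_char = sum(criterion_char(r, target) for r in restored)
loss_edge = sum(criterion_edge(r, target) for r in restored)
loss = loss_char + 0.05 * loss_edge

loss.backward()
optimizer.step()

weights_changed = not torch.equal(weights_before, model.weight.detach())
print("loss requires grad:", loss.requires_grad)
print("weights changed after one step:", weights_changed)
```

Note that no requires_grad_(True) call is needed here: the loss already requires grad because it was computed from the model output.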

@drifterss

Has anyone seen the loss suddenly become very large during training? My loss exploded at around epoch 20.
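
The thread does not establish a cause for the exploding loss. One common mitigation worth trying, shown here only as a sketch under assumptions, is to skip batches whose loss is not finite and to clip the gradient norm before the optimizer step. The nn.Linear model, the L1 criterion and the max_grad_norm value of 1.0 are hypothetical choices, not settings from the MPRNet training script.

```python
import math

import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical stand-ins; max_grad_norm is an assumed value to tune.
model = nn.Linear(8, 8)
criterion = nn.L1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
max_grad_norm = 1.0


def train_step(inp, target):
    """Run one step; return None if the loss is NaN or infinite."""
    optimizer.zero_grad()
    loss = criterion(model(inp), target)
    if not math.isfinite(loss.item()):
        return None  # skip the update instead of corrupting the weights
    loss.backward()
    # Rescale gradients in place so their total norm is at most max_grad_norm;
    # the returned value is the norm measured before clipping.
    norm_before_clip = torch.nn.utils.clip_grad_norm_(model.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item(), float(norm_before_clip)


# A batch with very large values yields a large gradient that gets clipped.
normal_result = train_step(torch.randn(4, 8) * 1e3, torch.randn(4, 8) * 1e3)
# A batch containing NaN is skipped.
skipped_result = train_step(torch.full((4, 8), float("nan")), torch.zeros(4, 8))
print(normal_result, skipped_result)
```

Logging the pre-clip norm per step can also help show whether the gradients grow gradually or spike on particular batches.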

@Feecuin

Feecuin commented Apr 16, 2024

Has anyone seen the loss suddenly become very large during training? My loss exploded at around epoch 20.

Did you manage to solve this?

@Turing2022

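Has anyone seen the loss suddenly become very large during training? My loss exploded at around epoch 20.
Same here. I even suspected it was because I had set batch_size to 2.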

@Turing2022

Has anyone seen the loss suddenly become very large during training? My loss exploded at around epoch 20.

Did you manage to solve this?

How is it going now?

@Feecuin

Feecuin commented Dec 11, 2024

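Has anyone seen the loss suddenly become very large during training? My loss exploded at around epoch 20.

Did you manage to solve this?

How is it going now?

No, it is not solved, haha. I am no longer working on this algorithm.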

9 participants