channel error #10
Comments
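I ran into the same error. I'm not sure whether it is caused by the way I process the audio features.
Hi, how did you process the audio features? I followed the VOCA paper that the author cites and extracted audio features of shape (N, 16, 29).
I did the same thing as you, and the error shows up in the same way.
I find this problem strange. There is a step c = c.reshape(-1,16,29), which assumes the input shape is (-1,16,29), so passing it through the rest of the network should not raise an error.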
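A quick shape check shows why the reshape by itself cannot validate the input. This is a minimal NumPy sketch, not taken from the DiffTalk code; the array names and the batch size `B` are illustrative assumptions. The reshape accepts both a (B, 16, 29) batch and a (B, 8, 16, 29) batch and only changes the leading dimension, so a mismatch would only surface in a later layer.

```python
import numpy as np

B = 20  # hypothetical batch size, chosen only for illustration

# One DeepSpeech window per frame: (B, 16, 29)
single = np.zeros((B, 16, 29), dtype=np.float32)
# Eight DeepSpeech windows per frame: (B, 8, 16, 29)
stacked = np.zeros((B, 8, 16, 29), dtype=np.float32)

# Both layouts pass through the same reshape without any error.
single_flat = single.reshape(-1, 16, 29)
stacked_flat = stacked.reshape(-1, 16, 29)

print(single_flat.shape)   # (20, 16, 29)
print(stacked_flat.shape)  # (160, 16, 29)
```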
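We can only wait for the author to fix this; some parameters may have been set incorrectly in a few steps.
My advisor asked me to reproduce this paper's baseline soon, and now I don't know what to do.
We just discussed this. In the code, seq_len of the attnet is set to 8, so the author may have taken eight 16*29 features as the audio feature for each image frame.
You're right, thanks for the explanation.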
You are right. Sorry for missing the details. The dim of each audio feature '0_0.npy' corresponding to each frame is 8*16*29.
Hello,
The author did not provide code for this, but I found the answer in another paper, Neural Voice Puppetry: Audio-driven Facial Reenactment, which has corresponding code for the audio data processing part. I have also run into some difficulties while reproducing this paper. If you reproduce it successfully, could you share it with me?
------------------ Original message ------------------
From: "sstzal/DiffTalk"
Sent: Friday, July 21, 2023, 10:37 PM
Subject: Re: [sstzal/DiffTalk] channel error (Issue #10)
Hello,
I have the same error.
How do you get the eight 16*29 features for each frame? When I use the DeepSpeech extraction (with video_fps=25 instead of the original 60), I get (1, 16, 29) for each frame. How do you extend it to (8, 16, 29)?
Thanks for your response. Yeah, I've seen that paper. I'll try it and share if I am successful :)
Hello, did you successfully reproduce this paper? In my training results, the inpainted area keeps shaking, and the training loss drops rapidly at the beginning and then oscillates within a small range. I am quite troubled by this.
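Hello, how is your reproduction going? Would you mind adding each other on WeChat to discuss our reproduction results?
Sure, add me on WeChat: lingjianman. If you can't add me, send me your WeChat ID.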
Added.
Yes, it's (1, 16, 29) for each frame after DeepSpeech. You can then concatenate the audio features of the 8 neighboring frames into a single [8,16,29] audio feature for each frame. For example, for the 10th frame, we use the audio features of [7th, 8th, ..., 10th, ..., 14th] as the smoothed audio feature of the 10th frame.
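The neighbor-window stacking described above can be sketched as follows. This is a minimal NumPy sketch, not the author's code: the function name `stack_neighbor_windows`, the split of 3 frames before and 4 frames after (chosen to match the 7th to 14th example), and zero-padding at the start and end of the sequence are all assumptions.

```python
import numpy as np

def stack_neighbor_windows(feats, win=8, left=3):
    """Turn per-frame features (N, 16, 29) into (N, win, 16, 29).

    Frame i receives frames [i - left, ..., i + (win - 1 - left)].
    Frames outside the sequence are filled with zeros (an assumption;
    the thread does not say how the edges are handled).
    """
    n = feats.shape[0]
    right = win - 1 - left
    # Pad only along the frame axis so every frame has a full window.
    padded = np.pad(feats, ((left, right), (0, 0), (0, 0)), mode="constant")
    # padded[i : i + win] covers original frames i - left .. i + right.
    return np.stack([padded[i:i + win] for i in range(n)], axis=0)

# Hypothetical input: 20 frames of DeepSpeech features, one (16, 29) block each.
feats = np.arange(20 * 16 * 29, dtype=np.float32).reshape(20, 16, 29)
windows = stack_neighbor_windows(feats)
print(windows.shape)  # (20, 8, 16, 29)
```

With these defaults, `windows[10]` holds frames 7 through 14, so each entry has the 8*16*29 layout mentioned earlier in the thread.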
When processing the audio, the network requires an input of shape (B, 16, 29), and c.reshape(-1,16,29) confirms this input shape. My audio input matches it, but at c = self.cond_stage_model_for_audio_smooth(c) I get the error RuntimeError: Given groups=1, weight of size [16, 32, 3], expected input[20, 8, 8] to have 32 channels, but got 8 channels instead