
channel error #10

Open
jianmanLin opened this issue Jul 19, 2023 · 20 comments

Comments

@jianmanLin

    elif cond_class == "audio":
        if self.cond_stage_forward is None:
            bs = c.shape[0] # 20
            c = c.reshape(-1,16,29) # [20, 16, 29]
            c = self.cond_stage_model_for_audio(c) # [20, 64]
            c = c.reshape(bs, 8, -1) # [20, 8, 8]
            c = self.cond_stage_model_for_audio_smooth(c)

When processing the audio, the network expects input of shape (B, 16, 29); the c.reshape(-1, 16, 29) step also confirms that expected input shape. My audio features match it, but at c = self.cond_stage_model_for_audio_smooth(c) I get: RuntimeError: Given groups=1, weight of size [16, 32, 3], expected input[20, 8, 8] to have 32 channels, but got 8 channels instead
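The failing path can be reproduced with a shape-only sketch. Here numpy stands in for the real sub-networks: `cond_stage_model_for_audio` is replaced by a random projection to 64 dims, which is an assumption about its output size based on the comments in the snippet, not the repo's actual weights:

```python
import numpy as np

def cond_stage_model_for_audio(c):
    # Stand-in for the repo's audio encoder: flattens each (16, 29)
    # feature and projects it to 64 dims with random weights
    return c.reshape(c.shape[0], -1) @ np.random.randn(16 * 29, 64)

c = np.random.randn(20, 16, 29)    # one (16, 29) feature per frame
bs = c.shape[0]                    # 20
c = c.reshape(-1, 16, 29)          # still (20, 16, 29) for this input
c = cond_stage_model_for_audio(c)  # (20, 64)
c = c.reshape(bs, 8, -1)           # (20, 8, 8)
print(c.shape)
# The smooth conv's weight has shape (16, 32, 3): it expects 32 input
# channels, but dim 1 of this tensor is 8, hence the RuntimeError.
```

The mismatch shows the reshape arithmetic only works out if each frame carries more than a single (16, 29) feature.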

@979277 commented Jul 19, 2023

I ran into the same error. I'm not sure whether it's caused by the way I process the audio features.

@jianmanLin (Author)

(screenshot of the audio feature shapes)

After DeepSpeech extraction and windowing, the input audio comes out as (-1, 16, 29), which is the shape the author's code accepts. It passes through self.cond_stage_model_for_audio successfully, but fails in self.cond_stage_model_for_audio_smooth. One can guess that self.cond_stage_model_for_audio_smooth outputs (-1, 32), so to get the whole pipeline running I randomly initialized a (-1, 32) tensor as the audio output and continued the inference, but I still ran into shape mismatches later on.

@jianmanLin (Author)

> I ran into the same error. I'm not sure whether it's caused by the way I process the audio features.

Hi, how did you process the audio features? Following the VOCA paper the author cites, I extracted audio features of shape (N, 16, 29).

@979277 commented Jul 19, 2023

I did it the same way as you, and the error occurs at the same point.

@jianmanLin (Author)

> I ran into the same error. I'm not sure whether it's caused by the way I process the audio features.

What puzzles me about this is the c = c.reshape(-1, 16, 29) step: it assumes the input shape is (-1, 16, 29).

@jianmanLin (Author)

> > I ran into the same error. I'm not sure whether it's caused by the way I process the audio features.
>
> What puzzles me about this is the c = c.reshape(-1, 16, 29) step: it assumes the input shape is (-1, 16, 29).

So the following layers shouldn't raise an error.

@979277 commented Jul 19, 2023

We can only wait for the author to fix it; maybe a parameter was filled in wrong in one of the steps.

@jianmanLin (Author)

> We can only wait for the author to fix it; maybe a parameter was filled in wrong in one of the steps.

My advisor asked me to reproduce this paper's baseline soon, and now I don't know how to proceed.

@jianmanLin (Author)

> I ran into the same error. I'm not sure whether it's caused by the way I process the audio features.

It must be our audio processing that is wrong: the audio branch's final output should be (B, 64), and then the whole model runs through.

@979277 commented Jul 19, 2023

> > We can only wait for the author to fix it; maybe a parameter was filled in wrong in one of the steps.
>
> My advisor asked me to reproduce this paper's baseline soon, and now I don't know how to proceed.

We just discussed it: seq_len for attnet is set to 8 in the code, so the author may have taken 8 features of 16×29 as the audio feature for one video frame.
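If that reading is right, the forward path would look roughly like this. This is only a shape sketch under that assumption; a random projection stands in for the actual audio encoder:

```python
import numpy as np

B, seq_len = 4, 8                        # seq_len = 8, as set for attnet
c = np.random.randn(B, seq_len, 16, 29)  # 8 windows of (16, 29) per frame

proj = np.random.randn(16 * 29, 64)      # stand-in for the audio encoder
c = c.reshape(B * seq_len, -1) @ proj    # (32, 64): one 64-dim code per window
c = c.reshape(B, seq_len, 64)            # (4, 8, 64): a length-8 token sequence
print(c.shape)
```

With 8 windows per frame, the reshape back to (B, seq_len, -1) yields a sequence the attention net can consume, instead of the degenerate (20, 8, 8) above.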

@jianmanLin (Author)

> > > We can only wait for the author to fix it; maybe a parameter was filled in wrong in one of the steps.
> >
> > My advisor asked me to reproduce this paper's baseline soon, and now I don't know how to proceed.
>
> We just discussed it: seq_len for attnet is set to 8 in the code, so the author may have taken 8 features of 16×29 as the audio feature for one video frame.

You're right, thanks for the explanation.

@sstzal (Owner) commented Jul 19, 2023

> We just discussed it: seq_len for attnet is set to 8 in the code, so the author may have taken 8 features of 16×29 as the audio feature for one video frame.

You are right. Sorry for missing the details. The dim of each audio feature '0_0.npy' corresponding to each frame is 8×16×29.

@Bebaam commented Jul 21, 2023

Hello,
I have the same error.
How do you get the 8×16×29 feature for each frame? When I use the DeepSpeech extraction (with video_fps=25 instead of the original 60), I get (1, 16, 29) for each frame. How do you extend it to (8, 16, 29)?

@jianmanLin (Author) commented Jul 21, 2023 via email

@Bebaam commented Jul 21, 2023

Thanks for your response. Yeah, I've seen that paper; I'll try it and share if I'm successful :)

@jianmanLin (Author)

> I ran into the same error. I'm not sure whether it's caused by the way I process the audio features.

Hello, did you successfully reproduce this paper? In my training the inpainting area keeps shaking, and the training loss drops rapidly at the beginning and then oscillates within a small range. I feel very troubled.

@979277 commented Jul 31, 2023

Hi, how is your reproduction going? Would you mind exchanging WeChat so we can discuss each other's results?

@jianmanLin (Author) commented Jul 31, 2023 via email

@979277 commented Jul 31, 2023 via email

@sstzal (Owner) commented Dec 27, 2023

> Hello, I have the same error. How do you get the 8×16×29 feature for each frame? When I use the DeepSpeech extraction (with video_fps=25 instead of the original 60), I get (1, 16, 29) for each frame. How do you extend it to (8, 16, 29)?

Yes, it's (1, 16, 29) for each frame after DeepSpeech. Then you can concatenate the audio features of its 8 neighbouring frames into an [8, 16, 29] audio feature for each frame.

For example, for the 10th frame, we use the audio features of frames [7th, 8th, ..., 10th, ..., 14th] as the smooth audio feature of the 10th frame.
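The neighbour-window gathering described above can be sketched like this. The [i-3, i+4] window and the clamping at sequence boundaries are my reading of the example, not code from the repo:

```python
import numpy as np

feats = np.random.randn(100, 16, 29)  # one DeepSpeech feature per frame

def smooth_audio_feature(feats, i):
    # Stack the 8 frames [i-3, ..., i+4] around frame i, clamping
    # indices at the start and end of the sequence
    idx = np.clip(np.arange(i - 3, i + 5), 0, len(feats) - 1)
    return feats[idx]                  # (8, 16, 29)

f10 = smooth_audio_feature(feats, 10)  # built from frames 7..14
print(f10.shape)
```

Saving one such (8, 16, 29) array per frame (e.g. as '0_0.npy') would match the per-frame feature dimension the owner describes.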
