
A question about the model overview #8

Open
wangxin-fighting opened this issue Aug 10, 2024 · 2 comments

Comments

@wangxin-fighting

[Image: model overview]

Hello, I have two questions.
1. In the model overview in Figure 2 of the paper, what is the difference between X and X_{k-1}?
2. How are the different Transformer layers connected? For example, I see that your code only has an encoder and no decoder. What is the output of the first Transformer (encoder) layer, and how is it passed to the second layer as input?

@xcyao00
Owner

xcyao00 commented Aug 10, 2024

  1. X is the input feature fed into the whole Transformer reconstruction network, while X_{k-1} denotes the feature at layer k inside the network; hence X_0 is simply X.
  2. The Transformer layers are just stacked directly on top of each other; in the code this is a stack of EncoderLayer modules. A Transformer encoder and decoder are actually quite similar: in language models the decoder has cross-attention, whereas vision models generally use only an encoder without a decoder, as in ViT. Our model is a sequence of EncoderLayers composed one after another. Since this Transformer is used to reconstruct the input features, calling it a decoder might actually be more appropriate.
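The stacking described above can be sketched in PyTorch. This is a minimal illustration, not the repository's actual code: the class name `FeatureReconstructor` and all hyperparameters are hypothetical, and `nn.TransformerEncoderLayer` stands in for the repository's own EncoderLayer. It shows X_0 = X and each layer consuming the previous layer's output:

```python
import torch
import torch.nn as nn

class FeatureReconstructor(nn.Module):
    """Hypothetical sketch: K encoder layers stacked sequentially.
    X_0 is the input feature X; layer k maps X_{k-1} -> X_k."""
    def __init__(self, d_model=64, nhead=4, num_layers=3):
        super().__init__()
        # Plain sequential stack of encoder layers, no decoder.
        self.layers = nn.ModuleList(
            [nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
             for _ in range(num_layers)]
        )

    def forward(self, x):
        # x is X_0 (= X); each layer takes the previous layer's output.
        for layer in self.layers:
            x = layer(x)
        return x  # X_K: the reconstructed features

model = FeatureReconstructor()
X = torch.randn(2, 10, 64)  # (batch, tokens, d_model)
out = model(X)
assert out.shape == X.shape  # reconstruction keeps the feature shape
```

The key point is that there is no separate encoder-to-decoder hand-off: the output tensor of layer k-1 is passed directly as the input of layer k, and the final layer's output is the reconstruction of X.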

@wangxin-fighting
Author

Thank you for the prompt reply. Best wishes for your work and studies.
