
Purpose of using 1D convolutions #52

Open
rubbiyasultan opened this issue Sep 20, 2023 · 1 comment

Comments

@rubbiyasultan

Hello,

Could you explain the purpose of using 1D convolutions in the encoder layer?

self.attention = attention
self.conv1 = nn.Conv1d(in_channels=d_model, out_channels=d_ff, kernel_size=1)
self.conv2 = nn.Conv1d(in_channels=d_ff, out_channels=d_model, kernel_size=1)
self.norm1 = nn.LayerNorm(d_model)
self.norm2 = nn.LayerNorm(d_model)
@Leopold2333


It seems that many implementations now use Conv1d as the "MLP" projection that maps the original input into latent-space embeddings, especially Transformer variants from the last few years. It may simply be a matter of convention, so you could swap in a Linear layer and compare the results yourself. It's also worth noting that Conv layers come with weight-initialization schemes such as Kaiming He's method, which may lead to better performance.
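
As a side note (not from this repository), here is a minimal PyTorch sketch of why a Conv1d with kernel_size=1 behaves like a position-wise Linear layer: once the weights are copied over, the two produce identical outputs, and the only difference is the expected tensor layout ([batch, channels, seq_len] for Conv1d vs. [batch, seq_len, d_model] for Linear). All names and shapes below are illustrative.

import torch
import torch.nn as nn

d_model, d_ff, batch, seq_len = 16, 64, 2, 10

conv = nn.Conv1d(in_channels=d_model, out_channels=d_ff, kernel_size=1)
linear = nn.Linear(d_model, d_ff)

# Copy the conv weights into the linear layer; Conv1d stores them as
# [d_ff, d_model, 1], Linear as [d_ff, d_model].
with torch.no_grad():
    linear.weight.copy_(conv.weight.squeeze(-1))
    linear.bias.copy_(conv.bias)

x = torch.randn(batch, seq_len, d_model)             # [batch, seq_len, d_model]
out_conv = conv(x.transpose(1, 2)).transpose(1, 2)   # Conv1d expects [batch, channels, seq_len]
out_linear = linear(x)                                # Linear acts on the last dim directly

print(torch.allclose(out_conv, out_linear, atol=1e-6))  # prints True

If you also want to test the initialization point, applying nn.init.kaiming_normal_(conv.weight, nonlinearity='relu') before training is a standard way to use He initialization on the conv weights; whether it helps on this particular model is something you would need to check empirically.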
