💬 Discord
This repository aims to implement state-of-the-art (SOTA) efficient token and channel mixers. Any technique related to non-vanilla Transformers is welcome. If you are interested in this repository, please join our Discord.
- Token Mixers
  - Linear Attention
  - Linear RNN
  - Long Convolution
- Channel Mixers
- Add special init.
- LLaMA.
- Add type annotations for classes and functions.
- long_conv_1d_op.
- Gtu.