The-Benefits-of-Normalization-Layers-in-Transformers

This is Kangqi's final project for DDA6202 Optimization Models and Methods in Machine Learning. The project studies the choice of normalization layer in the transformer encoder. To this end, I applied several tests proposed by previous researchers.