A curated list of Early Exiting papers, benchmarks, and miscellaneous resources. Currently, the resources listed in this repo are mainly from the field of natural language processing. Papers on early exiting in other fields (e.g., computer vision) are also welcome. (This repo is constantly updated.)
Early Exiting is an efficiency technique that trains a deep model with multiple injected internal classifiers (exits), so that test samples can selectively exit at an intermediate layer instead of passing through the entire model.
Early exiting methods usually add internal classifiers to different layers of a model. By training these internal classifiers with the ground truth, the model has a chance to predict the correct label and exit earlier during inference. Current early exiting methods can be divided into two branches: 1. Dynamic methods and 2. Static methods.
Dynamic Early Exiting methods typically have two steps: (a) Training the internal classifiers on downstream tasks to make them capable of making predictions, (b) Designing an exiting strategy to decide whether to exit early or continue to the next layer.
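Step (b) can be illustrated with a minimal sketch. It assumes one particular exiting strategy, an entropy threshold on each internal classifier's output distribution; this is only one of several possible criteria, and the function names and the `threshold` parameter are hypothetical. The per-layer logits are passed as an iterable, so a lazy generator would let later layers go uncomputed once a sample exits.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def early_exit_predict(layer_logits, threshold):
    """Return (predicted_class, exit_layer), with layers numbered from 1.

    `layer_logits` yields the logits of each internal classifier in layer
    order. The sample exits at the first layer whose prediction entropy
    falls below `threshold` (assumed strategy); otherwise the prediction
    of the last layer is used.
    """
    prediction, layer = None, 0
    for layer, logits in enumerate(layer_logits, start=1):
        probs = softmax(logits)
        prediction = probs.index(max(probs))
        if entropy(probs) < threshold:
            break  # confident enough: skip the remaining layers
    return prediction, layer


# Toy logits for a 3-layer model on a 2-class task (made-up numbers).
toy_logits = [[0.1, 0.0], [5.0, 0.0], [0.0, 9.0]]
print(early_exit_predict(toy_logits, threshold=0.1))
```

A lower threshold makes the model exit later (favoring accuracy), while a higher threshold makes it exit earlier (favoring speed).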
Static Early Exiting methods assign each test sample to a specific layer by learning the difficulty of the samples or heuristically pre-defining the assignment of samples.
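The static setting can be sketched in the same spirit. The example below assumes a purely illustrative pre-defined rule, a deterministic hash of the input text, to pick the exit layer before any forward computation; it is not the assignment rule of any specific paper listed here. `assign_exit_layer`, `static_exit_predict`, and the `layer_classifiers` callables are hypothetical stand-ins for a real model's internal classifiers.

```python
import hashlib


def assign_exit_layer(sample_text, num_layers):
    """Map a sample to a fixed exit layer in 1..num_layers.

    Uses a deterministic hash (illustrative assumption), so the same
    sample is always assigned to the same layer.
    """
    digest = hashlib.md5(sample_text.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % num_layers + 1


def static_exit_predict(sample_text, layer_classifiers):
    """Return (prediction, exit_layer) using the pre-assigned layer only.

    `layer_classifiers` is a list of callables, one per layer, each
    standing in for "run the model up to this layer and classify".
    """
    layer = assign_exit_layer(sample_text, len(layer_classifiers))
    return layer_classifiers[layer - 1](sample_text), layer


# Dummy classifiers that simply report their own layer index.
dummy_classifiers = [lambda text, i=i: i for i in range(1, 5)]
print(static_exit_predict("an example sentence", dummy_classifiers))
```

Unlike the dynamic setting, no confidence check happens during inference: the exit layer is known before the forward pass starts.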
Towards Efficient NLP: A Standard Evaluation and A Strong Baseline. NAACL 2022.
Xiangyang Liu*, Tianxiang Sun*, Junliang He, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu. [pdf][website]
-
Dynamic Neural Networks: A Survey. Preprint Feb 2021.
Yizeng Han, Gao Huang, Shiji Song, Le Yang, Honghui Wang, Yulin Wang. [pdf]
-
A Survey on Dynamic Neural Networks for Natural Language Processing. Preprint Feb 2022.
Canwen Xu, Julian McAuley. [pdf]
-
Split Computing and Early Exiting for Deep Learning Applications: Survey and Research Challenges. Preprint Mar 2021.
Yoshitomo Matsubara, Marco Levorato, Francesco Restuccia. [pdf]
-
Adaptive Inference through Early-Exit Networks: Design, Challenges and Directions. EMDL 2021.
Stefanos Laskaridis, Alexandros Kouris, Nicholas D. Lane. [pdf]
-
A Survey on Green Deep Learning. Preprint 2021.
Jingjing Xu, Wangchunshu Zhou, Zhiyi Fu, Hao Zhou, Lei Li. [pdf]
-
Efficient Methods for Natural Language Processing: A Survey. Preprint 2022.
Marcos Treviso, Tianchu Ji, Ji-Ung Lee, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Pedro H. Martins, André F. T. Martins, Peter Milder, Colin Raffel, Edwin Simpson, Noam Slonim, Niranjan Balasubramanian, Leon Derczynski, Roy Schwartz. [pdf]
-
DeeBERT: Dynamic Early Exiting for Accelerating BERT Inference. ACL 2020.
Ji Xin, Raphael Tang, Jaejun Lee, Yaoliang Yu, and Jimmy Lin. [pdf]
-
The Right Tool for the Job: Matching Model and Instance Complexities. ACL 2020.
Roy Schwartz, Gabriel Stanovsky, Swabha Swayamdipta, Jesse Dodge, and Noah A. Smith. [pdf]
-
FastBERT: a Self-distilling BERT with Adaptive Inference Time. ACL 2020.
Weijie Liu, Peng Zhou, Zhiruo Wang, Zhe Zhao, Haotang Deng, and Qi Ju. [pdf]
-
Early Exiting BERT for Efficient Document Ranking. ACL 2020.
Ji Xin, Rodrigo Nogueira, Yaoliang Yu, Jimmy Lin. [pdf]
-
BERT Loses Patience: Fast and Robust Inference with Early Exit. NeurIPS 2020.
Wangchunshu Zhou, Canwen Xu, Tao Ge, Julian McAuley, Ke Xu, Furu Wei. [pdf]
-
DynaBERT: Dynamic BERT with Adaptive Width and Depth. NeurIPS 2020.
Lu Hou, Zhiqi Huang, Lifeng Shang, Xin Jiang, Xiao Chen, Qun Liu. [pdf]
-
A Global Past-Future Early Exit Method for Accelerating Inference of Pre-trained Language Models. NAACL 2021.
Kaiyuan Liao, Yi Zhang, Xuancheng Ren, Qi Su, Xu Sun, Bin He. [pdf]
-
RomeBERT: Robust Training of Multi-Exit BERT. Preprint Jan 2021.
Shijie Geng, Peng Gao, Zuohui Fu, Yongfeng Zhang. [pdf]
-
BERxiT: Early Exiting for BERT with Better Fine-Tuning and Extension to Regression. EACL 2021.
Ji Xin, Raphael Tang, Yaoliang Yu, Jimmy Lin. [pdf]
-
Accelerating BERT Inference for Sequence Labeling via Early-Exit. ACL 2021.
Xiaonan Li, Yunfan Shao, Tianxiang Sun, Hang Yan, Xipeng Qiu, Xuanjing Huang. [pdf]
-
LeeBERT: Learned Early Exit for BERT with Cross-Level Optimization. ACL 2021.
Wei Zhu. [pdf]
-
TR-BERT: Dynamic Token Reduction for Accelerating BERT Inference. ACL 2021.
Deming Ye, Yankai Lin, Yufei Huang, Maosong Sun. [pdf]
-
EBERT: Efficient BERT Inference with Dynamic Structured Pruning. ACL 2021.
Zejian Liu, Fanrong Li, Gang Li, Jian Cheng. [pdf]
-
Class Means as an Early Exit Decision Mechanism. Preprint Mar 2021.
Alperen Gormez, Erdem Koyuncu. [pdf]
-
Early Exiting with Ensemble Internal Classifiers. Preprint May 2021.
Tianxiang Sun, Yunhua Zhou, Xiangyang Liu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu. [pdf]
-
ELBERT: Fast Albert with Confidence-Window Based Early Exit. Preprint Jul 2021.
Keli Xie, Siyuan Lu, Meiqi Wang, Zhongfeng Wang. [pdf]
-
CascadeBERT: Accelerating Inference of Pre-trained Language Models via Calibrated Complete Models Cascade. EMNLP 2021.
Lei Li, Yankai Lin, Deli Chen, Shuhuai Ren, Peng Li, Jie Zhou, Xu Sun. [pdf]
-
Consistent Accelerated Inference via Confident Adaptive Transformers. EMNLP 2021.
Tal Schuster, Adam Fisch, Tommi Jaakkola, Regina Barzilay. [pdf]
-
DACT-BERT: Differentiable Adaptive Computation Time for an Efficient BERT Inference. Preprint Sep 2021.
Cristóbal Eyzaguirre, Felipe del Río, Vladimir Araujo, Álvaro Soto. [pdf]
-
Towards Efficient NLP: A Standard Evaluation and A Strong Baseline. NAACL 2022.
Xiangyang Liu*, Tianxiang Sun*, Junliang He, Lingling Wu, Xinyu Zhang, Hao Jiang, Zhao Cao, Xuanjing Huang, Xipeng Qiu. [pdf]
-
GAML-BERT: Improving BERT Early Exiting by Gradient Aligned Mutual Learning. EMNLP 2021
Wei Zhu, Xiaoling Wang, Yuan Ni, Guotong Xie. [pdf]
-
PALBERT: Teaching ALBERT to Ponder. NeurIPS 2022
Nikita Balagansky, Daniil Gavrilov. [pdf]
-
TangoBERT: Reducing Inference Cost by using Cascaded Architecture. Preprint 2022
Jonathan Mamou, Oren Pereg, Moshe Wasserblat, Roy Schwartz. [pdf]
-
AdapLeR: Speeding up Inference by Adaptive Length Reduction. ACL 2022
Ali Modarressi, Hosein Mohebbi, Mohammad Taher Pilehvar. [pdf]
-
BE3R: BERT based Early-Exit Using Expert Routing. KDD 2022
Sourab Mangrulkar, Ankith M S, Vivek Sembium. [pdf]
-
Confident Adaptive Language Modeling. NeurIPS 2022
Tal Schuster, Adam Fisch, Jai Gupta, Mostafa Dehghani, Dara Bahri, Vinh Q. Tran, Yi Tay, Donald Metzler. [pdf]
-
COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models. EMNLP 2022
Bowen Shen, Zheng Lin, Yuanxin Liu, Zhengxiao Liu, Lei Wang, Weiping Wang. [pdf]
-
E-LANG: Energy-Based Joint Inferencing of Super and Swift Language Models. ACL 2022
Mohammad Akbari, Amin Banitalebi-Dehkordi, Yong Zhang. [pdf]
-
Length-Adaptive Transformer: Train Once with Length Drop, Use Anytime with Search. ACL 2021
Gyuwan Kim, Kyunghyun Cho. [pdf]
-
Model Cascading: Towards Jointly Improving Efficiency and Accuracy of NLP Systems. EMNLP 2022
Neeraj Varshney, Chitta Baral. [pdf]
-
PCEE-BERT: Accelerating BERT Inference via Patient and Confident Early Exiting. NAACL 2022
Zhen Zhang, Wei Zhu, Jinfan Zhang, Peng Wang, Rize Jin, Tae-Sun Chung. [pdf]
-
Pyramid-BERT: Reducing Complexity via Successive Core-set based Token Selection. ACL 2022
Xin Huang, Ashish Khetan, Rene Bidart, Zohar Karnin. [pdf]
-
SkipBERT: Efficient Inference with Shallow Layer Skipping. ACL 2022
Jue Wang, Ke Chen, Gang Chen, Lidan Shou, Julian McAuley. [pdf]
-
Transkimmer: Transformer Learns to Layer-wise Skim. ACL 2022
Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo. [pdf]
-
Unsupervised Early Exit in DNNs with Multiple Exits. Preprint 2022
Hari Narayan N U, Manjesh K. Hanawal, Avinash Bhardwaj. [pdf]
-
PoWER-BERT: Accelerating BERT Inference via Progressive Word-vector Elimination. ICML 2020
Saurabh Goyal, Anamitra R. Choudhury, Saurabh M. Raje, Venkatesan T. Chakaravarthy, Yogish Sabharwal, Ashish Verma. [pdf]
-
Accelerating Inference for Pretrained Language Models by Unified Multi-Perspective Early Exiting. COLING 2022
Jun Kong, Jin Wang, Liang-Chih Yu, Xuejie Zhang. [pdf]
-
Depth-Adaptive Transformer. ICLR 2020.
Maha Elbayad, Jiatao Gu, Edouard Grave, Michael Auli. [pdf]
-
Faster Depth-Adaptive Transformers. AAAI 2021.
Yijin Liu, Fandong Meng, Jie Zhou, Yufeng Chen, Jinan Xu. [pdf]
-
A Simple Hash-Based Early Exiting Approach For Language Understanding and Generation. Findings of ACL 2022.
Tianxiang Sun, Xiangyang Liu, Wei Zhu, Zhichao Geng, Lingling Wu, Yilong He, Yuan Ni, Guotong Xie, Xuanjing Huang, Xipeng Qiu. [pdf]
👍🎉 First off, thanks for taking the time to contribute! 🎉👍
Steps to contribute:
- Make your awesome changes
- Submit a pull request; if you add a new entry, please give a very brief explanation of why you think it should be added.