EBERT

This repository is the official code release for the paper EBERT: Efficient BERT Inference with Dynamic Structured Pruning (published at Findings of ACL 2021).

EBERT is a dynamic structured pruning algorithm for efficient BERT inference. Unlike previous methods that prune model weights once and apply the same static network to every input, EBERT dynamically determines and prunes the unimportant heads in the multi-head self-attention layers and the unimportant structured computations in the feed-forward networks for each input sample at run time.
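
To make the idea concrete, here is a minimal conceptual sketch of per-sample dynamic head pruning (illustrative only, not the authors' implementation; the predictor design, mean pooling, and 0.5 threshold below are assumptions):

import torch
import torch.nn as nn

class HeadMaskPredictor(nn.Module):
    """Scores each attention head from the current input and emits a hard 0/1 mask."""
    def __init__(self, hidden_size: int, num_heads: int):
        super().__init__()
        self.scorer = nn.Linear(hidden_size, num_heads)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden) -> one summary vector per sample
        pooled = hidden_states.mean(dim=1)            # (batch, hidden)
        scores = torch.sigmoid(self.scorer(pooled))   # (batch, num_heads), in (0, 1)
        hard = (scores > 0.5).float()                 # per-sample pruning decisions
        # straight-through estimator: forward pass uses the hard mask,
        # gradients still flow back to the soft scores
        return hard + scores - scores.detach()

# Each head's attention output is multiplied by its mask entry, so heads with
# mask == 0 contribute nothing and their computation can be skipped at inference.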

Prerequisites

The code has the following dependencies:

  • python >= 3.8.5
  • pytorch >= 1.4.0
  • transformers == 3.3.1. Note that transformers v3.3.1 has a bug when the evaluation strategy is epoch, so you need to make the following change in the transformers library:
--- a/src/transformers/training_args.py
+++ b/src/transformers/training_args.py
@@ -323,7 +323,7 @@ class TrainingArguments:
     def __post_init__(self):
         if self.disable_tqdm is None:
             self.disable_tqdm = logger.getEffectiveLevel() > logging.WARN
-        if self.evaluate_during_training is not None:
+        if self.evaluate_during_training:
             self.evaluation_strategy = (
                 EvaluationStrategy.STEPS if self.evaluate_during_training else EvaluationStrategy.NO
             )

Usage

We provide script files for training and evaluation in the scripts folder; run them from the repo root, e.g. bash scripts/eval_glue.sh. In each script, there are several arguments to modify before running (see the sketch after this list):

  • --data_dir: path to the dataset (GLUE or SQuAD).
  • MODEL_PATH or --model_name_or_path: path to the trained model folder.
  • TASK_NAME: task name in GLUE (SST-2, MNLI, ...).
  • RUN_NAME: name of the current experiment, which determines the save path and the log name for wandb.
  • other hyper-parameters, e.g., head_mask_mode
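
For illustration, the arguments above might be wired together as follows when invoking a HuggingFace-style training/evaluation entry point; the entry-point name (run_glue.py) and all paths below are hypothetical, only --data_dir and --model_name_or_path are taken from the list above:

import subprocess

MODEL_PATH = "models/bert-base-uncased-sst2"   # hypothetical path to a trained model folder
TASK_NAME = "SST-2"                            # a GLUE task name
RUN_NAME = "ebert-sst2-eval"                   # experiment name, used for save paths and wandb logs

subprocess.run([
    "python", "run_glue.py",                   # hypothetical entry point; use the repo's actual script
    "--model_name_or_path", MODEL_PATH,
    "--task_name", TASK_NAME,
    "--data_dir", f"glue_data/{TASK_NAME}",    # hypothetical GLUE data location
    "--run_name", RUN_NAME,
    "--do_eval",
], check=True)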

You can download the original pretrained BERT and RoBERTa models from HuggingFace.
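
A minimal sketch of fetching those checkpoints with the transformers library (the local save directories are just example paths):

from transformers import AutoModel, AutoTokenizer

for name, save_dir in [("bert-base-uncased", "pretrained/bert-base-uncased"),
                       ("roberta-base", "pretrained/roberta-base")]:
    tokenizer = AutoTokenizer.from_pretrained(name)   # downloads from the HuggingFace hub
    model = AutoModel.from_pretrained(name)
    tokenizer.save_pretrained(save_dir)               # keep a local copy
    model.save_pretrained(save_dir)

These local folders can then be used as the starting checkpoints for the training scripts.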

Citation

If you find this work useful, please cite:

@inproceedings{liu-etal-2021-ebert,
    title = "{EBERT}: Efficient {BERT} Inference with Dynamic Structured Pruning",
    author = "Liu, Zejian  and
              Li, Fanrong  and
              Li, Gang  and
              Cheng, Jian",
    booktitle = "Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021",
    month = aug,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2021.findings-acl.425",
    doi = "10.18653/v1/2021.findings-acl.425",
    pages = "4814--4823",
}
