AUBER: Automated BERT regularization
- PMID: 34181664
- PMCID: PMC8238198
- DOI: 10.1371/journal.pone.0253241
AUBER: Automated BERT regularization
Abstract
How can we effectively regularize BERT? Although BERT proves its effectiveness in various NLP tasks, it often overfits when there are only a small number of training instances. A promising direction to regularize BERT is based on pruning its attention heads with a proxy score for head importance. However, these methods are usually suboptimal since they resort to arbitrarily determined numbers of attention heads to be pruned and do not directly aim for the performance enhancement. In order to overcome such a limitation, we propose AUBER, an automated BERT regularization method, that leverages reinforcement learning to automatically prune the proper attention heads from BERT. We also minimize the model complexity and the action search space by proposing a low-dimensional state representation and dually-greedy approach for training. Experimental results show that AUBER outperforms existing pruning methods by achieving up to 9.58% better performance. In addition, the ablation study demonstrates the effectiveness of design choices for AUBER.
Conflict of interest statement
The authors have declared that no competing interests exist.
Figures
References
-
- Devlin J, Chang MW, Lee K, Toutanova K. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: NAACL-HLT; 2019.
-
- Phang J, Févry T, Bowman SR. Sentence Encoders on STILTs: Supplementary Training on Intermediate Labeled-data Tasks. CoRR. 2018;abs/1811.01088.
-
- Zhang T, Wu F, Katiyar A, Weinberger KQ, Artzi Y. Revisiting Few-sample BERT Fine-tuning. CoRR. 2020;abs/2006.05987.
-
- Michel P, Levy O, Neubig G. Are Sixteen Heads Really Better than One? In: NeurIPS; 2019.
-
- Voita E, Talbot D, Moiseev F, Sennrich R, Titov I. Analyzing Multi-Head Self-Attention: Specialized Heads Do the Heavy Lifting, the Rest Can Be Pruned. In: ACL; 2019.
Publication types
MeSH terms
LinkOut - more resources
Full Text Sources
