Degree Year
2020
Document Type
Thesis - Open Access
Degree Name
Bachelor of Arts
Department
Computer Science
Advisor(s)
John L. Donaldson
Keywords
Machine learning, NLP, Deep learning
Abstract
Since the first bidirectional deep learning model for natural language understanding, BERT, emerged in 2018, researchers have started to study and use pretrained bidirectional autoencoding or autoregressive models to solve language problems. In this project, I conducted research to fully understand BERT and XLNet and applied their pretrained models to two language tasks: reading comprehension (RACE) and part-of-speech tagging (the Penn Treebank). After experimenting with those released models, I implemented my own version of ELECTRA, which pretrains a text encoder as a discriminator rather than a generator to improve compute efficiency, using BERT as its underlying architecture. To reduce the number of parameters, I replaced BERT with ALBERT in ELECTRA and named the new model ALE (A Lite ELECTRA). I compared the performance of BERT, ELECTRA, and ALE on the GLUE benchmark dev set after pretraining them on the same datasets for the same number of training FLOPs.
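For readers unfamiliar with the ELECTRA objective the abstract refers to, the sketch below illustrates the replaced-token-detection idea in PyTorch: a small generator fills in masked tokens, and a discriminator is trained to flag, per token, which positions were replaced. This is a minimal illustration only; the module names, sizes, toy Transformer encoders, and loss weighting shown here are assumptions for the example, not the implementation used in the thesis.

```python
# Minimal sketch of ELECTRA-style replaced-token-detection pretraining.
# All hyperparameters and module names below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, HIDDEN, MAX_LEN, MASK_ID = 1000, 64, 32, 0


class TinyEncoder(nn.Module):
    """Toy stand-in for a BERT/ALBERT-style Transformer encoder."""

    def __init__(self, hidden=HIDDEN):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, hidden)
        self.pos = nn.Embedding(MAX_LEN, hidden)
        layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        return self.encoder(self.embed(ids) + self.pos(pos))


class Generator(nn.Module):
    """Small masked-LM that proposes replacements for masked positions."""

    def __init__(self):
        super().__init__()
        self.encoder = TinyEncoder()
        self.lm_head = nn.Linear(HIDDEN, VOCAB)

    def forward(self, ids):
        return self.lm_head(self.encoder(ids))  # (batch, seq, vocab) logits


class Discriminator(nn.Module):
    """ELECTRA-style encoder predicting original vs. replaced per token."""

    def __init__(self):
        super().__init__()
        self.encoder = TinyEncoder()
        self.head = nn.Linear(HIDDEN, 1)

    def forward(self, ids):
        return self.head(self.encoder(ids)).squeeze(-1)  # (batch, seq) logits


def pretrain_step(gen, disc, ids, mask_prob=0.15):
    # 1) Mask a random subset of input tokens.
    mask = torch.rand(ids.shape, device=ids.device) < mask_prob
    masked_ids = ids.masked_fill(mask, MASK_ID)
    # 2) Generator fills in the masked positions (standard MLM loss).
    gen_logits = gen(masked_ids)
    mlm_loss = F.cross_entropy(gen_logits[mask], ids[mask])
    # 3) Sample replacements from the generator to build the corrupted input.
    with torch.no_grad():
        sampled = torch.distributions.Categorical(logits=gen_logits).sample()
    corrupted = torch.where(mask, sampled, ids)
    # 4) Discriminator labels every token as original (0) or replaced (1).
    labels = (corrupted != ids).float()
    disc_loss = F.binary_cross_entropy_with_logits(disc(corrupted), labels)
    # Discriminator loss is weighted more heavily (weight here is illustrative).
    return mlm_loss + 50.0 * disc_loss


if __name__ == "__main__":
    gen, disc = Generator(), Discriminator()
    ids = torch.randint(1, VOCAB, (8, MAX_LEN))  # toy batch of token ids
    loss = pretrain_step(gen, disc, ids)
    loss.backward()
    print(f"combined pretraining loss: {loss.item():.3f}")
```

Because the discriminator receives a learning signal at every token position rather than only at the masked ones, this objective is more compute-efficient than generator-style masked language modeling, which is the property the abstract highlights.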
Repository Citation
Shao, Han, "Pretraining Deep Learning Models for Natural Language Understanding" (2020). Honors Papers. 709.
https://digitalcommons.oberlin.edu/honors/709