Degree Year
2020
Document Type
Thesis - Open Access
Degree Name
Bachelor of Arts
Department
Computer Science
Advisor(s)
John L. Donaldson
Keywords
Machine learning, NLP, Deep learning
Abstract
Since the first bidirectional deep learning model for natural language understanding, BERT, emerged in 2018, researchers have started to study and use pretrained bidirectional autoencoding or autoregressive models to solve language problems. In this project, I conducted research to fully understand BERT and XLNet and applied their pretrained models to two language tasks: reading comprehension (RACE) and part-of-speech tagging (the Penn Treebank). After experimenting with those released models, I implemented my own version of ELECTRA, which pretrains a text encoder as a discriminator rather than a generator to improve compute efficiency, using BERT as its underlying architecture. To reduce the number of parameters, I replaced BERT with ALBERT in ELECTRA and named the new model ALE (A Lite ELECTRA). I compared the performance of BERT, ELECTRA, and ALE on the GLUE benchmark dev set after pretraining them on the same datasets for the same number of training FLOPs.
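For readers unfamiliar with the ELECTRA objective the abstract refers to, the sketch below illustrates the replaced-token-detection idea in PyTorch: a small generator fills in masked tokens, and a discriminator is trained to flag, per token, which positions were replaced. This is a minimal illustration only; the module names, sizes, toy Transformer encoders, and loss weighting shown here are assumptions for the example, not the implementation used in the thesis.

```python
# Minimal sketch of ELECTRA-style replaced-token-detection pretraining.
# All hyperparameters and module names below are illustrative assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, HIDDEN, MAX_LEN, MASK_ID = 1000, 64, 32, 0


class TinyEncoder(nn.Module):
    """Toy stand-in for a BERT/ALBERT-style Transformer encoder."""

    def __init__(self, hidden=HIDDEN):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, hidden)
        self.pos = nn.Embedding(MAX_LEN, hidden)
        layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        return self.encoder(self.embed(ids) + self.pos(pos))


class Generator(nn.Module):
    """Small masked-LM that proposes replacements for masked positions."""

    def __init__(self):
        super().__init__()
        self.encoder = TinyEncoder()
        self.lm_head = nn.Linear(HIDDEN, VOCAB)

    def forward(self, ids):
        return self.lm_head(self.encoder(ids))  # (batch, seq, vocab) logits


class Discriminator(nn.Module):
    """ELECTRA-style encoder predicting original vs. replaced per token."""

    def __init__(self):
        super().__init__()
        self.encoder = TinyEncoder()
        self.head = nn.Linear(HIDDEN, 1)

    def forward(self, ids):
        return self.head(self.encoder(ids)).squeeze(-1)  # (batch, seq) logits


def pretrain_step(gen, disc, ids, mask_prob=0.15):
    # 1) Mask a random subset of input tokens.
    mask = torch.rand(ids.shape, device=ids.device) < mask_prob
    masked_ids = ids.masked_fill(mask, MASK_ID)
    # 2) Generator fills in the masked positions (standard MLM loss).
    gen_logits = gen(masked_ids)
    mlm_loss = F.cross_entropy(gen_logits[mask], ids[mask])
    # 3) Sample replacements from the generator to build the corrupted input.
    with torch.no_grad():
        sampled = torch.distributions.Categorical(logits=gen_logits).sample()
    corrupted = torch.where(mask, sampled, ids)
    # 4) Discriminator labels every token as original (0) or replaced (1).
    labels = (corrupted != ids).float()
    disc_loss = F.binary_cross_entropy_with_logits(disc(corrupted), labels)
    # Discriminator loss is weighted more heavily (weight here is illustrative).
    return mlm_loss + 50.0 * disc_loss


if __name__ == "__main__":
    gen, disc = Generator(), Discriminator()
    ids = torch.randint(1, VOCAB, (8, MAX_LEN))  # toy batch of token ids
    loss = pretrain_step(gen, disc, ids)
    loss.backward()
    print(f"combined pretraining loss: {loss.item():.3f}")
```

Because the discriminator receives a learning signal at every token position rather than only at the masked ones, this objective is more compute-efficient than generator-style masked language modeling, which is the property the abstract highlights.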
Repository Citation
Shao, Han, "Pretraining Deep Learning Models for Natural Language Understanding" (2020). Honors Papers. 709.
https://digitalcommons.oberlin.edu/honors/709