A machine learning approach to improving occupational income scores
Abstract
Historical studies of labor markets frequently lack data on individual income. The occupational income score (OCCSCORE) is often used as an alternative measure of labor market outcomes. We consider the consequences of using OCCSCORE in place of earnings when researchers are interested in earnings regressions. We estimate race and gender earnings gaps in modern decennial Censuses as well as the 1915 Iowa State Census. Using OCCSCORE biases the estimated gaps towards zero and can produce gaps of the wrong sign. We use a machine learning approach to construct a new adjusted score based on industry, occupation, and demographics. The new income score yields estimates closer to those from regressions on actual earnings. Lastly, we consider the consequences for estimates of intergenerational mobility elasticities.
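The attenuation described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' procedure: it uses invented data, and it replaces the lasso-based score with a much simpler stand-in, namely average log earnings within each occupation-by-gender cell. All variable names and parameter values (`within_gap`, `occ_effect`, the sorting rule) are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
n, n_occ = 20000, 10

# Hypothetical data-generating process; every number here is invented.
female = rng.integers(0, 2, size=n)  # 1 = woman, 0 = man
# Women are shifted toward lower-indexed (lower-paying) occupations.
shift = female * rng.integers(0, 3, size=n)
occ = np.clip(rng.integers(0, n_occ, size=n) - shift, 0, n_occ - 1)
occ_effect = np.linspace(0.0, 1.0, n_occ)  # log-earnings premium by occupation
within_gap = -0.30                         # gap among workers in the same occupation
log_earn = 10.0 + occ_effect[occ] + within_gap * female + rng.normal(0, 0.5, size=n)


def gap(y, d):
    """Difference in group means, equal to the OLS slope of y on a binary d."""
    return y[d == 1].mean() - y[d == 0].mean()


# OCCSCORE-style score: one value per occupation, ignoring demographics.
occ_mean = np.array([log_earn[occ == k].mean() for k in range(n_occ)])
occscore = occ_mean[occ]

# Simplified adjusted score: one value per occupation-by-gender cell.
cell_mean = np.array(
    [[log_earn[(occ == k) & (female == g)].mean() for g in (0, 1)]
     for k in range(n_occ)]
)
adjusted = cell_mean[occ, female]

true_gap = gap(log_earn, female)      # gap in actual log earnings
occscore_gap = gap(occscore, female)  # captures occupational sorting only
adjusted_gap = gap(adjusted, female)  # recovers the raw gap in this toy setup

print(f"earnings gap: {true_gap:.3f}")
print(f"OCCSCORE gap: {occscore_gap:.3f}")
print(f"adjusted gap: {adjusted_gap:.3f}")
```

In this toy setup the occupation-level score captures only the part of the gap that comes from sorting across occupations, so its estimated gap is smaller in magnitude than the gap in actual earnings. The cell-mean score reproduces the raw gap exactly only because it is built from the same sample it is evaluated on. A score built from one sample and applied to another, as with historical data, would not be guaranteed to do so.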
Repository Citation
Saavedra, Martin, and Tate Twinam. 2020. "A machine learning approach to improving occupational income scores." Explorations in Economic History 75: 101304.
Publisher
Elsevier
Publication Date
1-1-2020
Publication Title
Explorations in Economic History
Department
Economics
Document Type
Article
DOI
https://dx.doi.org/10.1016/j.eeh.2019.101304
Keywords
OCCSCORE, Occupational income score, LIDO Score, Machine learning, Lasso, Non-classical measurement error, Occupation, Earnings gaps
Language
English
Format
text