A machine learning approach to improving occupational income scores
Historical studies of labor markets frequently lack data on individual income. The occupational income score (OCCSCORE) is often used as an alternative measure of labor market outcomes. We consider the consequences of using OCCSCORE when researchers are interested in earnings regressions. We estimate race and gender earnings gaps in modern decennial Censuses as well as the 1915 Iowa State Census. Using OCCSCORE biases results towards zero and can result in estimated gaps of the wrong sign. We use a machine learning approach to construct a new adjusted score based on industry, occupation, and demographics. The new income score provides estimates closer to those from earnings regressions. Lastly, we consider the consequences for estimates of intergenerational mobility elasticities.
Saavedra, Martin, and Tate Twinam. 2020. "A machine learning approach to improving occupational income scores." Explorations in Economic History 75: 101304.
Explorations in Economic History
OCCSCORE, Occupational income score, LIDO Score, Machine learning, Lasso, Non-classical measurement error, Occupation, Earnings gaps