Estimating intergenerational income mobility on sub-optimal data: a machine learning approach


Much of the global evidence on intergenerational income mobility is based on sub-optimal data. In particular, two-stage techniques are widely used to impute parental incomes in analyses of lower-income countries and in estimates of long-run trends across multiple generations and historical periods. We propose applying machine learning methods to improve the reliability and comparability of such estimates. Supervised learning algorithms minimize the out-of-sample prediction error in the parental income imputation and provide an objective criterion for choosing among different specifications of the first-stage equation. We apply our approach to data from the United States and South Africa to show that, under common conditions, it can limit the bias generally associated with mobility estimates based on imputed parental income.
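The two-stage logic described above can be illustrated with a minimal sketch. It is not the authors' implementation: ridge regression stands in for whichever supervised learner is used, the data are simulated, and every name (`ridge_fit`, `cv_mse`, `tstsls_ige`, the penalty grid) is hypothetical. The first stage is fitted on an auxiliary sample where parental income is observed, the penalty is chosen by k-fold cross-validated prediction error, and the second stage regresses child income on the imputed parental income.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression with an unpenalized intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(X.shape[1]), Xc.T @ yc)
    return y_mean - x_mean @ beta, beta

def cv_mse(X, y, lam, k=5, seed=0):
    """Mean out-of-sample squared prediction error over k folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    errors = []
    for hold in folds:
        train = np.setdiff1d(np.arange(len(y)), hold)
        a, b = ridge_fit(X[train], y[train], lam)
        errors.append(np.mean((y[hold] - a - X[hold] @ b) ** 2))
    return float(np.mean(errors))

def tstsls_ige(X_aux, y_aux, X_main, child_income, lambdas):
    """Two-sample two-stage estimate of the intergenerational elasticity."""
    # First stage: pick the penalty with the lowest cross-validated error.
    scores = {lam: cv_mse(X_aux, y_aux, lam) for lam in lambdas}
    best = min(scores, key=scores.get)
    a, b = ridge_fit(X_aux, y_aux, best)
    # Impute parental income in the main sample from reported characteristics.
    parent_hat = a + X_main @ b
    # Second stage: OLS slope of child income on imputed parental income.
    slope = np.cov(parent_hat, child_income)[0, 1] / np.var(parent_hat, ddof=1)
    return float(slope), best, scores

# Simulated data (an assumption for illustration only): the predictors affect
# child income solely through parental income, and the true elasticity is 0.4.
rng = np.random.default_rng(42)
n_aux, n_main, p = 2000, 2000, 10
coef = np.r_[0.5, 0.3, 0.2, np.zeros(p - 3)]  # only three predictors matter
X_aux = rng.normal(size=(n_aux, p))
y_aux = 10 + X_aux @ coef + rng.normal(scale=0.5, size=n_aux)
X_main = rng.normal(size=(n_main, p))
parent_true = 10 + X_main @ coef + rng.normal(scale=0.5, size=n_main)
child = 6 + 0.4 * parent_true + rng.normal(scale=0.5, size=n_main)

lambdas = [0.01, 1.0, 100.0, 10000.0]
ige, best_lam, scores = tstsls_ige(X_aux, y_aux, X_main, child, lambdas)
print(f"selected penalty: {best_lam}, estimated elasticity: {ige:.3f}")
```

In this simulated setting the estimate should fall close to the elasticity built into the data, because the exclusion restriction holds by construction. With real survey data that condition is not guaranteed, which is where the choice of first-stage specification matters.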




We acknowledge financial support from the Italian Ministry of Education and Research, SIR Grant Project N. RBSI14KDMF.


Open access funding provided by Università degli Studi Roma Tre within the CRUI-CARE Agreement.

Author information


Corresponding author

Correspondence to Francesco Bloise.

Rights and permissions

Open Access. This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.

About this article

Cite this article

Bloise, F., Brunori, P. & Piraino, P. Estimating intergenerational income mobility on sub-optimal data: a machine learning approach. J Econ Inequal 19, 643–665 (2021).

Keywords

  • Intergenerational income mobility
  • Machine learning
  • Two-sample two-stage least squares

JEL classification

  • J62
  • D31
  • D63