Predicting declining and growing occupations using supervised machine learning

  • Research Article
Journal of Computational Social Science

Abstract

In the United States (U.S.), structural changes in the economy remain varied, yet continuous, prompting the need for regular analyses of both declining and growing occupations. As automation, robotization, and digitization continue to accelerate and drive new patterns of economic change, so does the need for proactive programs and policies aimed at targeted workforce re-training. Applying machine learning (ML) to occupational data provides one potential approach to inform such workforce initiatives, specifically by helping to predict both declining and growing occupations with high accuracy. In this paper, we examine the extent to which occupational attributes are predictive of the declining and growing status of jobs in the State of Ohio (USA). In particular, we examine the results from five distinct supervised ML models (i.e., multinomial logistic regression, nearest neighbors, random forest, adaptive boosting, and gradient boosting), using data on the characteristics of occupations from O*NET, as well as information on employment changes from the U.S. Bureau of Labor Statistics. We found that the random forest and gradient boosting models perform best, predicting declining and growing jobs in Ohio at roughly 92% accuracy in the test set. Moreover, our analysis revealed that the most important features in predicting declining occupations are physical (e.g., spending time making repetitive motions), while the most important features in predicting growing occupations are related to obtaining information and communication. Our method can be replicated at a local or regional level to help practitioners predict future occupational shifts, ultimately enhancing economic and workforce development efforts.
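The model comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the O*NET and BLS data; the three class labels, the sample size, the number of features, and every hyperparameter below are illustrative assumptions rather than the authors' settings, so the printed accuracies say nothing about the reported 92% figure.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for occupation-level data (NOT the O*NET/BLS data):
# rows are hypothetical occupations, columns are hypothetical attribute
# scores, and the label is one of three assumed classes
# (0 = declining, 1 = stable, 2 = growing).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=8,
    n_classes=3, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# The five model families named in the abstract, with assumed settings.
# Distance- and gradient-based models are scaled first; trees are not.
models = {
    "multinomial logistic regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "nearest neighbors": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)
    ),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "adaptive boosting": AdaBoostClassifier(random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

# Fit each model on the training split and score it on the held-out split.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = accuracy_score(y_test, model.predict(X_test))

# Impurity-based feature importances from the fitted forest, highest first.
forest = models["random forest"]
top_features = sorted(
    enumerate(forest.feature_importances_), key=lambda kv: kv[1], reverse=True
)[:5]

for name, acc in test_accuracy.items():
    print(f"{name}: {acc:.3f}")
print("top feature indices:", [i for i, _ in top_features])
```

With real data, the feature indices would map back to named occupational attributes, which is how a ranking such as "repetitive motions" versus "obtaining information" could be read off. Impurity-based importances are only one possible measure and are known to be biased in some settings.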


Data availability statement

The datasets analyzed in this study are derived from the following public domain resources: O*NET: https://www.onetonline.org/help/onet/database; BLS OES: https://www.bls.gov/oes/data-overview.htm.

Notes

  1. For an overview on O*NET data, visit https://www.onetcenter.org/programvideos.html#overview.

  2. Employment in the gaming cage workers occupation grew in Ohio from 40 employees in 2011 to 400 employees in 2018. The occupation was therefore removed as an outlier.
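To put the scale of the change in note 2 in perspective, the figures amount to a tenfold increase in employment. A minimal sketch of the arithmetic follows; the function name is hypothetical, and the note does not state what threshold, if any, was used to flag outliers.

```python
def percent_change(start: float, end: float) -> float:
    """Relative change from start to end, expressed in percent."""
    if start == 0:
        raise ValueError("start employment must be non-zero")
    return (end - start) / start * 100.0


# Figures taken from note 2: 40 employees in 2011, 400 employees in 2018.
gaming_cage_growth = percent_change(40, 400)
print(f"{gaming_cage_growth:.0f}%")  # 900%
```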


Funding

The authors declare that no funds, grants, or other support were received during the preparation of this manuscript.

Author information

Corresponding author

Correspondence to Christelle Khalaf.

Ethics declarations

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A

See Figs. 3–7 and Tables 4 and 5.

Fig. 3 Receiver operating characteristic, multinomial logistic regression

Fig. 4 Receiver operating characteristic, nearest neighbors

Fig. 5 Receiver operating characteristic, random forest

Fig. 6 Receiver operating characteristic, adaptive boosting

Fig. 7 Receiver operating characteristic, gradient boosting

Table 4 Predicting declining and growing occupations (full sample)
Table 5 Predicting declining and growing occupations (test set)

Appendix B

See Table 6.

Table 6 Important occupational skills predictors

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Khalaf, C., Michaud, G. & Jolley, G.J. Predicting declining and growing occupations using supervised machine learning. J Comput Soc Sc 6, 757–780 (2023). https://doi.org/10.1007/s42001-023-00211-0

  • DOI: https://doi.org/10.1007/s42001-023-00211-0
