Strategic Workforce Planning with Deep Reinforcement Learning

  • Conference paper
Machine Learning, Optimization, and Data Science (LOD 2022)

Abstract

This paper presents a simulation-optimization approach to strategic workforce planning based on deep reinforcement learning. A domain expert expresses the organization’s high-level, strategic workforce goals over the workforce composition. A policy that optimizes these goals is then learned in a simulation-optimization loop. Any suitable simulator can be used, and we describe how a simulator can be derived from historical data. The optimizer is driven by deep reinforcement learning and directly optimizes for the high-level strategic goals as a result. We compare the proposed approach with a linear programming-based approach on two types of workforce goals. The first type of goal, consisting of a target workforce, is relatively easy to optimize for but hard to specify in practice and is called operational in this work. The second, strategic, type of goal is a possibly non-linear combination of high-level workforce metrics. These goals can easily be specified by domain experts but may be hard to optimize for with existing approaches. The proposed approach performs significantly better on the strategic goal while performing comparably on the operational goal for both a synthetic and a real-world organization. Our novel approach based on deep reinforcement learning and simulation-optimization has a large potential for impact in the workforce planning domain. It directly optimizes for an organization’s workforce goals that may be non-linear in the workforce composition and composed of arbitrary workforce composition metrics.

Y. Smit and F. den Hengst—Authors contributed equally.

Notes

  1. The average number of direct reports of managers in the organization.

  2. A metric to express the responsibilities and expectations of a role in the organization, usually associated with compensation in some way.

  3. For a model with \(n=30\) cohorts and a maximum of \(X_{\text{max}}=100\) employees per cohort, the number of transitions in the Markov chain is \(|\mathcal{S}\times \mathcal{S}| = \prod_{i=1}^{n}(X_{\text{max}}+1)^2 = (X_{\text{max}}+1)^{2n} = 101^{60} \approx 10^{120}\).

  4. Code and data for the hypothetical use case are available at https://github.com/ysmit933/swp-with-drl-release. Data for the real-life use case will be made available upon request.
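
The order of magnitude quoted in note 3 can be checked with a few lines of arithmetic. The sketch below assumes, as the formula in the note suggests, that each of the \(n\) cohorts independently holds between 0 and \(X_{\text{max}}\) employees, so that the state space has \((X_{\text{max}}+1)^n\) elements and the set of state pairs has that number squared. The function name is illustrative and is not taken from the authors' code.

```python
import math


def log10_transition_count(n_cohorts: int, max_per_cohort: int) -> float:
    """Return log10 of |S x S| for a cohort-based workforce model.

    Assumption: each cohort independently holds 0..max_per_cohort
    employees, so |S| = (max_per_cohort + 1) ** n_cohorts and
    |S x S| = |S| ** 2.
    """
    levels_per_cohort = max_per_cohort + 1
    # log10((levels ** n) ** 2) = 2 * n * log10(levels)
    return 2 * n_cohorts * math.log10(levels_per_cohort)


if __name__ == "__main__":
    exponent = log10_transition_count(n_cohorts=30, max_per_cohort=100)
    print(f"|S x S| is roughly 10^{exponent:.1f}")
```

Working in logarithms avoids printing a 121-digit integer; with the values from the note, the exponent falls between 120 and 121, in line with the stated \(10^{120}\).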

Author information

Corresponding author

Correspondence to Floris den Hengst.

Copyright information

© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Smit, Y., den Hengst, F., Bhulai, S., Mehdad, E. (2023). Strategic Workforce Planning with Deep Reinforcement Learning. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2022. Lecture Notes in Computer Science, vol 13811. Springer, Cham. https://doi.org/10.1007/978-3-031-25891-6_9

  • DOI: https://doi.org/10.1007/978-3-031-25891-6_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-25890-9

  • Online ISBN: 978-3-031-25891-6

  • eBook Packages: Computer Science, Computer Science (R0)
