Strategic Workforce Planning with Deep Reinforcement Learning

Smit, Yannick; den Hengst, Floris; Bhulai, Sandjai; Mehdad, Ehsan

doi:10.1007/978-3-031-25891-6_9

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13811)

Included in the following conference series:

International Conference on Machine Learning, Optimization, and Data Science

Abstract

This paper presents a simulation-optimization approach to strategic workforce planning based on deep reinforcement learning. A domain expert expresses the organization’s high-level, strategic workforce goals over the workforce composition. A policy that optimizes these goals is then learned in a simulation-optimization loop. Any suitable simulator can be used, and we describe how a simulator can be derived from historical data. The optimizer is driven by deep reinforcement learning and directly optimizes for the high-level strategic goals as a result. We compare the proposed approach with a linear programming-based approach on two types of workforce goals. The first type of goal, consisting of a target workforce, is relatively easy to optimize for but hard to specify in practice and is called operational in this work. The second, strategic, type of goal is a possibly non-linear combination of high-level workforce metrics. These goals can easily be specified by domain experts but may be hard to optimize for with existing approaches. The proposed approach performs significantly better on the strategic goal while performing comparably on the operational goal for both a synthetic and a real-world organization. Our novel approach based on deep reinforcement learning and simulation-optimization has a large potential for impact in the workforce planning domain. It directly optimizes for an organization’s workforce goals that may be non-linear in the workforce composition and composed of arbitrary workforce composition metrics.
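The simulation-optimization loop described in the abstract can be illustrated with a toy example. The following is only a minimal sketch under stated assumptions, not the authors' implementation: the two-cohort workforce, the independent-attrition simulator, the span-of-control target of 8, and all function names are hypothetical. The hand-written `policy` callable stands in for the policy that the paper learns with deep reinforcement learning.

```python
import random

MANAGERS, STAFF = 0, 1  # hypothetical two-cohort workforce


def simulate_step(headcount, hires, leave_prob, rng):
    """One simulated period: each employee leaves independently, then hires join."""
    next_headcount = []
    for count, hired, p in zip(headcount, hires, leave_prob):
        stayers = sum(1 for _ in range(count) if rng.random() >= p)
        next_headcount.append(stayers + hired)
    return next_headcount


def strategic_reward(headcount, target_span=8.0):
    """Negative distance between the staff-per-manager ratio and a target.

    The ratio of two cohort sizes is non-linear in the workforce composition.
    The target value is an assumption made for this sketch.
    """
    if headcount[MANAGERS] == 0:
        return -target_span
    span = headcount[STAFF] / headcount[MANAGERS]
    return -abs(span - target_span)


def rollout(policy, start, leave_prob, horizon, seed=0):
    """Run a policy in the simulator and return the summed reward."""
    rng = random.Random(seed)
    headcount, total = list(start), 0.0
    for _ in range(horizon):
        hires = policy(headcount)
        headcount = simulate_step(headcount, hires, leave_prob, rng)
        total += strategic_reward(headcount)
    return total


def no_hiring(headcount):
    """Placeholder policy; a learned policy would replace this function."""
    return [0, 0]


if __name__ == "__main__":
    print(rollout(no_hiring, start=[10, 80], leave_prob=[0.05, 0.10], horizon=12))
```

An optimizer would call `rollout` repeatedly and adjust the policy to increase the returned value. Because the reward is computed from the simulated workforce itself, it can be any function of the composition, linear or not.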

Y. Smit and F. den Hengst contributed equally.

Notes

1. The average number of direct reports of managers in the organization.
2. A metric to express responsibilities and expectations of a role in the organization, usually associated with compensation in some way.
3. For a model with \(n=30\) cohorts and a maximum of \(X_{\text{max}}=100\) employees per cohort, the number of transitions in the Markov chain is \(|\mathcal{S}\times \mathcal{S}| = \prod_{i=1}^{n}(X_{\text{max}}+1)^2 = 101^{60} \approx 10^{120}\).
4. Code and data for the hypothetical use case are available at https://github.com/ysmit933/swp-with-drl-release. Real-life use case data will be made available upon request.
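
The order of magnitude quoted in Note 3 can be checked with a few lines of arithmetic. The sketch below assumes, as the note's formula does, that each cohort independently holds between 0 and the maximum number of employees, so the state space is a product over cohorts. The function names are illustrative only.

```python
import math


def transition_pairs_exact(n_cohorts: int, x_max: int) -> int:
    """Exact size of S x S when each cohort holds 0..x_max employees."""
    states = (x_max + 1) ** n_cohorts  # |S|
    return states ** 2                 # |S x S|


def transition_pairs_log10(n_cohorts: int, x_max: int) -> float:
    """Base-10 logarithm of |S x S|, computed without building the big integer."""
    return 2 * n_cohorts * math.log10(x_max + 1)


if __name__ == "__main__":
    # Values from Note 3: 30 cohorts, at most 100 employees per cohort.
    print(transition_pairs_log10(30, 100))
    print(len(str(transition_pairs_exact(30, 100))))
```

For the values in the note, the logarithm falls between 120 and 121, which agrees with the stated approximation of \(10^{120}\).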

Author information

Authors and Affiliations

Universiteit van Amsterdam, Amsterdam, The Netherlands
Yannick Smit
Vrije Universiteit Amsterdam, Amsterdam, The Netherlands
Floris den Hengst & Sandjai Bhulai
ING Bank N.V., Amsterdam, The Netherlands
Floris den Hengst & Ehsan Mehdad

Authors

Yannick Smit
Floris den Hengst
Sandjai Bhulai
Ehsan Mehdad

Corresponding author

Correspondence to Floris den Hengst.

Editor information

Editors and Affiliations

University of Catania, Catania, Italy
Giuseppe Nicosia
University of Reading, Reading, UK
Varun Ojha
University of Oxford, Oxford, UK
Emanuele La Malfa
University of Cambridge, Cambridge, UK
Gabriele La Malfa
University of Florida, Gainesville, FL, USA
Panos Pardalos
Free University of Bozen-Bolzano, Bolzano, Italy
Giuseppe Di Fatta
University of Catania, Catania, Italy
Giovanni Giuffrida
Dana-Farber Cancer Institute, Boston, MA, USA
Renato Umeton

About this paper

Cite this paper

Smit, Y., den Hengst, F., Bhulai, S., Mehdad, E. (2023). Strategic Workforce Planning with Deep Reinforcement Learning. In: Nicosia, G., et al. Machine Learning, Optimization, and Data Science. LOD 2022. Lecture Notes in Computer Science, vol 13811. Springer, Cham. https://doi.org/10.1007/978-3-031-25891-6_9

DOI: https://doi.org/10.1007/978-3-031-25891-6_9
Published: 10 March 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25890-9
Online ISBN: 978-3-031-25891-6
eBook Packages: Computer Science, Computer Science (R0)
