Abstract
This study proposes combining a hybrid data science (DS) framework with reinforcement learning (RL) in data envelopment analysis (DEA). The DS framework supports identification of the functional form of the production frontier, and RL derives the optimal resource reallocation policy that guides productivity improvement. Both DS and RL techniques thus complement efficiency analysis. Emphasizing planning over evaluation, we use a data generating process (DGP) and an empirical dataset of power plants to validate the benefits of the hybrid DS framework and RL, respectively, in driving productivity. The results show that the hybrid DS framework and RL can enhance the interpretation of the production frontier and identify the optimal resource reallocation policy.
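To make the notion of efficiency relative to a production frontier concrete, the following is a minimal sketch of the simplest DEA special case: one input, one output, and constant returns to scale, where a unit's efficiency reduces to its output/input ratio divided by the best observed ratio. The function name `crs_efficiency` and the fuel and power figures are hypothetical illustrations, not the chapter's model or data.

```python
def crs_efficiency(inputs, outputs):
    """Efficiency of each unit relative to the best observed output/input ratio.

    Assumes a single input and a single output under constant returns to
    scale, the special case in which DEA needs no linear program.
    """
    if not inputs or len(inputs) != len(outputs):
        raise ValueError("inputs and outputs must be non-empty and equally long")
    if any(x <= 0 for x in inputs) or any(y < 0 for y in outputs):
        raise ValueError("inputs must be positive and outputs non-negative")

    ratios = [y / x for x, y in zip(inputs, outputs)]  # productivity per unit
    best = max(ratios)                                 # frontier productivity
    if best == 0:
        raise ValueError("at least one unit must produce positive output")
    return [r / best for r in ratios]


# Hypothetical plants: fuel consumed (input) and electricity generated (output).
fuel = [10.0, 20.0, 40.0]
power = [5.0, 16.0, 20.0]
scores = crs_efficiency(fuel, power)
for plant, score in enumerate(scores, start=1):
    print(f"plant {plant}: efficiency = {score:.3f}")
```

A score of 1 marks a unit on the frontier, and a score below 1 measures how far the unit falls short of the best observed productivity. With several inputs and outputs the ratio no longer suffices, and each unit's score is obtained from a linear program instead.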
Notes
- 1. We only discuss the continuous independent variable and response variable; thus, the regression tree is used in CART.
- 2.
- 3. Coal consists of anthracite coal, bituminous coal, lignite coal, refined coal, coal-based synfuel, subbituminous coal, and waste/other coal. We sum all types and ignore coal quality. We do the same for oil.
- 4.
Acknowledgments
This research was funded by the Ministry of Science and Technology (MOST108-2221-E-006-223-MY3), Taiwan.
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this chapter
Cite this chapter
Lee, CY., Hung, YH., Chen, YW. (2021). Hybrid Data Science and Reinforcement Learning in Data Envelopment Analysis. In: Zhu, J., Charles, V. (eds) Data-Enabled Analytics. International Series in Operations Research & Management Science, vol 312. Springer, Cham. https://doi.org/10.1007/978-3-030-75162-3_4
DOI: https://doi.org/10.1007/978-3-030-75162-3_4
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75161-6
Online ISBN: 978-3-030-75162-3
eBook Packages: Business and Management, Business and Management (R0)