Hybrid Data Science and Reinforcement Learning in Data Envelopment Analysis

Lee, Chia-Yen; Hung, Yu-Hsin; Chen, Yen-Wen

doi:10.1007/978-3-030-75162-3_4

Chapter
First Online: 17 December 2021

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 312)

Abstract

This study proposes a hybrid data science (DS) framework and a reinforcement learning (RL) approach for data envelopment analysis (DEA). The framework supports identification of the functional form of the production frontier, and RL derives the optimal resource reallocation policy that guides productivity improvement. Both DS and RL techniques thus complement efficiency analysis. With an emphasis on planning over evaluation to drive productivity, we use a data generating process (DGP) and an empirical dataset of power plants to validate the benefits of the hybrid DS framework and RL, respectively. The results show that the hybrid DS framework and RL can enhance the interpretation of the production frontier and identify the optimal resource policy.

Notes

1. We only discuss the continuous independent variable and response variable; thus, the regression tree is used in CART.
2. Scenario A is not included in Figures 10 and 11 with hybrid methods because the hypothesis testing of the residual analysis shows random noise.
3. Coal consists of anthracite coal, bituminous coal, lignite coal, refined coal, coal-based synfuel, subbituminous coal, and waste/other coal. We sum all types and ignore coal quality. We do the same for oil.
4. The blanks in Table 12 and the three blank states in Fig. 19 indicate missing data.

Acknowledgments

This research was funded by the Ministry of Science and Technology, Taiwan (MOST108-2221-E-006-223-MY3).

Author information

Authors and Affiliations

Department of Information Management, National Taiwan University, Taipei, Taiwan
Chia-Yen Lee & Yu-Hsin Hung
Institute of Manufacturing Information and Systems, National Cheng Kung University, Tainan City, Taiwan
Yen-Wen Chen

Corresponding author

Correspondence to Chia-Yen Lee.

Editor information

Editors and Affiliations

Foisie Business School, Worcester Polytechnic Institute, Worcester, MA, USA
Joe Zhu
Buckingham Business School, University of Buckingham, Birmingham, UK
Vincent Charles

About this chapter

Cite this chapter

Lee, CY., Hung, YH., Chen, YW. (2021). Hybrid Data Science and Reinforcement Learning in Data Envelopment Analysis. In: Zhu, J., Charles, V. (eds) Data-Enabled Analytics. International Series in Operations Research & Management Science, vol 312. Springer, Cham. https://doi.org/10.1007/978-3-030-75162-3_4

DOI: https://doi.org/10.1007/978-3-030-75162-3_4
Published: 17 December 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-75161-6
Online ISBN: 978-3-030-75162-3
eBook Packages: Business and Management, Business and Management (R0)
