Hybrid Data Science and Reinforcement Learning in Data Envelopment Analysis

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 312)

Abstract

This study proposes a hybrid data science (DS) framework and a reinforcement learning (RL) approach for data envelopment analysis (DEA). The hybrid DS framework supports identification of the functional form of the production frontier, and RL derives the optimal resource reallocation policy that guides productivity improvement. Both DS and RL techniques thus complement efficiency analysis. Emphasizing planning over evaluation, we use a data generating process (DGP) and an empirical dataset of power plants to validate, respectively, the benefits of the hybrid DS framework and of RL in driving productivity. Based on the results, the hybrid DS framework can enhance the interpretation of the production frontier, and RL can identify the optimal resource policy.
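As background for the efficiency analysis that the abstract builds on, the following is a minimal sketch of a DEA-style efficiency score in the simplest setting: one input, one output, and constant returns to scale, where each unit's score reduces to its output/input ratio divided by the best observed ratio. It is not the chapter's hybrid DS framework or its RL procedure; the function name `crs_efficiency` and the fuel and power figures are hypothetical and chosen only for illustration.

```python
def crs_efficiency(inputs, outputs):
    """Score each unit by its output/input ratio relative to the best ratio.

    With a single input and a single output under constant returns to
    scale, this ratio form gives the radial efficiency of each unit:
    1.0 marks a unit on the frontier, lower values mark inefficiency.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)  # productivity of the frontier-defining unit
    return [r / best for r in ratios]


# Hypothetical data (assumption): fuel consumed and electricity generated.
fuel = [10.0, 20.0, 40.0]
power = [5.0, 16.0, 20.0]

scores = crs_efficiency(fuel, power)
for unit, score in enumerate(scores, start=1):
    print(f"plant {unit}: efficiency = {score:.3f}")
```

In this toy data the second plant has the highest output per unit of input and therefore defines the frontier. With several inputs and outputs the ratio shortcut no longer applies, and the scores are instead obtained by solving one linear program per unit.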

Notes

  1. We only discuss the continuous independent variable and response variable; thus, the regression tree is used in CART.

  2. Scenario A is not included in Figures 10 and 11 with hybrid methods because the hypothesis testing of the residual analysis shows random noise.

  3. Coal consists of anthracite coal, bituminous coal, lignite coal, refined coal, coal-based synfuel, subbituminous coal, and waste/other coal. We sum all types and ignore coal quality. We do the same for oil.

  4. The blanks in Table 12 and the three blank states in Fig. 19 indicate missing data.

Acknowledgments

This research was funded by the Ministry of Science and Technology (MOST108-2221-E-006-223-MY3), Taiwan.

Author information

Corresponding author

Correspondence to Chia-Yen Lee.

Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter

Cite this chapter

Lee, CY., Hung, YH., Chen, YW. (2021). Hybrid Data Science and Reinforcement Learning in Data Envelopment Analysis. In: Zhu, J., Charles, V. (eds) Data-Enabled Analytics. International Series in Operations Research & Management Science, vol 312. Springer, Cham. https://doi.org/10.1007/978-3-030-75162-3_4
