Implementing Machine Learning Methods in Estimating the Size of the Non-observed Economy

Computational Economics

Abstract

Although the literature on unregistered economic activity is growing rapidly, it still relies mostly on simple ordinary least squares and panel regressions and largely ignores recent developments in machine learning. This study proposes a new approach for estimating the size of the non-observed economy more accurately using machine learning methods. Comparing against two currency demand-based models used to estimate the size of the non-observed economy, we show that a Random Forest algorithm estimates the demand for currency more accurately; currency demand is known to provide a fair estimate of unregistered economic activity. The proposed approach shows superior forecasting capabilities compared to the current state-of-the-art linear regression-based methods for estimating non-observed economic activity.
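The kind of comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' model, data, or specification: the data are synthetic, the two predictors (`tax`, `rate`) and the functional form of the target are hypothetical, and the forest is a small from-scratch implementation written only for illustration. The sketch shows how the out-of-sample error of a linear regression and of a Random Forest can be compared when the underlying relationship is nonlinear.

```python
import random
import statistics

random.seed(0)

def make_data(n):
    """Synthetic rows ((tax, rate), y); the relationship is purely illustrative."""
    rows = []
    for _ in range(n):
        tax = random.uniform(0.0, 1.0)   # hypothetical tax-burden index
        rate = random.uniform(0.0, 1.0)  # hypothetical interest-rate index
        # Assumed nonlinear target: a threshold effect plus a quadratic term.
        y = (2.0 if tax > 0.5 else 0.0) + 4.0 * (rate - 0.5) ** 2
        rows.append(((tax, rate), y + random.gauss(0.0, 0.1)))
    return rows

def fit_ols(rows):
    """Ordinary least squares with an intercept, via the normal equations."""
    X = [(1.0,) + x for x, _ in rows]
    y = [t for _, t in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(k)]
    for c in range(k):  # Gauss-Jordan elimination with partial pivoting
        p = max(range(c, k), key=lambda row: abs(A[row][c]))
        A[c], A[p] = A[p], A[c]
        for row in range(k):
            if row != c:
                factor = A[row][c] / A[c][c]
                A[row] = [a - factor * b for a, b in zip(A[row], A[c])]
    beta = [A[i][k] / A[i][i] for i in range(k)]
    return lambda x: beta[0] + sum(b * v for b, v in zip(beta[1:], x))

def build_tree(rows, depth, min_leaf=5):
    """Regression tree; each split uses one randomly chosen feature."""
    if depth == 0 or len(rows) < 2 * min_leaf:
        return statistics.fmean([t for _, t in rows])
    f = random.randrange(len(rows[0][0]))
    ordered = sorted(rows, key=lambda r: r[0][f])
    n, total = len(ordered), sum(t for _, t in ordered)
    left_sum, best_score, best_i = 0.0, -1.0, None
    for i in range(1, n):
        left_sum += ordered[i - 1][1]
        if i < min_leaf or n - i < min_leaf:
            continue
        if ordered[i - 1][0][f] == ordered[i][0][f]:
            continue  # no threshold separates equal feature values
        # Maximizing this score is equivalent to minimizing the split's SSE.
        score = left_sum ** 2 / i + (total - left_sum) ** 2 / (n - i)
        if score > best_score:
            best_score, best_i = score, i
    if best_i is None:
        return statistics.fmean([t for _, t in rows])
    thr = (ordered[best_i - 1][0][f] + ordered[best_i][0][f]) / 2.0
    return (f, thr,
            build_tree(ordered[:best_i], depth - 1, min_leaf),
            build_tree(ordered[best_i:], depth - 1, min_leaf))

def predict_tree(node, x):
    """Walk from the root to a leaf; leaves are plain floats."""
    while isinstance(node, tuple):
        f, thr, left, right = node
        node = left if x[f] <= thr else right
    return node

def fit_forest(rows, n_trees=30, depth=5):
    """Average of trees, each grown on a bootstrap resample of the rows."""
    trees = [build_tree(random.choices(rows, k=len(rows)), depth)
             for _ in range(n_trees)]
    return lambda x: statistics.fmean([predict_tree(t, x) for t in trees])

def mse(model, rows):
    """Mean squared prediction error on the given rows."""
    return statistics.fmean([(model(x) - y) ** 2 for x, y in rows])

train, test = make_data(300), make_data(200)
ols_mse = mse(fit_ols(train), test)
rf_mse = mse(fit_forest(train), test)
print(f"OLS test MSE:           {ols_mse:.3f}")
print(f"Random forest test MSE: {rf_mse:.3f}")
```

Because the synthetic target contains a threshold and a quadratic term, the linear model cannot capture it, while the tree ensemble can approximate it piecewise. With real data, the gap between the two depends on how nonlinear the true relationship is, and an established library implementation would normally be preferred over this hand-rolled forest.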

Funding

The authors have not disclosed any funding.

Author information

Corresponding author

Correspondence to Labib Shami.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file 1 (xlsx 1158 KB)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Shami, L., Lazebnik, T. Implementing Machine Learning Methods in Estimating the Size of the Non-observed Economy. Comput Econ 63, 1459–1476 (2024). https://doi.org/10.1007/s10614-023-10369-4

  • DOI: https://doi.org/10.1007/s10614-023-10369-4
