Introduction

The coronavirus disease 2019 (COVID-19) pandemic officially emerged in Wuhan, China, in December 2019. As of January 21, 2021, it had affected 219 countries and territories, with almost 100 million cases and over 2 million deaths [77]. At the time of writing, the most severely affected countries include both advanced and developing economies, such as Brazil, France, India, Italy, Russia, Spain, the UK, and the US. From October to December 2020, several European countries, including Italy, experienced a worrisome surge of COVID-19 infections.

Italy was the first European country to be severely impacted by COVID-19, and it remained one of the main epicenters of the pandemic for approximately 2 months, i.e., from mid-February 2020 to mid-April 2020. After that first peak, the pandemic curve progressively decreased until mid-August 2020. However, the spread of infection accelerated again in the late summer and early fall of 2020, and this second surge continues today. As of January 21, 2021, Italy has suffered 84,202 deaths and 2,428,221 cases.

The likelihood of new and consecutive COVID-19 waves is real, and efforts to study the pandemic’s trajectory are imperative for planning the procurement of medical devices, managing health centers, clinics, hospitals, and other healthcare facilities, and allocating ordinary and intensive care beds.

Thus, the first goal of this paper is to provide short-term and mid-term forecasts for the number of patients hospitalized with COVID-19 during the second wave of COVID-19 infections, i.e., during the period after October 13, 2020. COVID-19-related hospitalization trends offer a clear picture of the overall pressure on the national healthcare system. Moreover, models fitted to hospitalized patients are usually more reliable and accurate than models fitted to confirmed cases [30].Footnote 1 The paper’s second goal is to compare and investigate the accuracy of several statistical methods.

In particular, I estimated four time series forecast techniques and all of their feasible hybrid combinations: the autoregressive integrated moving average (ARIMA) model, innovations state space models for exponential smoothing (ETS), the neural network autoregression (NNAR) model, and the trigonometric exponential smoothing state space model with Box–Cox transformation, ARMA errors, and trend and seasonal components (TBATS).

The rest of this paper is organized as follows. “Related literature” reviews the relevant literature while “Materials and methods” presents the data used in the analysis and discusses the empirical strategy. “Evaluation metrics” presents the evaluation metrics used to measure the performance of the models. “Results and discussion” discusses the main findings and policy implications. Finally, “Conclusions” provides some conclusive considerations.

Related literature

From the beginning of 2020, an increasing body of literature has employed various approaches to forecast the spread of the COVID-19 outbreak [9, 22, 26, 58, 73, 78, 79, 83, 85]. The most frequently used were ARIMA models [3, 8, 14, 62], ETS models [13, 44], artificial neural network (ANN) models [55, 75], TBATS models [68, 71], models derived from the susceptible–infected–removed (SIR) basic approach [22, 26, 58, 78, 85], and hybrid models [15, 29, 68, 69]. The implementation and comparison of these approaches—with the exception of mechanistic–statistical models (such as SIR)—represents the core of this paper.

Ala’raj et al. [2] utilized a dynamic hybrid model based on a modified susceptible–exposed–infected–recovered–dead (SEIRD) model with ARIMA corrections of the residuals. They provided long-term forecasts for infected, recovered, and deceased people using a US COVID-19 dataset, and their model had a remarkable ability to make accurate predictions. Using a nonseasonal ARIMA model, Ceylan [14] made short-term predictions of cumulative confirmed cases after April 15, 2020, for France, Italy, and Spain. The forecasts showed low mean absolute percentage errors (MAPE) and seemed to be sufficiently reliable and suitable for the short-term epidemiological analysis of COVID-19 trends.

Hasan [29] proposed a hybrid model that incorporates ensemble empirical mode decomposition (EEMD) and neural networks to forecast real-time global COVID-19 cases for the period after May 18, 2020. The analysis showed that the ANN-EEMD approach was quite promising and outperformed traditional statistical methods, such as regression analysis and moving average.

Ribeiro et al. [65] provided short-term estimates of COVID-19 cumulative confirmed cases in Brazil by employing multiple approaches and selecting several models, such as ARIMA, cubist regression (CUBIST), random forest (RF), ridge regression (RIDGE), support vector regression (SVR), and stacking-ensemble learning (SEL). The models’ reliabilities were evaluated based on the improvement index, mean absolute error (MAE), and symmetric MAPE criteria. The analysis demonstrated that SVR and SEL performed best, but all models exhibited good forecasting performances.

Using ARIMA, TBATS, their statistical hybrid, and a mechanistic mathematical model combining the best of the previous models, Sardar et al. [68] attempted to forecast daily COVID-19 confirmed cases across India and in five different states (Delhi, Gujarat, Maharashtra, Punjab, and Tamil Nadu) from May 17, 2020, until May 31, 2020. The ensemble model showed the best prediction skills and suggested that daily COVID-19 cases would increase significantly in the considered forecast window and that lockdown measures would be more effective in states with the highest percentages of symptomatic infection.

Wieczorek et al. [75] implemented deep neural network architectures, which learned by using a Nesterov-accelerated adaptive moment (Nadam) training model, to forecast cumulative confirmed COVID-19 cases in several countries and regions. The predictions, which referred to different time windows, revealed that the models had an extremely high level of accuracy (approximately 87.7% for most regions but, in some cases, reaching almost 100%).

Talkhi et al. [71] attempted to forecast the number of COVID-19 confirmed infections and deaths in Iran between August 15, 2020, and September 14, 2020, using several single and hybrid models. The extreme learning machine (ELM) and hybrid ARIMA–NNAR models were the most suitable for forecasting confirmed cases, while the Holt–Winters (HW) approach outperformed the others in predicting death cases.

Finally, Table 1 reports 30 international studies that utilized single or hybrid ARIMA, ETS, neural network, and TBATS models to forecast the transmission patterns of COVID-19 across the world.

Table 1 30 selected international studies that utilized single or hybrid ARIMA, ETS, neural network, and TBATS models.

Materials and methods

The data used in this article, which comprise 236 observations, refer to the real-time number of COVID-19 hospitalizations of patients with mild symptoms and patients assigned to the ICU in Italy from February 21, 2020, to October 13, 2020. I extracted the data from the official Italian Ministry of Health’s website (www.salute.gov.it). The confirmed COVID-19-related hospitalization trends appear in Fig. 1.

Fig. 1
figure 1

Source: Italian Ministry of Health [43]

Patients hospitalized with mild symptoms and in the ICU from February 21, 2020 to October 13, 2020.

The data showed that the number of COVID-19 patients hospitalized with mild symptoms and the number of COVID-19 patients assigned to the ICU reached an initial peak on April 4, 2020. They then followed a downward trend until mid-August before accelerating again from the end of September 2020 to mid-October 2020. Recognizing that the use of only one model is never wise and may lead to unreliable forecasts [68], I computed the forecasts by employing different statistical techniques and their combinations. Specifically, linear ARIMA models, ETS models, linear and nonlinear NNAR models, TBATS models, and their feasible hybrid combinations were examined.

ARIMA models, which were first proposed by Box and Jenkins [11], represent one of the most widely used frameworks for epidemic/pandemic and disease time series predictions [66, 84]. Considering only the linear trend of a time series, they are able to capture both nonseasonal and seasonal patterns of that series. The following should be noted regarding the nonseasonal component of ARIMA models: (i) the autoregressive (AR) process forecasts a time series using a linear combination of its past values; (ii) differencing (I) is required to make the time series stationary by removing (or mitigating) any trend or seasonality; and (iii) the moving average (MA) process forecasts future values using a linear combination of previous forecast errors. The seasonal component is similar to the nonseasonal component but implies backshifts of the seasonal period, i.e., adding seasonal parameters to the AR, I, and MA components, which allows the model to handle most of the seasonal patterns in real-world data. Therefore, the final seasonal model can be denoted as ARIMA (p,d,q)(P,D,Q)m, where m is the seasonal period and the lowercase and uppercase letters indicate the number of nonseasonal and seasonal parameters, respectively, for each of the three components ([34], Sect. 8).

The ETS class of models was introduced in the late 1950s [12, 31, 76] to consider different combinations of trend and seasonal components. The basic ETS model consists of two main equations: a forecast equation and a smoothing equation. By integrating these two equations into an innovations state space model, which may correspond to the additive (A) or multiplicative (M) error assumption, it is possible to obtain an observation/measurement equation and a transition/state equation, respectively.Footnote 2 The first equation describes the observed data, while the second describes the behavior of the unobserved states. The states refer to the level, trend, and seasonality. The trend and seasonal components may be none (N), additive (A), additive damped (Ad),Footnote 3 or multiplicative (M), resulting in a wide range of model combinations. The final model assumes the form of a three-character string (Z,Z,Z), where the first letter identifies the error assumption of the state space model, the second letter identifies the trend type, and the third letter identifies the season type. These models produce a time series forecast by using the weighted average of the series’ past values, giving more weight to recent observations ([34], Sect. 7; [40]).

NNAR models can be viewed as a network of neurons, or nodes, that depicts complex nonlinear relationships and functional forms. In a basic neural network framework, the neurons are organized in two layers: (i) the bottom layer identifies the original time series, and (ii) the top layer identifies the predictions. The resulting model is equivalent to a simple linear regression and becomes nonlinear only when an intermediate layer with “hidden neurons” is included. For seasonal data, NNAR models can be described with the notation NNAR (p,P,k)m, where m is the seasonal period, p denotes the number of nonseasonal lagged inputs for the linear AR process, P represents the seasonal lags for the AR process, and k indicates the number of nodes/neurons in the hidden layer ([34], Sect. 11.3).

Finally, TBATS models are a class of models that combine different approaches: trigonometric terms for modeling seasonality, the Box–Cox transformation [10] for addressing heterogeneity, ARMA errors for addressing short-term dynamics, damped trends (if present), and seasonal components. Therefore, TBATS models have several properties: (i) they deal well with very complex seasonal patterns, which might, for example, exhibit daily, weekly, and annual patterns simultaneously; (ii) they are able to consider nonlinear time series patterns; and (iii) they can handle any type of autocorrelation in the residuals ([34], Sect. 11.1; [71]).

The combination of different time series forecast methods maximizes the chance of capturing seasonal, linear, and nonlinear patterns [60, 82] and is especially useful for predicting real-world phenomena, such as the COVID-19 pandemic, that are characterized by complex dynamics [7]. As has been well established since the seminal work of Bates and Granger [6], combining techniques with unique properties can yield better performance and forecast accuracy.Footnote 4 The models were calculated by using the following analytical procedures:

  • ARIMA models were detected by applying the “auto.arima()” function included in the package “forecast” (in the R environment) and developed by Hyndman and Khandakar [35]. This function follows sequential steps to identify the best model, i.e., the number of p parameters of the autoregressive (AR) process, the order d of differencing (I), the number of q parameters of the MA process, and the number of parameters of the seasonal component. It combines unit root testsFootnote 5 with the minimization of the following estimation criteria: the bias-corrected Akaike’s information criterion (AICc)Footnote 6 and the maximum likelihood estimation (MLE). The unit root tests identify the order of differencing, while the AICc and the MLE methods identify the order and the values of the parameters, respectively, of the seasonal and nonseasonal AR and MA processes;

  • ETS models were identified by using the “ets()” function included in the package “forecast” (in the R environment) and developed by Hyndman et al. [39].Footnote 7 In particular, I applied the Box–Cox [10] transformation to the data before estimating the model and then used the AICc metric to determine if the trend type was damped or not. The final three-character string identifying method (Z,Z,Z) was selected automatically;

  • NNAR models were identified via the “nnetar()” function included in the package “forecast” (in the R environment) written by Hyndman [33].Footnote 8 I proceeded as follows: (i) first, the Box–Cox transformation [10] was applied to the data before estimating the model; (ii) second, the optimal number of nonseasonal p lags for the AR(p) process was obtained by using the AICc metric; (iii) third, the seasonal P lags for the AR process were set to 1Footnote 9; and (iv) finally, the optimal number of neurons was identified using the formula \(k=\frac{(p+P+1)}{2}\) [34], Sect. 11.3);

  • TBATS models were identified using the “tbats()” function included in the package “forecast” (in the R environment) as described in De Livera et al. [17]. The optimal Box–Cox transformation parameter, ARMA (p,q) order, damping parameter, and number of Fourier terms were selected using the Akaike’s information criterion (AIC) metric;Footnote 10

  • Hybrid models were identified via the “hybridModel()” function included in the “forecastHybrid” package (in the R environment) developed by Shaub and Ellis.Footnote 11 The individual time series forecasting methods were combined as follows: (i) first, the Box–Cox power transformation [10] was applied to the inputs to increase the plausibility of the normality assumption; and (ii) then, the individual models were combined using both equal weights and cross-validated errors (“cv.errors”), which gave greater weight to the models that performed relatively better. In fact, since the best weighting procedure has not been established, I adopted a parsimonious approach and chose the one that performed better. Specifically, I tested the overall goodness-of-fit of all models with four common forecast accuracy measures: MAE, MAPE, mean absolute scaled error (MASE), and root mean square error (RMSE).
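To illustrate the combination step, the following Python sketch is my own analogue of the weighting logic (the paper's estimations were carried out in R with the “forecast” and “forecastHybrid” packages); the model names, forecast values, and cross-validated errors below are hypothetical:

```python
import numpy as np

def combine_forecasts(forecasts, cv_errors=None):
    """Combine individual model forecasts.

    forecasts : dict mapping model name -> array of h-step-ahead forecasts.
    cv_errors : optional dict mapping model name -> cross-validated error
                (e.g., RMSE); if given, weights are inversely proportional
                to these errors, so better-performing models weigh more.
    Returns the combined forecast array.
    """
    names = sorted(forecasts)
    mat = np.array([forecasts[m] for m in names], dtype=float)
    if cv_errors is None:
        weights = np.full(len(names), 1.0 / len(names))  # equal weights
    else:
        inv = np.array([1.0 / cv_errors[m] for m in names])
        weights = inv / inv.sum()  # normalize weights to sum to 1
    return weights @ mat

# Hypothetical 3-step-ahead forecasts from two models
f = {"arima": np.array([100.0, 110.0, 120.0]),
     "nnar":  np.array([104.0, 114.0, 124.0])}
print(combine_forecasts(f))                               # equal weights
print(combine_forecasts(f, {"arima": 2.0, "nnar": 1.0}))  # error-based weights
```

With equal weights, each horizon is the simple average of the two forecasts; with error-based weights, the model with the smaller cross-validated error (here, the hypothetical NNAR) receives two-thirds of the weight.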

The estimated basic equation for the ARIMA was the following [16]:

$${\Delta }^{d}{y}_{t}={\phi }_{1}{\Delta }^{d}{y}_{t-1}+\dots +{\phi }_{p}{\Delta }^{d}{y}_{t-p}+{\gamma }_{1}{\varepsilon }_{t-1}+\dots +{\gamma }_{q}{\varepsilon }_{t-q}+{\varepsilon }_{t},$$
(1)

where \({\Delta }^{d}\) is the difference operator of order d, \({y}_{t}\) indicates the predicted values, p is the lag order of the AR process, \({\phi }_{i}\) are the coefficients of the p AR parameters, q is the order of the MA process, \({\gamma }_{i}\) are the coefficients of the q MA parameters, and \({\varepsilon }_{t}\) denotes the error term at time t.
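To make Eq. (1) concrete, here is a minimal Python sketch (my own illustration; the coefficients and data are hypothetical, not the paper's fitted values) of a one-step-ahead forecast with first differencing (d = 1):

```python
import numpy as np

def arima_one_step(y, phi, gamma, resid):
    """One-step forecast of Delta y_{t+1} per Eq. (1) with d = 1,
    then undifference to recover the forecast of y_{t+1}.

    y     : observed series (1-D array)
    phi   : AR coefficients [phi_1, ..., phi_p] on the differenced series
    gamma : MA coefficients [gamma_1, ..., gamma_q]
    resid : most recent residuals [eps_t, eps_{t-1}, ...]
    """
    dy = np.diff(y)  # Delta y_t
    ar = sum(p * dy[-(i + 1)] for i, p in enumerate(phi))   # AR part
    ma = sum(g * resid[i] for i, g in enumerate(gamma))     # MA part
    dy_next = ar + ma                # forecast of Delta y_{t+1}
    return y[-1] + dy_next           # undifference: y_{t+1} = y_t + Delta y_{t+1}

y = np.array([10.0, 12.0, 15.0, 19.0])  # hypothetical hospitalization counts
print(arima_one_step(y, phi=[0.5], gamma=[0.2], resid=[1.0]))
```

The differenced series is (2, 3, 4), so the AR term contributes 0.5 × 4 = 2 and the MA term 0.2 × 1 = 0.2, giving a forecast of 19 + 2.2 = 21.2.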

The estimated equations for the basic ETS (A,N,N) model with additive errors were the following [34], Sect. 7):

$$\mathrm{F}\mathrm{o}\mathrm{r}\mathrm{e}\mathrm{c}\mathrm{a}\mathrm{s}\mathrm{t}\;\mathrm{e}\mathrm{q}\mathrm{u}\mathrm{a}\mathrm{t}\mathrm{i}\mathrm{o}\mathrm{n}:{\widehat{y}}_{t+1|t}={l}_{t}.$$
(2)
$$\mathrm{S}\mathrm{m}\mathrm{o}\mathrm{o}\mathrm{t}\mathrm{h}\mathrm{i}\mathrm{n}\mathrm{g}\;\mathrm{e}\mathrm{q}\mathrm{u}\mathrm{a}\mathrm{t}\mathrm{i}\mathrm{o}\mathrm{n}:{l}_{t}={l}_{t-1}+\alpha \left({y}_{t}-{l}_{t-1}\right),$$
(3)

where \({l}_{t}\) is the new estimated level, \({\widehat{y}}_{t+1|t}\) denotes each one-step-ahead prediction for time t+1, which results from the weighted average of all the observed data, \(0\le \alpha \le 1\) is the smoothing parameter, which controls the rate of decrease of the weights, and \({y}_{t}-{l}_{t-1}\) is the error at time t. Hence, each forecasted observation is the sum of the previous level and an error, and each type of error, additive or multiplicative, corresponds to a specific probability distribution. For a model with additive errors, as is the case here, the errors are assumed to follow a normal distribution. Thus, Equations (2) and (3), respectively, can be rewritten as follows:

$$\mathrm{O}\mathrm{b}\mathrm{s}\mathrm{e}\mathrm{r}\mathrm{v}\mathrm{a}\mathrm{t}\mathrm{i}\mathrm{o}\mathrm{n}\;\mathrm{e}\mathrm{q}\mathrm{u}\mathrm{a}\mathrm{t}\mathrm{i}\mathrm{o}\mathrm{n}:{y}_{t}={l}_{t-1}+{\varepsilon}_{t},$$
(4)
$$\mathrm{T}\mathrm{r}\mathrm{a}\mathrm{n}\mathrm{s}\mathrm{i}\mathrm{t}\mathrm{i}\mathrm{o}\mathrm{n}\;\mathrm{e}\mathrm{q}\mathrm{u}\mathrm{a}\mathrm{t}\mathrm{i}\mathrm{o}\mathrm{n}:{l}_{t}={l}_{t-1}+\alpha {\varepsilon}_{t.}$$
(5)

Equations (4) and (5) represent the innovation state space models that underlie the exponential smoothing methods.
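The recursion in Eqs. (2)–(5) can be sketched directly; the following Python illustration (mine, with made-up data and smoothing parameter) updates the level by a fraction of each one-step forecast error and returns the next forecast:

```python
def ses_forecast(y, alpha, l0):
    """Simple exponential smoothing, i.e., ETS(A,N,N), per Eqs. (2)-(5).

    Each smoothed level equals the previous level plus alpha times the
    one-step forecast error; the forecast for t+1 is the current level.
    """
    level = l0
    for obs in y:
        error = obs - level            # y_t - l_{t-1}, Eq. (3)
        level = level + alpha * error  # smoothing/transition equation
    return level                       # forecast y_hat_{t+1|t}, Eq. (2)

# Hypothetical series, smoothing parameter, and initial level
print(ses_forecast([10.0, 12.0, 14.0], alpha=0.5, l0=10.0))  # → 12.5
```

With α = 0.5 the level moves halfway toward each new observation (10 → 10 → 11 → 12.5), showing how recent observations receive exponentially greater weight.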

The basic form of the neural network autoregression equation was the following [34], Sect. 11.3):

$${y}_{t}=f\left({y}_{t-1}\right)+{\varepsilon}_{t},$$
(6)

where \({y}_{t}\) indicates the predicted values, \({y}_{t-1}{=\left({y}_{t-1},{y}_{t-2},\dots ,{y}_{t-n}\right)}^{{'}}\) is a vector containing the n lagged values of the observed data, f is a neural network with hidden neurons organized in a single layer, and \({\varepsilon }_{t}\) is the error at time t. A simple graphical example of a nonlinear neural network is shown in Fig. 2.
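A minimal sketch of the nonlinear function f in Eq. (6) follows (my own Python illustration with randomly drawn weights; the paper estimated its NNAR models with R's “nnetar()”, not this code). It shows a single hidden layer mapping lagged inputs to a one-step prediction:

```python
import numpy as np

def nnar_forward(lags, W1, b1, w2, b2):
    """Forward pass of a single-hidden-layer autoregression, Eq. (6):
    the prediction is f applied to the vector of lagged values.

    lags   : vector (y_{t-1}, ..., y_{t-p}) of lagged inputs
    W1, b1 : hidden-layer weights and biases
    w2, b2 : output-layer weights and bias
    """
    hidden = np.tanh(W1 @ lags + b1)  # nonlinear hidden layer
    return float(w2 @ hidden + b2)    # linear output neuron

rng = np.random.default_rng(0)
p, k = 4, 3  # 4 lagged inputs, 3 hidden neurons (as in Fig. 2)
W1 = rng.normal(size=(k, p)); b1 = rng.normal(size=k)
w2 = rng.normal(size=k); b2 = 0.0
print(nnar_forward(np.array([1.0, 2.0, 3.0, 4.0]), W1, b1, w2, b2))
```

Without the hidden layer (or with zero hidden weights) the output collapses to a linear function of the lags, which is why the model becomes nonlinear only once hidden neurons are introduced.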

Fig. 2
figure 2

A neural network with four inputs and an intermediate layer with three hidden neurons

Finally, the basic equation of the TBATS model took the following form [17]:

$${y}_{t}^{({\omega })}={\mathrm{l}}_{t-1}+\phi {b}_{t-1}+\sum _{i=1}^{T}{s}_{t-{m}_{i}}^{(i)}+{d}_{t},$$
(7)

where \({y}_{t}^{(\omega )}\) denotes the observation \({y}_{t}\) after a Box–Cox transformation with parameter ω, \({l}_{t-1}\) is the local level, \(\phi\) is the trend damping parameter, \({b}_{t-1}\) is the trend, T denotes the number of seasonal patterns, \({s}_{t}^{(i)}\) is the ith seasonal component,Footnote 12 \({m}_{i}\) denotes the seasonal periods, and \({d}_{t}\) indicates an ARMA (p,q) process for the residuals.
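The measurement equation (7) is simply additive in its components; a tiny Python sketch (my own, with entirely hypothetical component values) makes the decomposition explicit:

```python
def tbats_measurement(level, trend, phi, seasonal, arma_err):
    """Measurement equation of TBATS, Eq. (7): the Box-Cox-transformed
    observation equals level + damped trend + sum of the seasonal
    components + an ARMA error term."""
    return level + phi * trend + sum(seasonal) + arma_err

# Hypothetical components: level 100, trend 2 damped by 0.9,
# two seasonal components, and a small ARMA residual
print(tbats_measurement(100.0, 2.0, 0.9, [1.5, -0.5], 0.3))
```

Each fitted TBATS model then reduces to estimating these components (and the Box–Cox parameter ω) from the data; the transformed observation here is 100 + 1.8 + 1.0 + 0.3 = 103.1.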

Evaluation metrics

The main metrics used to compare the performances of the single and hybrid prediction models were MAE, MAPE, MASE, and RMSE. The formulae used to calculate each of these metrics appear below (Eqs. 8–11):

$$\mathrm{M}\mathrm{A}\mathrm{E}=\frac{1}{n}\sum _{i=1}^{n}\left|{y}_{i}-{\hat{y}}_{i}\right|,$$
(8)
$$\mathrm{M}\mathrm{A}\mathrm{P}\mathrm{E}=\frac{1}{n}\sum _{\;i=1}^{n}\frac{\left|{y}_{i}-{\hat{y}}_{i}\right|}{{y}_{i}}{*}100\mathrm{\%},$$
(9)
$$\mathrm{M}\mathrm{A}\mathrm{S}\mathrm{E}=\frac{1}{n}\sum _{i=1}^{n}\left(\frac{\left|{y}_{i}-{\hat{y}}_{i}\right|}{\frac{1}{n-1}{\sum }_{j=2}^{n}\left|{y}_{j}-{y}_{j-1}\right|}\right),$$
(10)
$$\mathrm{R}\mathrm{M}\mathrm{S}\mathrm{E}=\sqrt{\frac{1}{n}\sum _{i=1}^{n}{\left({y}_{i}-{{\hat{y}}}_{i}\right)}^2},$$
(11)

where n represents the number of observations, \({y}_{i}\) denotes the actual values, and \(\widehat{{y}_{i}}\) indicates the predicted values. Specifically, MAE and RMSE are both scale-dependent measures, although based on different errors. MAE is easier to interpret because minimizing it leads to predictions of the median, while minimizing RMSE leads to predictions of the mean. In fact, the first metric is based on absolute errors, while the second is based on squared errors. MAPE is probably the most widely employed error measure [27, 47], and unlike MAE and RMSE, it is not scale-dependent because it is based on percentage errors. Thus, it has the advantage of being a unit-free metric. However, it also requires some critical considerations. For example, it can lead to biased forecasts because it gives infinite or undefined results when one or more time series data points equal 0, and it puts a heavier penalty on negative errors (i.e., when predicted values are higher than actual values) than on positive errors. Finally, MASE, which was proposed by Hyndman and Koehler [36], is a scale-free error metric and probably the most versatile and reliable measure of forecast accuracy. It is superior to MAPE in that it does not give infinite or undefined values and can be used to compare forecast accuracy both on single and on multiple time series.Footnote 13 Since each metric thus entails specific strengths and disadvantages, I opted for the prudent approach of evaluating the output of all of them.
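The four metrics in Eqs. (8)–(11) translate directly into code; this Python sketch (my own, with made-up actual and predicted values) mirrors the formulae, scaling MASE by the in-sample error of the naive no-change forecast:

```python
import numpy as np

def mae(y, yhat):
    """Mean absolute error, Eq. (8)."""
    return float(np.mean(np.abs(y - yhat)))

def mape(y, yhat):
    """Mean absolute percentage error, Eq. (9).
    Undefined when any y_i == 0 (one of the caveats noted above)."""
    return float(np.mean(np.abs(y - yhat) / y) * 100)

def mase(y, yhat):
    """Mean absolute scaled error, Eq. (10): errors scaled by the
    in-sample MAE of the naive (no-change) forecast."""
    naive_mae = np.mean(np.abs(np.diff(y)))
    return float(np.mean(np.abs(y - yhat)) / naive_mae)

def rmse(y, yhat):
    """Root mean square error, Eq. (11)."""
    return float(np.sqrt(np.mean((y - yhat) ** 2)))

y = np.array([100.0, 110.0, 120.0, 130.0])      # hypothetical actuals
yhat = np.array([102.0, 108.0, 121.0, 133.0])   # hypothetical predictions
print(mae(y, yhat), mape(y, yhat), mase(y, yhat), rmse(y, yhat))
```

On this toy series, a MASE of 0.2 means the forecast errors are one-fifth the size of the naive forecast's in-sample errors, i.e., far better than no-change forecasting.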

Results and discussion

Table 2 reports the best selected parameters for the single modelsFootnote 14 while Tables 3 and 4 include the forecast accuracy measures for the single and hybrid models.Footnote 15 For patients hospitalized with mild symptoms (Table 2), the optimal single models were the seasonal ARIMA (1,2,3) (0,0,1)7, ETS (A,Ad,N),Footnote 16 NNAR (7,1,4)7, and TBATS (0.428, {2,2}, 1, {< 7,2 >}).Footnote 17 For patients hospitalized in the ICU (Table 2), the optimal single models were the seasonal ARIMA (1,2,2)(0,0,1)7, ETS (A,A,N),Footnote 18 NNAR (6,1,4)7, and (T)BATS (0.427,{0,0},1,−).Footnote 19 The hybrid models were derived by combining the optimal single models with equal weights, which proved to be more suitable than weights based on error values.Footnote 20

Table 2 Structure of the single models for patients hospitalized with mild symptoms and in the ICU
Table 3 Forecast accuracy measures for the single and hybrid models (patients hospitalized with mild symptoms)
Table 4 Forecast accuracy measures for the single and hybrid models (patients hospitalized in the ICU)

In particular, for patients hospitalized with mild symptoms, the most accurate single model was NNAR, followed by seasonal ARIMA, while the best hybrid model was NNAR–TBATS, followed by ARIMA–NNAR, and ARIMA–NNAR–TBATS. For patients hospitalized in the ICU, the best single model was NNAR, followed by ARIMA, while the best hybrid model was ARIMA–NNAR, followed by ARIMA–ETS–NNAR, and ETS–NNAR.Footnote 21

The autocorrelation function (ACF) indicated that current values were not correlated with previous values at lag 1 (Tables 2 and 3). In fact, the correlation coefficient between one point and the next in the time series ranged from − 0.07 to 0.19 for patients hospitalized with mild symptoms and from − 0.16 to 0.31 for patients in the ICU. The highest values (0.19 and 0.31) were obtained for the TBATS model.Footnote 22

According to Lewis’ [52] interpretation, since MAPE was always significantly lower than 10%, all predictive models can be considered highly accurate. Moreover, MASE was much lower than 1 for all models; therefore, all the proposed forecasting approaches performed significantly better than the forecasts from the (no-change) “naïve” method, i.e., forecasts with no adjustments for causal factors [36], which justifies the use of more complex and sophisticated models.

Tables 5 and 6 compare the hybrid models with the respective single models considering the minimization of the MAE, MAPE, MASE, and RMSE metrics. For patients hospitalized with mild symptoms, the hybrid models outperformed the respective single models in 98 out of 112 metrics, i.e., on 87.5% of all the forecast accuracy measures. For patients hospitalized in the ICU, the hybrid models outperformed the respective single models in 81 out of 112 metrics, i.e., on 72.3% of all measures. In the latter case, however, almost all losses of efficiency (25 out of 31) were attributable to NNAR.Footnote 23 Thus, the hybrid models generally increased forecast accuracy, but this increase was more evident for patients hospitalized with mild symptoms.

Table 5 Comparison between hybrid models and respective single models considering the minimization of MAE, MAPE, MASE, and RMSE metrics (in percentage), for patients hospitalized with mild symptoms
Table 6 Comparison between hybrid models and respective single models considering the minimization of MAE, MAPE, MASE, and RMSE metrics (in percentage), for patients hospitalized in the ICU

The best hybrid model for patients hospitalized with mild symptoms, NNAR–TBATS, outperformed the single NNAR and TBATS models by 5.63–9.23% on MAE and 5.88–9.95% on RMSE, while the best hybrid model for patients hospitalized in the ICU, ARIMA–NNAR, outperformed the single ARIMA and NNAR models by 1.2–8.67% on MAE and 2.81–11.44% on RMSE. Figures 3 and 4 graphically represent the models for both time series ranked by the MAE and RMSE metrics.

Fig. 3
figure 3

Models ranked by the MAE and RMSE metrics for patients hospitalized with mild symptoms

Fig. 4
figure 4

Models ranked by the MAE and RMSE metrics for patients hospitalized in the ICU

Figures 5 and 6 show the best six models and the remaining nine models for patients hospitalized with mild symptoms, respectively. Similarly, Figs. 7 and 8 show the best six models and the remaining nine models for patients hospitalized in the ICU, respectively. The light blue area in each graph shows the prediction intervals at 80%, while the dark blue area shows the prediction intervals at 95%.Footnote 24 The forecasts of the best single and hybrid models anticipated an increase in the number of patients hospitalized with mild symptoms and in the number of patients admitted to the ICU over the next 30 days, i.e., from October 14, 2020, to November 12, 2020. This predicted trend was also confirmed by the remaining estimated models.

Fig. 5
figure 5

The six best forecast models for predicting patients with mild symptoms. Notes: the models were ranked (from first to sixth place) on the MAE metric

Fig. 6
figure 6

The remaining nine forecast models for predicting patients hospitalized with mild symptoms. Notes: The models were ranked (from seventh to fifteenth place) on the MAE metric

Fig. 7
figure 7

The six best forecast models for predicting patients hospitalized in the ICU. Notes: the models were ranked (from first to sixth place) on the MAE metric

Fig. 8
figure 8

The remaining nine forecast models for predicting patients hospitalized in the ICU. Notes: the models were ranked (from seventh to fifteenth place) on the MAE metric

Specifically, the NNAR–TBATS, ARIMA–NNAR, and ARIMA–NNAR–TBATS models predicted that: (i) after 10 days (October 23), the number of patients hospitalized with mild symptoms should have been 9624, 9397, and 9259, respectively; (ii) after 20 days (by November 2), the number should have been 18,000, 16,986, and 16,062, respectively; and (iii) after 30 days (by November 12), the number should have been 25,039, 21,669, and 21,430, respectively (Fig. 5). Regarding the number of patients hospitalized in the ICU, the ARIMA–NNAR, NNAR, and ARIMA–ETS–NNAR models predicted that: (i) after 10 days, the required number of intensive care beds should have been 1175, 972, and 1114, respectively; (ii) after 20 days, the required number should have been 2164, 1493, and 1915, respectively; and (iii) after 30 days, the number should have been 3270, 1985, and 2726, respectively (Fig. 7).Footnote 25

Figures 9, 10, 11 and 12 compare all 15 of the estimated models and the observed data over the period October 14, 2020, to November 12, 2020. For patients hospitalized with mild symptoms, the NNAR–TBATS model best fit the observed data (Fig. 9). Of the remaining models, predictions from ARIMA–NNAR, ARIMA–NNAR–TBATS, and NNAR most closely approximated the observed data (Figs. 9, 10). For patients hospitalized in the ICU, Fig. 11 reveals that the ARIMA–NNAR, NNAR–TBATS, and ARIMA–NNAR–TBATS hybrid models also fit the observed data quite well. All estimated models generally exhibited a strong match between predictions and observed data, except for ETS, NNAR, and ETS–TBATS, where the trends differed (Figs. 11 and 12). Notably, the models with the lowest loss of efficiency adapted better to the observed data, which confirms the consistency and robustness of the statistical approach.

Fig. 9
figure 9

Comparison between forecasts and real data during the period October 14, 2020, to November 12, 2020, for patients hospitalized with mild symptoms (six best models)

Fig. 10
figure 10

Comparison between forecasts and real data during the period October 14, 2020, to November 12, 2020, for patients hospitalized with mild symptoms (nine remaining models)

Fig. 11
figure 11

Comparison between forecasts and real data during the period October 14, 2020, to November 12, 2020, for patients hospitalized in the ICU (six best models)

Fig. 12
figure 12

Comparison between forecasts and real data during the period October 14, 2020, to November 12, 2020, for patients hospitalized in the ICU (nine remaining models)

Thus, a second wave of COVID-19 was predicted for the period following October 13, 2020, which could have several policy implications both for the national healthcare system and the economy. In particular, the predictions underscored the importance of implementing adequate containment measures and increasing the number of ordinary and intensive care beds, hiring additional healthcare personnel, and buying care facilities, protective equipment, and ventilators to fight the infection and reduce deaths.Footnote 26 Meanwhile, the opportunity to implement more or less restrictive non-pharmaceutical interventions (NPIs) to tackle the pandemic—such as social distancing, travel bans, the use of face masks, hand hygiene, and bar and restaurant restrictions [20]—should be evaluated carefully in light of these measures’ potentially negative economic impacts. In fact, according to Fitch Ratings’ [24] projections, the first wave of COVID-19 and the consequent massive lockdown measures may already have caused up to a 9.5% contraction in Italy’s 2020 GDP.

While the models with the lowest loss of efficiency seemed to adapt substantially well to the observed data in the forecast window, the predictions should, in general, be treated with caution and employed mainly to inform short-term decisions. In fact, pandemic forecasting has raised many doubts in the last year due to several issues that can affect its accuracy and reliability, including, for example, (i) wide prediction intervals and the sensitivity of the estimates, especially with long-term forecasts; (ii) inaccurate modeling assumptions; and (iii) the lack of, or difficulty in, measuring and identifying biological features of COVID-19 transmission [30, 42]. These limitations—and the possibility of misleading forecasts—may erode public trust in science and thus affect compliance with policies intended to mitigate the spread of COVID-19 [49]. Indeed, the inevitable uncertainty associated with this novel disease and the general failure of long-term and even mid-term forecasts require a different scientific approach toward model predictions. More prudent and balanced communication with the public is crucial if the field of science desires to maintain its leading role in human development and policymaking.

Conclusions

This paper attempted to forecast the short-term dynamics of the real-time number of patients hospitalized with COVID-19 in Italy. In particular, it employed both single time series forecast methods and their feasible hybrid combinations. The results demonstrated that (i) the best single models were NNAR and ARIMA for both patients hospitalized with mild symptoms and patients admitted to the ICU; (ii) the most accurate hybrid models were NNAR–TBATS, ARIMA–NNAR, and ARIMA–NNAR–TBATS for patients hospitalized with mild symptoms and ARIMA–NNAR, ARIMA–ETS–NNAR, and ETS–NNAR for patients hospitalized in the ICU; (iii) hybrid models generally outperformed the respective single models by offering more accurate predictions; and (iv) predictions for the number of patients hospitalized in the ICU generally better fit the observed data than did predictions for patients hospitalized with mild symptoms. Notably, the best hybrid models always included an NNAR process, confirming the extensive and successful use of this algorithm in the COVID-19-related literature [56, 57, 71, 72, 81].

Compared to the single models, the hybrid statistical models captured a greater number of properties in the data structure, and the predictions seemed to offer useful policy implications. In fact, consistent with real-time data, the models predicted that the number of patients hospitalized with mild symptoms and admitted to the ICU would grow significantly until mid-November 2020. According to the estimations, the necessary ordinary and intensive care beds were expected to double in 10 days and to triple in approximately 20 days. Thus, since new waves of COVID-19 infections cannot be excluded, it may be necessary to strengthen the national healthcare system by buying protective equipment and hospital beds, managing healthcare facilities, and training healthcare staff.

Although the hybrid models proved to be sufficiently accurate, it is nevertheless important to stress that statistical methods may lead to unavoidable uncertainty and bias, which tend to grow over time, due, for example, to public authorities’ progressive implementation of NPIs, such as the closure of public spaces and national or local lockdown measures, which the forecasts cannot adequately incorporate. Combining hybrid models with mechanistic mathematical models may partially overcome these issues by considering the effects of lockdowns on epidemiological parameters [68]. Thus, future research could proceed along these lines. Ultimately, since other factors may have affected COVID-19 dynamics, especially as the pandemic progressed, these predictions should be treated with caution and utilized only to inform short-term decision-making processes.