Abstract
The COVID-19 pandemic is one of the biggest health crises of the twenty-first century. It has disrupted daily life and affected populations worldwide, both economically and socially. Machine learning algorithms have been applied frequently to COVID-19 data in recently published articles. In this paper, we analyze the impact of several variables (number of cases, temperature, people vaccinated, people fully vaccinated, number of vaccinations, and boosters) on the number of deaths caused by COVID-19 (SARS-CoV-2) in Portugal and identify the most appropriate predictive model. Several algorithms were used: OLS, Ridge, LASSO, MLP, Gradient Boosting, and Random Forest. Data processing followed the Cross-Industry Standard Process for Data Mining (CRISP-DM). The data were obtained from an open-access database.
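The model comparison described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn and uses synthetic data in place of the real Portuguese series. The feature names, the coefficients used to generate the data, the hyperparameters, and the choice of time-ordered cross-validation with RMSE as the score are all illustrative assumptions, not the authors' actual setup.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.model_selection import TimeSeriesSplit, cross_val_score
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical column names mirroring the six predictors listed in the abstract.
FEATURES = [
    "new_cases", "temperature", "people_vaccinated",
    "people_fully_vaccinated", "total_vaccinations", "total_boosters",
]


def make_synthetic_data(n_days=300, seed=0):
    """Generate placeholder data: a linear signal plus noise (not real COVID-19 data)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_days, len(FEATURES)))
    coef = np.array([2.0, -0.5, -1.0, -1.0, 0.0, -0.5])  # assumed, for illustration only
    y = X @ coef + rng.normal(scale=0.5, size=n_days)
    return X, y


def candidate_models(seed=0):
    """The six algorithm families named in the abstract, with assumed settings."""
    return {
        "OLS": LinearRegression(),
        "Ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
        "LASSO": make_pipeline(StandardScaler(), Lasso(alpha=0.01)),
        "MLP": make_pipeline(
            StandardScaler(),
            MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000, random_state=seed),
        ),
        "Gradient Boosting": GradientBoostingRegressor(random_state=seed),
        "Random Forest": RandomForestRegressor(n_estimators=100, random_state=seed),
    }


def compare_models(X, y, n_splits=5):
    """Return the mean cross-validated RMSE per model, using time-ordered splits."""
    cv = TimeSeriesSplit(n_splits=n_splits)
    scores = {}
    for name, model in candidate_models().items():
        rmse = -cross_val_score(
            model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
        )
        scores[name] = float(rmse.mean())
    return scores


X, y = make_synthetic_data()
scores = compare_models(X, y)
for name, rmse in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name:18s} RMSE = {rmse:.3f}")
```

Time-ordered splits are used here because daily mortality is a time series, so each fold trains only on earlier days than it tests on. With real data, the synthetic generator would be replaced by the loaded and cleaned dataset.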
Acknowledgements
We gratefully acknowledge financial support from FCT—Fundação para a Ciência e a Tecnologia (Portugal), national funding through research grant UIDB/04521/2020.
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Arriaga, A., Costa, C.J. (2023). Modeling and Predicting Daily COVID-19 (SARS-CoV-2) Mortality in Portugal. In: Anwar, S., Ullah, A., Rocha, Á., Sousa, M.J. (eds) Proceedings of International Conference on Information Technology and Applications. Lecture Notes in Networks and Systems, vol 614. Springer, Singapore. https://doi.org/10.1007/978-981-19-9331-2_23
DOI: https://doi.org/10.1007/978-981-19-9331-2_23
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-9330-5
Online ISBN: 978-981-19-9331-2
eBook Packages: Intelligent Technologies and Robotics (R0)