Abstract
This paper proposes a strategy for nowcasting tourist overnight stays in Italy by exploiting payment card data and Google Search indices. The strategy is applied to national and regional overnight stays during a significant and unanticipated shock to tourism flows and payment habits (the COVID-19 pandemic). Our results show that indicators based on payment data are very informative for predicting tourist volumes, both at the national and at the regional level. By contrast, the predictive power of Google Search data is more limited.
Notes
Other estimates produced by Istat (2020) indicate that the direct and indirect contributions to value added are around 6 and 15 per cent, respectively.
Examples include publicly financed advertisement campaigns or local transport organizations. More in general, following the constitutional reform of 2001, the regional administrations have seen an extension of their powers on many tourism issues.
This means that, until June of a given year, the regional data are available only up to December of two years earlier.
According to the psychological literature, people travel because they are “pushed” into making travel decisions by internal, psychological forces, and “pulled” by the external forces of the destination attributes (Yoon and Uysal 2005).
Overnight stays are the product of the number of arrivals and the average number of nights spent per arriving tourist. In the estimations, we do not consider arrivals but focus exclusively on overnight stays. Over the short run, the dynamics of the two statistics are very similar; over the last ten years, arrivals have grown more because the average number of nights spent per tourist has declined sharply.
Regulated by the “Testo unico di Pubblica Sicurezza” (art.109).
Required under Regulation (EU) No 692/2011.
The product categories are: (1) clothing, (2) hotels and restaurants, (3) food, (4) home, (5) cash advance, (6) work, (7) retail, (8) services, (9) telephony, (10) travel and transport, (11) not defined.
In 2021, Google had around 92 per cent of the worldwide search engine market share (Statcounter 2017).
More precisely, the counts, available on a monthly basis, are reported only if they exceed an unknown threshold that depends on the geographical location, and are measured on a 0–100 index (normalized over the chosen time window). Importantly, the series generated by GT do not provide absolute numbers of searches but relative frequencies: they represent the popularity of the searches for a keyword relative to the total searches in the selected geographical area and time period.
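The normalization described in this note can be illustrated with a minimal sketch. Google does not disclose the exact procedure, so the function below is only an assumption about its general shape: it ignores the reporting threshold and any sampling, and the function name and the counts are hypothetical.

```python
def relative_search_index(keyword_counts, total_counts):
    """Scale keyword search shares to a 0-100 index over the chosen window."""
    # Share of all searches in the area and period devoted to the keyword.
    shares = [k / t for k, t in zip(keyword_counts, total_counts)]
    peak = max(shares)
    # The period with the highest share is set to 100; the others are relative to it.
    return [round(100 * s / peak) for s in shares]


# Hypothetical monthly counts (illustrative only, not real data).
keyword = [120, 300, 450, 200]
total = [10_000, 12_000, 15_000, 10_000]
index = relative_search_index(keyword, total)
print(index)
```

Because only shares enter the index, doubling both the keyword and the total counts leaves it unchanged, which is why absolute search volumes cannot be recovered from the series.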
The classification into categories is done by an algorithm set up by Google, which does not release details about it.
For instance, for Veneto, we consider the top-10 destinations specified in the following webpage: https://www.tripadvisor.it/Tourism-g187866-Veneto-Vacations.html.
In each case we use the scree plot to select the number of PCs, which, as a consequence, may differ according to the area of interest.
Provisional data are delivered from Istat to Eurostat within 56 days from the end of the previous month. Eurostat publishes the data some days after, but they are then revised within four months: at that date, the statistics are published by Istat, too.
Unfortunately, as already pointed out, data on payments are not available before 2014.
Note that, by definition, \(Ly_t=y_{t-1}\), and \(L^my_t=y_{t-m}\).
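As a small illustration of the lag operator defined in this note, the sketch below applies \(L^m\) to a list of observations. The helper name `lag` and the use of `None` for unavailable initial values are assumptions made for the example.

```python
def lag(series, m=1):
    """Apply L^m: element t of the result is series[t - m], or None if t < m."""
    if m < 0:
        raise ValueError("m must be non-negative")
    return [series[t - m] if t >= m else None for t in range(len(series))]


y = [10, 20, 30, 40]   # hypothetical observations y_1, ..., y_4
print(lag(y))          # L y_t   = y_{t-1}
print(lag(y, 2))       # L^2 y_t = y_{t-2}
```

Applying `lag` twice gives the same result as `lag(y, 2)`, which mirrors the identity \(L(Ly_t)=L^2y_t\).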
It assumes that future values of the target linearly depend on its past values and on the values of past (stochastic) shocks.
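Assuming this note refers to a model of the ARMA family (an assumption, since the note does not name the model), the linear dependence on past values and past shocks can be written in standard textbook notation as

```latex
\[
  y_t = c + \sum_{i=1}^{p} \phi_i \, y_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \, \varepsilon_{t-j},
\]
```

where \(c\) is a constant, \(\phi_i\) and \(\theta_j\) are coefficients, \(\varepsilon_t\) is the stochastic shock at time \(t\), and the orders \(p\) and \(q\) are symbols introduced here for illustration.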
The reason why step (A) considers as target the time series of the shares, in place of the more natural one in levels, is that we do not want to use \(y_{t,IT}\) as exogenous variable in the procedure, because it is too correlated with the \(y_{t,r}\). As a consequence, the impact of the other covariates would be artificially smaller. With this model instead, we are able to capture both the correlation between \(y_{t,r}\) and \(y_{t,IT}\), and the residual correlation between \(y_{t,r}\) and the other regressors.
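The share transformation discussed in this note can be sketched as follows. The function names and the figures are hypothetical, and the final step, which maps a forecast share back to a level by multiplying it by a national forecast, is an assumption about how such shares would be used rather than a description of the procedure itself.

```python
def regional_shares(regional, national):
    """Share of national overnight stays accounted for by the region: y_{t,r} / y_{t,IT}."""
    return [r / n for r, n in zip(regional, national)]


def shares_to_levels(shares, national):
    """Map shares back to levels by multiplying by the national figure (assumed step)."""
    return [s * n for s, n in zip(shares, national)]


# Hypothetical overnight stays, in millions (illustrative only).
y_region = [20.0, 30.0, 24.0]
y_italy = [100.0, 120.0, 96.0]

shares = regional_shares(y_region, y_italy)
print(shares)
print(shares_to_levels(shares, y_italy))
```

Working with the share keeps \(y_{t,IT}\) out of the set of regressors while still tying the regional series to the national one.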
The number of principal components to include in the model is selected visually for each year and region by detecting an elbow in the scree plot. Unlike the other variables, the principal components of the GT series are included in levels, because GT series are provided by Google as indices and cannot be recalculated as regional-over-national shares.
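The quantities shown in a scree plot can be computed as in the minimal sketch below. It uses simulated series driven by one common factor in place of the GT indices, and it works on the correlation matrix, which is an assumption; the choice of the elbow itself remains a visual judgement and is not automated here.

```python
import numpy as np


def explained_variance_ratio(X):
    """Share of total variance explained by each principal component of X (T x k)."""
    X = np.asarray(X, dtype=float)
    # Eigenvalues of the correlation matrix, sorted from largest to smallest.
    eigvals = np.linalg.eigvalsh(np.corrcoef(X, rowvar=False))[::-1]
    return eigvals / eigvals.sum()


# Simulated data: six series sharing one common factor plus noise (not real GT data).
rng = np.random.default_rng(0)
factor = rng.normal(size=(60, 1))
X = factor @ np.ones((1, 6)) + 0.3 * rng.normal(size=(60, 6))

ratios = explained_variance_ratio(X)
print(np.round(ratios, 3))  # plotting these values against their rank gives the scree plot
```

In this simulated case the first component dominates and the remaining ones are flat, so the elbow would sit after the first component.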
The algorithm combines unit root tests, minimization of the Akaike Information Criterion (AIC) and MLE to obtain an ARIMA model.
Recall that in the regional exercise we consider the months from June to September inclusive. In the national case we exclude June, because official data for that month are already available in October of the same year.
This monthly index corresponds to the total amount of acquiring transactions (made on a Pago Italian POS) in the tourist industry categories (hotel and restaurant plus travel and transport) made by cards issued by Italian banks.
This corresponds to the total amount of acquiring transactions (made on a Pago Italian POS) in the tourist industry categories (hotel and restaurant plus travel and transport) made by cards issued by any national or international bank.
This model is equivalent to including both Pago Ita and Pago foreigners, where Pago foreigners = Pago all − Pago Ita.
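The equivalence stated in this note can be checked numerically. The sketch below assumes that the model in question is a linear regression that includes Pago Ita and Pago all, which the note does not spell out; the data are simulated and the variable names are hypothetical.

```python
import numpy as np

# Simulated monthly indicators (illustrative only, not the actual payment data).
rng = np.random.default_rng(1)
n = 48
pago_ita = rng.normal(100, 10, n)
pago_foreign = rng.normal(40, 5, n)
pago_all = pago_ita + pago_foreign
y = 2.0 + 0.5 * pago_ita + 1.5 * pago_foreign + rng.normal(0, 1, n)


def ols_fit(y, *regressors):
    """Ordinary least squares with an intercept; returns coefficients and fitted values."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X @ beta


beta_a, fit_a = ols_fit(y, pago_ita, pago_all)      # Pago Ita and Pago all
beta_b, fit_b = ols_fit(y, pago_ita, pago_foreign)  # Pago Ita and Pago foreigners

print(np.allclose(fit_a, fit_b))  # the two specifications give the same fitted values
```

The two sets of regressors span the same space, so only the interpretation of the coefficients changes: the coefficient on Pago foreigners equals the one on Pago all, and the coefficient on Pago Ita in the second specification equals the sum of the two coefficients in the first.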
In practice, this happens because the total number of payments by cards issued by foreign banks is more stable in the years considered, while the payments by Italian-issued cards show a positive trend (data not shown for confidentiality reasons).
References
Aagesen HW, Levlin A, Ojansuu S, Redding A, Muukkonen P, Järv O (2020) Using Twitter data to evaluate tourism in Finland—a comparison with official statistics. Examples and progress in geodata science
Aastveit KA, Fastbø TM, Granziera E, Paulsen KS, Torstensen KN (2020) Nowcasting Norwegian household consumption with debit card transaction data. Norges Bank
Aladangady A, Aron-Dine S, Dunn W, Feiveson L, Lengermann P, Sahm C (2021) From transactions data to economic statistics: constructing real-time, high-frequency, geographic measures of consumer spending. University of Chicago Press
Antolini F, Grassini L (2019) Foreign arrivals nowcasting in Italy with Google Trends data. Quality and quantity. Springer, p 53
Aprigliano V, Ardizzi G, Monteforte L (2019) Using payment system data to forecast economic activity. Int J Cent Bank 15(4):55–80
Ardizzi G, Nobili A, Rocco G (2021) A game changer in payment habits: Evidence from daily data during a pandemic. Social Science Research Network
Arias JM, de Dios Romero Palop J, Bodas Sagi DJ, Lapaz HV (2018) Using transactional data to determine the usual environment of cardholders. In: Information and Communication Technologies in Tourism 2018: Proceedings of the International Conference in Jönköping, Sweden, January 24-26, 2018. Springer, pp 515–527
Artola C, Martínez-Galán E (2012) Tracking the future on the web: construction of leading indicators using internet searches. Banco de Espana Occasional Paper, (1203)
Askitas N, Zimmermann K (2009) Google econometrics and unemployment forecasting. Appl Econ Q (Formerly: Konjunkturpolitik) 55:107–120
Bangwayo-Skeete PF, Skeete RW (2015) Can Google data improve the forecasting performance of tourist arrivals? Mixed-data sampling approach. Tour Manag 46:454–464
Box GE, Jenkins GM, Reinsel GC, Ljung GM (2015) Time series analysis: forecasting and control. Wiley
Breiman L (2001) Random forests. Mach Learn 45:5–32
Camacho M, Pacce MJ (2018) Forecasting travellers in Spain with Google’s search volume indices. Tour Econ 24(4):434–448
Carboni A, Catalano C, Doria C (2023) How can big data improve the quality of tourism statistics? The Bank of Italy’s experience in compiling the travel item of the Balance of Payments. In: Bank for International Settlements (ed) Post-pandemic landscape for central bank statistics, volume 58 of IFC Bulletins chapters. Bank for International Settlements
Chatfield C (2000) Time-series forecasting. CRC Press, Boca Raton
Choi H, Varian H (2012) Predicting the present with Google Trends. Econ Record 88:2–9
Chua A, Servillo L, Marcheggiani E, Moere AV (2016) Mapping Cilento: using geotagged social media data to characterize tourist flows in southern Italy. Tour Manag 57:295–310
Croushore D, Ruiz E, Scaglione M (2013) Introduction to flash indicators. Int J Forecast 29:642–643
Dagum EB, Bianconcini S (2016) Seasonal adjustment methods and real time trend-cycle estimation. Springer, Cham
D’Amuri F, Marcucci J (2017) The predictive power of Google searches in forecasting US unemployment. Int J Forecast 33(4):801–816
De Gooijer JG, Hyndman RJ (2006) 25 years of time series forecasting. Int J Forecast 22(3):443–473
de Kort RE (2017) Forecasting tourism demand through search queries and machine learning. IFC Bulletins, 44
Della Corte V, Doria C, Oddo G (2021) The impact of Covid-19 on international tourism flows to Italy: evidence from mobile phone data
Della Penna N, Huang H (2009) Constructing Consumer Sentiment Index for U.S. Using Google Searches. Working Papers 2009-26, University of Alberta, Department of Economics
Demma C (2021) Il settore turistico e la pandemia di Covid-19. Note Covid-19
Di Giacinto V, Monteforte L, Filippone A, Montaruli F, Ropele T (2019) ITER: a quarterly indicator of regional economic activity in Italy. Questioni di Economia e Finanza, (489)
Eurostat (2021) Eurostat statistics explained. https://ec.europa.eu/eurostat/statistics-explained/index.php?title=Glossary:Nights_spent. Accessed: 26 Jul 2021
Feng Y, Li G, Sun X, Li J (2019) Forecasting the number of inbound tourists with Google Trends. Procedia Comput Sci 162:628–633
Fonzo TD, Marini M (2011) Simultaneous and two-step reconciliation of systems of time series: methodological and practical issues. J Roy Stat Soc Ser C (Appl Stat) 60(2):143–164
Galbraith JW, Tkacz G (2018) Nowcasting with payments system data. Int J Forecast 34(2):366–376
Giacomini R, Rossi B (2010) Forecast comparisons in unstable environments. J Appl Economet 25(4):595–620
Grassini L, Dugheri G (2021) Mobile phone data and tourism statistics: a broken promise. Natl Account Rev 3(1):50–68
Hardy A (2020) Tracking tourists: movement and mobility. Goodfellow Publishers Ltd
Havranek T, Zeynalov A (2019) Forecasting tourist arrivals: Google trends meets mixed-frequency data. Tourism Economics
Hawelka B, Sitko I, Euro Beinat SS, Kazakopoulos P, Ratti C (2014) Geo-located twitter as proxy for global mobility patterns. Cartogr Geogr Inf Sci 41(3):260–271
Holt CC (2004) Forecasting seasonals and trends by exponentially weighted moving averages. Int J Forecast 20(1):5–10
Hsieh S-C (2021) Tourism demand forecasting based on an lstm network and its variants. Algorithms 14(8):243
Hu M, Li H, Song H, Li X, Law R (2022) Tourism demand forecasting using tourist-generated online review data. Tour Manag 90:104490
Hyndman RJ, Khandakar Y (2008) Automatic time series forecasting: the forecast Package for R. J Stat Softw 27(3):1–22
Istat (2020) Conto satellite del Turismo per l’Italia, year 2017. Statistiche report. https://www.istat.it/it/files//2020/06/Conto-satellite-turismo.pdf. Accessed: 05 Jan 2022
Istat (2021) Occupancy in collective tourist accommodation. https://www.unwto.org/country-profile-outbound-tourism. Accessed: 20 Jul 2021
Ji-yuan W, Geng P, Shou-yang W (2017) Model selection on tourism forecasting: a comparison between Bayesian model averaging and Lasso. Afr J Bus Manag 11:158–167
Laaroussi H, Guerouate F, Sbihi M (2023) Incorporating deep learning and sentiment analysis on twitter data to improve tourism demand forecasting. In: Motahhir S, Bossoufi B (eds) Digital technologies and applications. Springer Nature Switzerland, Cham, pp 150–158
Lacasa L, Luque B, Ballesteros F, Luque J, Nuño JC (2008) From time series to complex networks: the visibility graph. Proc Natl Acad Sci 105(13):4972–4975
Law R, Li G, Fong DKC, Han X (2019) Tourism demand forecasting: a deep learning approach. Ann Tour Res 75:410–423
Li Y, Cao H (2018) Prediction for tourism flow based on LSTM neural network. Procedia Comput Sci 129:277–283
Li X, Pan B, Law R, Huang X (2017) Forecasting tourism demand with composite search index. Tour Manag 59:57–66
Li J, Xu L, Tang L, Wang S, Li L (2018) Big data in tourism research: a literature review. Tour Manag 68:301–323
Li X, Law R, Xie G, Wang S (2021) Review of tourism forecasting research with internet data. Tour Manag 83:104245
Mao S, Xiao F (2019) Time series forecasting based on complex network analysis. IEEE Access 7:40220–40229
Minora U, Iacus SM, Batista e Silva F, Sermi F, Spyratos S (2023) Nowcasting tourist nights spent using innovative human mobility data. Plos One 18(10):e0287063
Park S, Lee J, Song W (2017) Short-term forecasting of Japanese tourist inflow to South Korea using Google trends data. J Travel Tour Mark 34(3):357–368
Petrella A, Torrini R, Barone G, Beretta E, Breda E, Cappariello R, Ciaccio G, Conti L, David F, Degasperi P, Di Gioia A, Felettigh A, Filippone A, Firpo G, Gallo M, Guaitini P, Papini G, Passiglia P, Quintiliani F, Roma G, Romano V, Scalise D (2019) Turismo in Italia: numeri e potenziale di sviluppo. Questioni di Economia e Finanza 606:1–113
Petropoulos F, Apiletti D, Assimakopoulos V, Babai MZ, Barrow DK, Taieb SB, Bergmeir C, Bessa RJ, Bijak J, Boylan JE et al (2022) Forecasting: theory and practice. Int J Forecast 38(3):705–871
Provenzano D, Baggio R (2020) A complex network analysis of inbound tourism in Sicily. Int J Tour Res 22(4):391–402
Raun J, Ahas R, Tiru M (2016) Measuring tourism destinations using mobile tracking data. Tour Manag 57:202–212
Romero Palop JDD, Murillo Arias J, Bodas-Sagi DJ, Valero Lapaz H (2019) Determining the usual environment of cardholders as a key factor to measure the evolution of domestic tourism. Inform Technol Tour 21(1):23–43
Rossi B, Sekhposyan T (2010) Have economic models’ forecasting performance for US output growth and inflation changed over time, and when? Int J Forecast 26(4):808–835
Sainaghi R, Baggio R (2020) The effects generated by events on destination dynamics and topology. Curr Issue Tour 23(14):1788–1804
Saluveer E, Raun J, Tiru M, Altin L, Kroon J, Snitsarenko T, Aasa A, Silm S (2020) Methodological framework for producing national tourism statistics from mobile positioning data. Ann Tour Res 81:102895
Schmücker D, Reif J (2022) Measuring tourism with big data? Empirical insights from comparing passive GPS data and passive mobile data. Ann Tour Res Empir Insights 3(2):100061
Statcounter (2017) Search Engine Market Share Worldwide. https://gs.statcounter.com/search-engine-market-share#quarterly-200901-201702. Accessed: 20 Jul 2021
Sun S, Wei Y, Tsui K-L, Wang S (2019) Forecasting tourist arrivals with machine learning and internet search index. Tour Manag 70:1–10
UNWTO (2021) Data on outbound tourism by country. https://www.unwto.org/country-profile-outbound-tourism. Accessed: 20 Jul 2021
Verbaan R, Bolt W, van der Cruijsen C (2017) Using debit card payments data for nowcasting Dutch household consumption. DNB Working Papers 571, Netherlands Central Bank, Research Department
Webb G (2009) Internet search statistics as a source of business intelligence: Searches on foreclosure as an estimate of actual home foreclosures. Issues in Information Systems, 10
Wen T, Chen H, Cheong KH (2022) Visibility graph for time series prediction and image classification: a review. Nonlinear Dyn 110(4):2979–2999
Winters PR (1960) Forecasting sales by exponentially weighted moving averages. Manag Sci 6(3):324–342
Wu L, Brynjolfsson E (2015) The future of prediction: how google searches foreshadow housing prices and sales. Economic analysis of the digital economy. National Bureau of Economic Research, Inc, pp 89–118
Wu J, Li M, Zhao E, Sun S, Wang S (2023) Can multi-source heterogeneous data improve the forecasting performance of tourist arrivals amid COVID-19? Mixed-data sampling approach. Tour Manag 98:104759
Yang X, Pan B, Evans JA, Lv B (2015) Forecasting Chinese tourist volume with search engine data. Tour Manag 46:386–397
Yang Y, Fan Y, Jiang L, Liu X (2022) Search query and tourism forecasting during the pandemic: when and where can digital footprints be helpful as predictors? Ann Tour Res 93:103365
Yoon Y, Uysal M (2005) An examination of the effects of motivation and satisfaction on destination loyalty: a structural model. Tour Manag 26(1):45–56
Zhang R, Ashuri B, Shyr Y, Deng Y (2018) Forecasting construction cost index based on visibility graph: a network approach. Physica A 493:239–252
Additional information
The views expressed in this paper are those of the authors and do not involve the responsibility of the Bank of Italy and/or the Eurosystem. We thank Matteo Alpino, Valentina Aprigliano, Laura Bartiloro, Andrea Carboni, Costanza Catalano, Andrea Doria, Simone Emiliozzi, Silvia Fabiani, Sara Lamboglia, Michele Loberto, Juri Marcucci, Alessandro Moro and Alfonso Rosolia for fruitful discussions and suggestions. We would also like to thank Marco Langiulli and Luca Bastianelli for providing useful information about the payment card data.
About this article
Cite this article
Crispino, M., Mariani, V. A Tool to Nowcast Tourist Overnight Stays with Payment Data and Complementary Indicators. Ital Econ J (2024). https://doi.org/10.1007/s40797-024-00266-6
DOI: https://doi.org/10.1007/s40797-024-00266-6