Analysis of the prediction capability of web search data based on the HE-TDC method ‒ prediction of the volume of daily tourism visitors

Journal of Systems Science and Systems Engineering

Abstract

Web search query data reflect social hot spots and can serve as novel economic indicators. Because query data are high-dimensional, it is critical to select keywords that have genuine predictive ability, which also reduces dimensionality. This paper presents an integrative method that combines Hurst Exponent (HE) and Time Difference Correlation (TDC) analysis to select such keywords. The method, called HE-TDC screening, requires a keyword to satisfy two conditions: a high correlation with the target series, and fluctuation memory similar to that of the target series. An empirical study predicts the volume of daily tourism visitors to the Jiuzhai Valley scenic area. The results show that keywords selected with the HE-TDC method produce a model with better robustness and predictive ability.
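The two screening conditions named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: this preview does not state which Hurst estimator, lag range, or thresholds the paper uses. The sketch assumes classical rescaled-range (R/S) analysis for the Hurst exponent and a Pearson correlation scanned over lead times for the time difference correlation. The function names (`hurst_rs`, `best_lead_correlation`, `he_tdc_screen`) and the defaults `min_corr`, `max_hurst_gap`, and `max_lag` are hypothetical.

```python
import numpy as np

def hurst_rs(series, min_window=8):
    """Estimate the Hurst exponent by rescaled-range (R/S) analysis (assumed estimator)."""
    x = np.asarray(series, dtype=float)
    sizes, rs_means = [], []
    size = min_window
    while size <= len(x) // 2:
        rs = []
        for start in range(0, len(x) - size + 1, size):
            chunk = x[start:start + size]
            dev = np.cumsum(chunk - chunk.mean())  # cumulative deviation from the chunk mean
            spread = dev.max() - dev.min()         # range R
            scale = chunk.std()                    # standard deviation S
            if scale > 0:
                rs.append(spread / scale)
        if rs:
            sizes.append(size)
            rs_means.append(np.mean(rs))
        size *= 2
    if len(sizes) < 2:
        raise ValueError("series too short for R/S analysis")
    # The Hurst exponent is the slope of log(R/S) against log(window size).
    slope, _ = np.polyfit(np.log(sizes), np.log(rs_means), 1)
    return float(slope)

def best_lead_correlation(keyword, target, max_lag=7):
    """Return (lag, r) for the lead at which the keyword correlates most strongly with the target."""
    k = np.asarray(keyword, dtype=float)
    t = np.asarray(target, dtype=float)
    best_lag, best_r = 0, 0.0
    for lag in range(max_lag + 1):
        # Pair the keyword at time i with the target at time i + lag.
        r = np.corrcoef(k[:len(k) - lag], t[lag:])[0, 1]
        if abs(r) > abs(best_r):
            best_lag, best_r = lag, float(r)
    return best_lag, best_r

def he_tdc_screen(keywords, target, min_corr=0.6, max_hurst_gap=0.1, max_lag=7):
    """Keep keywords that are both highly correlated and similar in Hurst exponent (hypothetical thresholds)."""
    h_target = hurst_rs(target)
    selected = {}
    for name, series in keywords.items():
        lag, r = best_lead_correlation(series, target, max_lag)
        h = hurst_rs(series)
        if abs(r) >= min_corr and abs(h - h_target) <= max_hurst_gap:
            selected[name] = {"lag": lag, "corr": r, "hurst": h}
    return selected
```

Here `keywords` is assumed to be a dictionary mapping each keyword to its daily search-volume series, aligned in time with `target`. Applying both filters together reflects the abstract's requirement that a keyword pass the correlation condition and the memory-similarity condition, not just one of them.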

Author information

Corresponding author

Correspondence to Ying Liu.

Additional information

Geng Peng received a BS from North China Electric Power University in 1992, an MS from Wuhan University of Science and Technology in 1998, and a PhD from Tianjin University in 2001. He is an associate professor at the School of Economics and Management, University of Chinese Academy of Sciences (UCAS). He has more than 15 years of teaching experience, and his current research areas include e-commerce and Internet data analysis.

Ying Liu received a BS from Jilin University in 2006, and MS and PhD degrees from the University of Chinese Academy of Sciences in 2008 and 2011, respectively. He is an associate professor at the School of Economics and Management, UCAS. His research interests focus on e-commerce, the Internet economy, and Internet data analysis.

Jiyuan Wang received a BS from Nanjing University of Aeronautics and Astronautics in 2012. He is a PhD student at UCAS and the University of Groningen in the Netherlands. His main research interests focus on econometrics.

Jifa Gu received his BS from Peking University and his PhD from the Institute of Mathematics, USSR Academy. He is a professor at the Institute of Systems Science, Chinese Academy of Sciences (CAS), and an academician of the International Academy of Systems Science and Cybernetics. His main interests are operations research, systems engineering, and systems science.

Cite this article

Peng, G., Liu, Y., Wang, J. et al. Analysis of the prediction capability of web search data based on the HE-TDC method ‒ prediction of the volume of daily tourism visitors. J. Syst. Sci. Syst. Eng. 26, 163–182 (2017). https://doi.org/10.1007/s11518-016-5311-7

  • DOI: https://doi.org/10.1007/s11518-016-5311-7