Analysing User Reviews in Tourism with Topic Models

  • Marco Rossetti
  • Fabio Stella
  • Longbing Cao
  • Markus Zanker
Conference paper


User-generated content in general, and textual reviews in particular, constitute a vast source of information for the decision making of both tourists and management, and are therefore a key component of e-tourism. This paper explores different application scenarios for the topic model method in processing these textual reviews, in order to provide accurate decision support and recommendations and to build a basis for further analytics. Besides contributing a new model based on the topic model method, the paper presents empirical evidence from experiments on user reviews from the YELP dataset and from TripAdvisor.


Web 2.0, Customer reviews, Classification



The first author wishes to acknowledge the financial support provided by the Australian Government Department of Education through the 2014 Endeavour Research Fellowship awarded for the visiting period at the Advanced Analytics Institute, University of Technology, Sydney, Australia, under the supervision of Prof. Longbing Cao.

Furthermore, the authors acknowledge the financial support from the European Union (EU), the European Regional Development Fund (ERDF), the Austrian Federal Government and the State of Carinthia in the Interreg IV Italien-Österreich programme (project acronym O-STAR).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Marco Rossetti (1), Email author
  • Fabio Stella (1)
  • Longbing Cao (2)
  • Markus Zanker (3)

  1. Department of Informatics, Systems and Communication, University of Milano-Bicocca, Milan, Italy
  2. Advanced Analytics Institute, University of Technology, Sydney, Australia
  3. Department of Applied Informatics, Alpen-Adria-Universität, Klagenfurt, Austria
