Multi-facet Rating of Product Reviews

  • Stefano Baccianella
  • Andrea Esuli
  • Fabrizio Sebastiani
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5478)

Abstract

Online product reviews are increasingly available, and consumers use them more and more often to choose among competing products. Tools that rank competing products by the satisfaction of consumers who have already purchased them are thus also becoming popular. We tackle the problem of rating consumer reviews (i.e., attributing to each a numerical satisfaction score) based on their textual content. We focus on multi-facet review rating, i.e., the case in which the review of a product (e.g., a hotel) must be rated several times, once for each of several aspects of the product (for a hotel: cleanliness, centrality of location, etc.). We explore several aspects of the problem, with special emphasis on how to generate vectorial representations of the text by means of POS tagging, sentiment analysis, and feature selection for ordinal regression learning. We present the results of experiments conducted on a dataset of more than 15,000 reviews that we crawled from a popular hotel review site.
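To make the task concrete, the following is a minimal sketch of multi-facet rating on toy data. It is not the pipeline evaluated in the paper: POS tagging and sentiment lexicons are omitted, feature selection is replaced by a simple score (how far a term's mean rating lies from the global mean, weighted by document frequency), and ordinal regression is replaced by averaging the mean ratings of the selected terms and rounding to the rating scale. All function names, the toy reviews, and the 1-5 scale are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def tokenize(text):
    """Lowercase whitespace tokenizer that keeps alphabetic tokens only."""
    return [t for t in text.lower().split() if t.isalpha()]


def train_facet(reviews, ratings, k=2):
    """Fit a toy model for ONE facet: keep the k terms whose mean rating
    deviates most from the global mean (a stand-in for feature selection)."""
    by_term = defaultdict(list)
    for text, rating in zip(reviews, ratings):
        for term in set(tokenize(text)):
            by_term[term].append(rating)
    global_mean = mean(ratings)
    scores = {t: abs(mean(rs) - global_mean) * len(rs) for t, rs in by_term.items()}
    # Sort by descending score; break ties alphabetically for determinism.
    selected = sorted(scores, key=lambda t: (-scores[t], t))[:k]
    return {"term_mean": {t: mean(by_term[t]) for t in selected},
            "default": global_mean}


def predict_facet(model, text, lo=1, hi=5):
    """Average the mean ratings of the selected terms found in the text and
    round onto the ordinal scale [lo, hi] (a stand-in for ordinal regression)."""
    hits = [model["term_mean"][t] for t in set(tokenize(text))
            if t in model["term_mean"]]
    score = mean(hits) if hits else model["default"]
    return min(hi, max(lo, round(score)))


def train(reviews, facet_ratings, k=2):
    """Train one independent model per facet."""
    return {facet: train_facet(reviews, rs, k) for facet, rs in facet_ratings.items()}


def predict(models, text):
    """Rate the same review once per facet."""
    return {facet: predict_facet(m, text) for facet, m in models.items()}


# Hypothetical toy data: four hotel reviews, each rated on two facets.
reviews = [
    "spotless room but far from the centre",
    "dirty room great central location",
    "spotless and central",
    "dirty and far away",
]
facet_ratings = {
    "cleanliness": [5, 1, 5, 1],
    "location": [1, 5, 5, 1],
}

models = train(reviews, facet_ratings, k=2)
print(predict(models, "spotless but far away"))
```

The point of the sketch is that a single review text yields a separate rating for each facet, and that each facet ends up relying on different terms. Treating facets independently is a simplification chosen here for brevity.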

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Stefano Baccianella (1)
  • Andrea Esuli (1)
  • Fabrizio Sebastiani (1)
  1. Istituto di Scienza e Tecnologia dell’Informazione, Consiglio Nazionale delle Ricerche, Pisa, Italy
