Using an Information Quality Framework to Evaluate the Quality of Product Reviews

  • You-De Tseng
  • Chien Chin Chen
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5839)


The prevalence of Web 2.0 makes the Web an invaluable source of information. For instance, product reviews written by many independent Internet reviewers can help consumers make purchase decisions and enable manufacturers to improve their business strategies. As the number of reviews is increasing exponentially, opinion mining is needed to identify important reviews and opinions for users. Most opinion mining approaches try to extract sentiment-bearing or bipolar expressions from a large volume of reviews. However, the mining process often ignores the quality of each review and may retrieve useless or even noisy reviews. In this paper, we propose a method for evaluating the quality of information in product reviews. We treat review quality evaluation as a classification problem and employ an effective information quality framework to extract representative review features. Experiments based on an expert-composed data corpus demonstrate that the proposed method significantly outperforms state-of-the-art approaches.
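As a rough illustration of what framing review quality evaluation as a classification problem can look like, the sketch below maps each review to a small feature vector and assigns a quality label with a nearest-centroid rule. Everything in it is an assumption made for illustration: the three features, the helper names (`extract_features`, `train_centroids`, `classify`), the toy reviews, and the nearest-centroid classifier itself. None of it reproduces the features or the classifier used in the paper.

```python
from statistics import mean

def extract_features(review: str) -> list[float]:
    """Map a review to a numeric feature vector (hypothetical features, not the paper's)."""
    words = review.split()
    # Treat '.', '!' and '?' as sentence boundaries and drop empty fragments.
    normalized = review.replace("!", ".").replace("?", ".")
    sentences = [s for s in normalized.split(".") if s.strip()]
    n_words = len(words)
    numeric = sum(w.strip(".,!?").isdigit() for w in words)
    return [
        float(n_words),                     # amount of information
        n_words / max(len(sentences), 1),   # average sentence length
        numeric / max(n_words, 1),          # share of numeric tokens
    ]

def train_centroids(reviews, labels):
    """Average the feature vectors of the training reviews for each label."""
    by_label = {}
    for review, label in zip(reviews, labels):
        by_label.setdefault(label, []).append(extract_features(review))
    return {label: [mean(col) for col in zip(*vecs)]
            for label, vecs in by_label.items()}

def classify(review, centroids):
    """Return the label whose centroid is nearest in squared Euclidean distance."""
    x = extract_features(review)
    return min(
        centroids,
        key=lambda label: sum((a - b) ** 2 for a, b in zip(x, centroids[label])),
    )

# Toy training data, invented for this sketch.
train_reviews = [
    "The battery lasts 10 hours and the screen is sharp. Autofocus is fast in low light.",
    "Great!!!",
    "The lens is 24 mm wide and the zoom covers 5 x. Build quality feels solid.",
    "Bad. Do not buy.",
]
train_labels = ["high", "low", "high", "low"]

centroids = train_centroids(train_reviews, train_labels)
print(classify("Awful.", centroids))
```

The features here are left unscaled, so word count dominates the distance. A realistic setup would normalize the features and would likely use a stronger learner trained on labeled reviews.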


Keywords: Text Mining, Classification, Opinion Mining





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • You-De Tseng (1)
  • Chien Chin Chen (1)
  1. Department of Information Management, National Taiwan University, Taipei, Taiwan (R.O.C.)
