Journal of Intelligent Information Systems, Volume 44, Issue 1, pp 1–47

Algorithms and criteria for diversification of news article comments

  • Giorgos Giannopoulos
  • Marios Koniaris
  • Ingmar Weber
  • Alejandro Jaimes
  • Timos Sellis


In this paper, we introduce an approach for diversifying user comments on news articles. We claim that, although content diversity suffices for the keyword search setting, as shown by existing work on search result diversification, it is not enough when it comes to diversifying comments on news articles. Thus, in our proposed framework, we define comment-specific diversification criteria and use them to extract the respective diversification dimensions in the form of feature vectors. These criteria involve content similarity, sentiment expressed within comments, named entities, quality of comments, and combinations of them. We then apply diversification to the comments, utilizing the extracted feature vectors. The outcome of this process is a subset of the initial set that contains heterogeneous comments, representing different aspects of the news article, different sentiments expressed, different writing quality, etc. We perform an experimental analysis showing that the diversity criteria we introduce result in distinctively diverse subsets of comments, as opposed to the baseline of diversifying comments only with respect to their content. We also present a prototype system that implements our diversification framework on news article comments.
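The selection step described above can be illustrated with a minimal sketch. It assumes a greedy max-min heuristic over cosine distance, which is one common way to pick a diverse subset and is not necessarily the algorithm used in the paper. The function names (`cosine_distance`, `greedy_diversify`) and the three-dimensional feature layout (two content dimensions plus one sentiment score) are hypothetical and chosen only for illustration.

```python
import math


def cosine_distance(u, v):
    """Return 1 - cosine similarity; a zero vector is treated as distance 1."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def greedy_diversify(vectors, k, distance=cosine_distance):
    """Return indices of up to k vectors chosen by a greedy max-min heuristic.

    Assumption: this is an illustrative heuristic, not the paper's method.
    """
    n = len(vectors)
    if k <= 0 or n == 0:
        return []
    k = min(k, n)
    if k == 1:
        return [0]

    # Seed the subset with the most distant pair of comments.
    best = (-1.0, 0, 1)
    for i in range(n):
        for j in range(i + 1, n):
            d = distance(vectors[i], vectors[j])
            if d > best[0]:
                best = (d, i, j)
    selected = [best[1], best[2]]

    # Repeatedly add the comment whose nearest selected neighbour is farthest.
    while len(selected) < k:
        candidates = [i for i in range(n) if i not in selected]
        nxt = max(
            candidates,
            key=lambda i: min(distance(vectors[i], vectors[s]) for s in selected),
        )
        selected.append(nxt)
    return selected


# Hypothetical feature vectors: [content_dim_a, content_dim_b, sentiment].
comments = [
    [1.0, 0.0, 0.9],   # topic A, positive
    [0.9, 0.1, 0.8],   # near-duplicate of the first comment
    [0.0, 1.0, 0.9],   # topic B, positive
    [1.0, 0.0, -0.9],  # topic A, negative
]

chosen = greedy_diversify(comments, 3)
print(sorted(chosen))
```

With these toy vectors, the near-duplicate comment is the one left out when three comments are requested, because it adds little distance to the comments already selected. Swapping in a different `distance` function is how a different diversification criterion (for example, sentiment only) could be plugged into the same sketch.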


Keywords (machine-generated, not supplied by the authors): News Article; Positive Sentiment; Candidate Comment; Diversification Process; Sentiment Class



This research is conducted as part of the EU project ARCOMEM (FP7-ICT-270239).



Copyright information

© Springer Science+Business Media New York 2014

Authors and Affiliations

  • Giorgos Giannopoulos (1, email author)
  • Marios Koniaris (2)
  • Ingmar Weber (3)
  • Alejandro Jaimes (4)
  • Timos Sellis (5)

  1. IMIS Institute, “Athena” Research Center, Athens, Greece
  2. School of ECE, National Technical University of Athens, Athens, Greece
  3. Qatar Computing Research Institute, Doha, Qatar
  4. Yahoo! Research, Barcelona, Spain
  5. School of CSIT, RMIT University, Melbourne, Australia
