International Journal on Digital Libraries, Volume 16, Issue 2, pp 161–179

Sifting useful comments from Flickr Commons and YouTube

  • Elaheh Momeni
  • Bernhard Haslhofer
  • Ke Tao
  • Geert-Jan Houben


Cultural institutions are increasingly contributing content to social media platforms to raise awareness and promote use of their collections. In return, they often receive user comments containing information that may be incorporated into their catalog records. However, not all user-generated comments can be used to enrich metadata records, and judging the usefulness of a large number of user comments is a labor-intensive task. Accordingly, our aim was to provide automated support for the curation of potentially useful social media comments on digital objects. In this paper, the notion of usefulness is examined in the context of social media comments and compared from the perspective of both end-users and expert users. A machine-learning approach is then introduced to automatically classify comments according to their usefulness. This approach uses syntactic and semantic comment features while taking user context into consideration. We present the results of an experiment we conducted on user comments collected from Flickr Commons collections and YouTube. A study is then carried out on the correlation between the commenting culture of a platform (YouTube and Flickr) and usefulness prediction. Our findings indicate that a few relatively straightforward features can be used for inferring useful comments. However, the influence of features on usefulness classification may vary according to the commenting cultures of platforms.


User-generated comment Social media Usefulness  Prediction YouTube Flickr 



This work was supported in part by a Marie Curie International Outgoing Fellowship within the 7th European Community Framework Program (PIOF-GA-2009-252206). We also thank the members of the Library of Congress, especially Helena Zinkham, for their insightful comments and advice on the result of the LOC project on Flickr Commons, and the anonymous reviewers for their feedback and suggestions.



Copyright information

© Springer-Verlag Berlin Heidelberg 2014

Authors and Affiliations

  • Elaheh Momeni (1)
  • Bernhard Haslhofer (2)
  • Ke Tao (3)
  • Geert-Jan Houben (3)

  1. Faculty of Computer Science, University of Vienna, Vienna, Austria
  2. Austrian Institute of Technology, Vienna, Austria
  3. Department of Software and Computer Technology, Delft University of Technology, Delft, The Netherlands
