Sifting useful comments from Flickr Commons and YouTube

International Journal on Digital Libraries

Abstract

Cultural institutions are increasingly contributing content to social media platforms to raise awareness and promote use of their collections. They also often receive user comments containing information that could be incorporated into their catalog records. However, not all user-generated comments are suitable for enriching metadata records, and judging the usefulness of a large number of comments is a labor-intensive task. Accordingly, our aim was to provide automated support for curating potentially useful social media comments on digital objects. In this paper, we examine the notion of usefulness in the context of social media comments and compare it from the perspectives of end-users and expert users. We then introduce a machine-learning approach that automatically classifies comments according to their usefulness, using syntactic and semantic comment features while taking user context into consideration. We present the results of an experiment conducted on user comments collected from Flickr Commons collections and YouTube, and then study how the commenting culture of a platform (YouTube and Flickr) relates to usefulness prediction. Our findings indicate that a few relatively straightforward features can be used to infer which comments are useful. However, the influence of features on usefulness classification may vary with the commenting culture of each platform.

Notes

  1. Library of Congress Flickr Pilot Project Report Summary http://www.loc.gov/rr/print/flickr_report_final_summary

  2. http://www.tate.org.uk/about/our-work/digital/social-media-directory

  3. Source: Library of Congress Flickr Pilot Project Report Summary, http://www.loc.gov/rr/print/flickr_report_final_summary.

  4. “The key goals of The Commons on Flickr are to first show users hidden treasures in the world’s public photography archives, and, second, to show how users’ input and knowledge can help make these collections even richer. Users are invited to help describe the photographs they discover in The Commons on Flickr, either by adding tags or leaving comments.” www.flickr.com/commons

  5. http://www.flickr.com/photos/library_of_congress/2850357813/comment72157607279573241

  6. http://www.flickr.com/photos/library_of_congress/2536790306/comment72157629444651496

  7. http://www.youtube.com/watch?v=Yka3M4uvUyo

  8. http://www.youtube.com/watch?v=d2qamDMs-3g

  9. http://alias-i.com/lingpipe/

  10. Flickr photo - April 15, 1901 http://www.flickr.com/photos/nlireland/6933777014/comment72157629836757055.

  11. Flickr photo - Jimmy Clabby. Boxing http://www.flickr.com/photos/library_of_congress/2163449292/comment72157603820313375

  12. http://gate.ac.uk

  13. Flickr photo—(Clara Runkel) Mrs. Oscar F. Grab http://www.flickr.com/photos/library_of_congress/6851810917/comment72157629260546153

  14. Flickr photo - Paris Exposition: Hungarian Pavilion, Paris, France, 1900 http://www.flickr.com/photos/brooklyn_museum/2486821878/comment72157613666119960

  15. PROHIBITION DOCUMENTARY. http://www.youtube.com/watch?v=OiYqFXmVAFg

  16. We also experimented with different numbers of topics (10, 100, and 500) for training the LDA model. However, our results, discussed in the Experiments section, showed that training the LDA model with 1,000 topics is the most influential setting.

  17. YouTube Video – Martin Luther King, Jr. - Mini Bio http://www.youtube.com/watch?v=3ank52Zi_S0

  18. These are user accounts which have the pattern “Name (LOC P&P)” and use the Library of Congress logo

Acknowledgments

This work was supported in part by a Marie Curie International Outgoing Fellowship within the 7th European Community Framework Program (PIOF-GA-2009-252206). We also thank members of the Library of Congress, especially Helena Zinkham, for their insightful comments and advice on the results of the LOC project on Flickr Commons, and the anonymous reviewers for their feedback and suggestions.

Author information

Corresponding author

Correspondence to Elaheh Momeni.

About this article

Cite this article

Momeni, E., Haslhofer, B., Tao, K. et al. Sifting useful comments from Flickr Commons and YouTube. Int J Digit Libr 16, 161–179 (2015). https://doi.org/10.1007/s00799-014-0123-1

  • DOI: https://doi.org/10.1007/s00799-014-0123-1
