Detection and visualization of misleading content on Twitter

  • Christina Boididou
  • Symeon Papadopoulos
  • Markos Zampoglou
  • Lazaros Apostolidis
  • Olga Papadopoulou
  • Yiannis Kompatsiaris
Regular Paper

Abstract

The problems of online misinformation and fake news have gained increasing prominence in an age where user-generated content and social media platforms are key forces in the shaping and diffusion of news stories. Unreliable information and misleading content are often posted and widely disseminated through popular social media platforms such as Twitter and Facebook. As a result, journalists and editors need new tools that can help them speed up the verification of content sourced from social media. Motivated by this need, we present a system that supports the automatic classification of multimedia Twitter posts as credible or misleading. The system leverages credibility-oriented features extracted from the tweet and from the user who published it, and trains a two-step classification model based on a novel semisupervised learning scheme. This scheme uses the agreement between two independent pretrained models on new posts as a guiding signal for retraining the classification model. We analyze a large labeled dataset of tweets that shared debunked fake and confirmed real images and videos, and show that integrating the newly proposed features, using bagging in the initial classifiers, and applying the semisupervised learning scheme significantly improve classification accuracy. Moreover, we present a Web-based application for visualizing and communicating the classification results to end users.
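
The agreement-based retraining idea described in the abstract can be illustrated with a minimal sketch. It is one plausible reading of the scheme, not the authors' implementation: the nearest-centroid classifier is a deliberately simple stand-in for the bagged classifiers the abstract mentions, and the function names (`fit_centroids`, `predict`, `agreement_retraining`), the one-dimensional toy features, and the choice to retrain on the tweet-feature view are all assumptions made for illustration.

```python
from statistics import mean


def fit_centroids(rows, labels):
    """Return {label: centroid} for a list of numeric feature vectors."""
    centroids = {}
    for label in set(labels):
        members = [r for r, l in zip(rows, labels) if l == label]
        centroids[label] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Assign the label of the closest centroid (squared Euclidean distance)."""
    def dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: dist(centroids[label]))


def agreement_retraining(tweet_X, user_X, y, new_tweet_X, new_user_X):
    """Label new posts using the agreement of two models as a retraining signal.

    Assumption: the two "independent" models are one trained on tweet
    features and one trained on user features.
    """
    # Step 1: train the two models independently on the labeled data.
    tweet_model = fit_centroids(tweet_X, y)
    user_model = fit_centroids(user_X, y)
    p_tweet = [predict(tweet_model, r) for r in new_tweet_X]
    p_user = [predict(user_model, r) for r in new_user_X]

    # Step 2: posts on which both models agree are treated as pseudo-labeled
    # examples and added to the training set.
    agreed = {i for i in range(len(p_tweet)) if p_tweet[i] == p_user[i]}
    train_X = tweet_X + [new_tweet_X[i] for i in sorted(agreed)]
    train_y = y + [p_tweet[i] for i in sorted(agreed)]
    retrained = fit_centroids(train_X, train_y)

    # Step 3: agreed posts keep the agreed label; the remaining posts are
    # classified by the retrained model.
    return [
        p_tweet[i] if i in agreed else predict(retrained, new_tweet_X[i])
        for i in range(len(p_tweet))
    ]


# Hypothetical toy data: one feature per view, four labeled posts.
tweet_X = [[0.0], [1.0], [9.0], [10.0]]
user_X = [[0.0], [1.0], [9.0], [10.0]]
y = ["real", "real", "fake", "fake"]

# Three new posts; the models disagree on the third one.
new_tweet_X = [[1.0], [9.0], [6.0]]
new_user_X = [[0.5], [9.5], [1.0]]

result = agreement_retraining(tweet_X, user_X, y, new_tweet_X, new_user_X)
```

In this toy run the first two new posts are labeled by agreement, and only the third, on which the tweet-based and user-based models disagree, is passed to the retrained model. A real system would replace the centroid classifier with stronger learners and validate the pseudo-labels before trusting them.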

Keywords

  • Social media
  • Verification
  • Fake detection
  • Information credibility

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2017

Authors and Affiliations

  1. Urban Big Data Centre, Glasgow, UK
  2. CERTH-ITI, Thessaloniki, Greece
