Verifying information with multimedia content on twitter

A comparative study of automated approaches

Abstract

An increasing number of posts on social media are used to disseminate news and are accompanied by multimedia content. Such content may be misleading or digitally manipulated. More often than not, such pieces of content reach the front pages of major news outlets, with a detrimental effect on their credibility. To avoid such effects, there is a profound need for automated methods that can help debunk and verify online content in a very short time. To this end, we present a comparative study of three such methods tailored to Twitter, a major social media platform used for news sharing. These are: a) a method that uses textual patterns to extract claims about whether a tweet is fake or real, together with attribution statements about the source of the content; b) a method that exploits the observation that tweets on the same topic should also be similar in terms of credibility; and c) a method that uses a semi-supervised learning scheme that leverages the decisions of two independent credibility classifiers. We perform a comprehensive comparative evaluation of these approaches on the datasets released by the Verifying Multimedia Use (VMU) task, organized in the context of the 2015 and 2016 MediaEval benchmarks. In addition to comparing the three methods, we devise and evaluate a combined method based on their outputs, which outperforms all three of them. We discuss these findings and provide insights to guide future generations of verification tools for media professionals.

Notes

  1. http://www.theguardian.com/uk/interactive/2011/dec/07/london-riots-twitter

  2. Exif metadata contain information about the date, time and location an image was taken, the model of the device, and copyright information, which can be very useful when assessing the credibility of multimedia content [34].

  3. http://multimediaeval.org/

  4. https://code.google.com/p/language-detection/

  5. http://onlineslangdictionary.com/word-list/0-a/

  6. http://www.languagerealm.com/spanish/spanishslang.php

  7. http://nlp.stanford.edu/software/lex-parser.shtml

  8. http://simple.wikipedia.org/wiki/Flesch_Reading_Ease

  9. http://download.geonames.org/export/dump/cities1000.zip

  10. https://www.mywot.com/

  11. http://wwwranking.webdatacommons.org/more.html

  12. http://data.alexa.com/data?cli=10&dat=snbamz&url=google.gr

  13. http://rosettacode.org/wiki/Levenshtein_distance#Java

  14. https://github.com/MKLab-ITI/image-verification-corpus/

References

  1. Aiello LM, Petkos G, Martín CJ, Corney D, Papadopoulos S, Skraba R, Göker A, Kompatsiaris I, Jaimes A (2013) Sensing trending topics in twitter. IEEE Trans Multimedia 15(6):1268–1282. https://doi.org/10.1109/TMM.2013.2265080

  2. Ardizzone E, Bruno A, Mazzola G (2015) Copy move forgery detection by matching triangles of keypoints. IEEE Trans Inf Forensics Secur 10(10):2084–2094

  3. Bianchi T, Piva A (2012) Image forgery localization via block-grained analysis of jpeg artifacts. IEEE Trans Inf Forensics Secur 7(3):1003–1017

  4. Boididou C, Andreadou K, Papadopoulos S, Dang-Nguyen DT, Boato G, Riegler M, Kompatsiaris Y (2015) Verifying multimedia use at mediaeval 2015. In: Mediaeval 2015 workshop, September 14–15, 2015, Wurzen, Germany

  5. Boididou C, Papadopoulos S, Apostolidis L, Kompatsiaris Y (2017) Learning to detect misleading content on twitter. In: Proceedings of the 2017 ACM on international conference on multimedia retrieval. ACM, pp 278–286

  6. Boididou C, Papadopoulos S, Dang-Nguyen D, Boato G, Riegler M, Middleton SE, Petlund A, Kompatsiaris Y (2016) Verifying multimedia use at mediaeval 2016. In: Working notes proceedings of the mediaeval 2016 workshop, Hilversum, The Netherlands, October 20–21, 2016. http://ceur-ws.org/Vol-1739/MediaEval_2016_paper_3.pdf

  7. Boididou C, Papadopoulos S, Dang-Nguyen DT, Boato G, Kompatsiaris Y (2015) The CERTH-UNITN participation @ verifying multimedia use 2015. In: Working notes proceedings of the mediaeval 2015 workshop. Wurzen, Germany

  8. Boididou C, Papadopoulos S, Kompatsiaris Y, Schifferes S, Newman N (2014) Challenges of computational verification in social multimedia. In: Proceedings of the companion publication of the 23rd international conference on world wide web companion, pp 743–748. https://doi.org/10.1145/2567948.2579323

  9. Canini KR, Suh B, Pirolli PL (2011) Finding credible information sources in social networks based on content and social structure. In: Privacy, security, risk and trust (PASSAT), 2011 IEEE third international conference on and 2011 IEEE third international conference on social computing (SocialCom). IEEE, pp 1–8

  10. Carton S, Adar E, Park S, Mei Q, Zeffer N, Resnick P (2015) Audience analysis for competing memes in social media. In: Ninth international AAAI conference on web and social media

  11. Castillo C, Mendoza M, Poblete B (2011) Information credibility on twitter. In: Proceedings of the 20th international conference on world wide web. ACM, pp 675–684

  12. Conotter V, Boato G, Farid H (2010) Detecting photo manipulation on signs and billboards. In: IEEE international conference on image processing, pp 1741–1744

  13. Conotter V, Dang-Nguyen DT, Riegler M, Boato G, Larson M (2014) A crowdsourced data set of edited images online. In: Proceedings of the 2014 international ACM workshop on crowdsourcing for multimedia, crowdMM ’14. ACM, New York, pp 49–52. https://doi.org/10.1145/2660114.2660120

  14. Ferrara P, Bianchi T, Rosa AD, Piva A (2012) Image forgery localization via fine-grained analysis of cfa artifacts. IEEE Trans Inf Forensics Secur 7(5):1566–1577

  15. Gupta A, Kumaraguru P (2012) Twitter explodes with activity in mumbai blasts! a lifeline or an unmonitored daemon in the lurking? In: IIIT, Delhi, Technical report, IIITD-TR-2011-005

  16. Gupta A, Kumaraguru P, Castillo C, Meier P (2014) Tweetcred: a real-time web-based system for assessing credibility of content on twitter. In: Proceedings 6th international conference on social informatics (socinfo)

  17. Gupta A, Lamba H, Kumaraguru P, Joshi A (2013) Faking sandy: characterizing and identifying fake images on twitter during hurricane sandy. In: Proceedings of the 22nd international conference on world wide web companion. International world wide web conferences steering committee, pp 729–736

  18. Li J, Li X, Yang B, Sun X (2015) Segmentation-based image copy-move forgery detection scheme. IEEE Trans Inf Forensics Secur 10(3):507–518

  19. Jin Z, Cao J, Jiang YG, Zhang Y (2014) News credibility evaluation on microblog with a hierarchical propagation model. In: 2014 IEEE international conference on data mining (ICDM). IEEE, pp 230–239

  20. Jin Z, Cao J, Zhang Y, Luo J (2016) News verification by exploiting conflicting social viewpoints in microblogs. In: AAAI 2016. AAAI

  21. Jin Z, Cao J, Zhang Y, Zhang Y (2015) MCG-ICT at Mediaeval 2015: Verifying multimedia use with a two-level classification model. In: Working notes proceedings of the MediaEval 2015 workshop. Wurzen, Germany

  22. Kee E, Johnson MK, Farid H (2011) Digital image authentication from jpeg headers. IEEE Trans Inf Forensics Secur 6(3-2):1066–1075

  23. Li W, Yuan Y, Yu N (2009) Passive detection of doctored jpeg image via block artifact grid extraction. Signal Process 89(9):1821–1829

  24. Luo W, Huang J, Qiu G (2010) JPEG Error analysis and its applications to digital image forensics. IEEE Trans Inf Forensics Secur 5(3):480–491

  25. Martinez-Romo J, Araujo L (2013) Detecting malicious tweets in trending topics using a statistical analysis of language. Expert Syst Appl 40(8):2992–3000

  26. Middleton SE (2015) Extracting attributed verification and debunking reports from social media: MediaEval-2015 trust and credibility analysis of image and video. In: Working notes proceedings of the MediaEval 2015 workshop. Wurzen, Germany

  27. Middleton SE, Middleton L, Modafferi S (2014) Real-time crisis mapping of natural disasters using social media. IEEE Intell Syst 29(2):9–17

  28. O’Brien JF, Farid H (2012) Exposing photo manipulation with inconsistent reflections. ACM Trans Graph 31(1):4:1–4:11

  29. O’Donovan J, Kang B, Meyer G, Hollerer T, Adali S (2012) Credibility in context: an analysis of feature distributions in twitter. In: Privacy, security, risk and trust (PASSAT), 2012 international conference on and 2012 international conference on social computing (SocialCom). IEEE, pp 293–301

  30. Pasquini C, Boato G, Perez-Gonzalez F (2014) Multiple jpeg compression detection by means of benford-fourier coefficients. In: Proceedings of the workshop on information forensics and security. IEEE

  31. Pasquini C, Perez-Gonzalez F, Boato G (2014) A benford-fourier jpeg compression detector. In: Proceedings of the international conference on image processing. IEEE, pp 5322–5326

  32. Procter R, Vis F, Voss A (2013) Reading the riots on twitter: methodological innovation for the analysis of big data. Int J Soc Res Methodol 16(3):197–214

  33. Ratkiewicz J, Conover M, Meiss M, Gonçalves B, Patil S, Flammini A, Menczer F (2011) Truthy: mapping the spread of astroturf in microblog streams. In: Proceedings of the 20th international conference companion on world wide web. ACM, pp 249–252

  34. Silverman C (ed) (2014) Verification handbook: a definitive guide to verifying digital content for emergency coverage. European Journalism Centre

  35. Starbird K, Muzny G, Palen L (2012) Learning from the crowd: collaborative filtering techniques for identifying on-the-ground twitterers during mass disruptions. In: Proceedings of the 9th international conference on information systems crisis response management Iscram

  36. Stringhini G, Kruegel C, Vigna G (2010) Detecting spammers on social networks. In: Proceedings of the 26th annual computer security applications conference. ACM, pp 1–9

  37. Tsakalidis A, Papadopoulos S, Kompatsiaris I (2014) An ensemble model for cross-domain polarity classification on twitter. In: Web information systems engineering–WISE 2014. Springer, pp 168–177

  38. Zampoglou M, Papadopoulos S, Kompatsiaris Y (2015) Detecting image splicing in the wild (WEB). In: 2015 IEEE international conference on multimedia & expo workshops, ICME workshops 2015, Turin, Italy, June 29–July 3, 2015, pp 1–6. https://doi.org/10.1109/ICMEW.2015.7169839

  39. Zampoglou M, Papadopoulos S, Kompatsiaris Y (2017) Large-scale evaluation of splicing localization algorithms for web images. Multimed Tools Appl 76(4):4801–4834

  40. Zhao Z, Resnick P, Mei Q (2015) Enquiring minds: early detection of rumors in social media from enquiry posts. In: Proceedings of the 24th international conference on world wide web. International world wide web conferences steering committee, pp 1395–1405

  41. Zubiaga A, Aker A, Bontcheva K, Liakata M, Procter R (2017) Detection and resolution of rumours in social media: a survey. arXiv:1704.00656

Acknowledgements

This work has been supported by the REVEAL and InVID projects, partially funded by the European Commission (FP7-610928 and H2020-687786 respectively).

Author information

Corresponding author

Correspondence to Symeon Papadopoulos.

About this article

Cite this article

Boididou, C., Middleton, S.E., Jin, Z. et al. Verifying information with multimedia content on twitter. Multimed Tools Appl 77, 15545–15571 (2018). https://doi.org/10.1007/s11042-017-5132-9

Keywords

  • Fake detection
  • Verification
  • Credibility
  • Veracity
  • Trust
  • Social media
  • Twitter
  • Multimedia