Determining the Veracity of Rumours on Twitter

  • Georgios Giasemidis
  • Colin Singleton
  • Ioannis Agrafiotis
  • Jason R. C. Nurse
  • Alan Pilgrim
  • Chris Willis
  • D. V. Greetham
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10046)

Abstract

While social networks can provide an ideal platform for up-to-date information from individuals across the world, they have also proved to be a place where rumours fester and accidental or deliberate misinformation often emerges. In this article, we aim to support the task of making sense of social media data and, specifically, seek to build an autonomous message classifier that filters relevant and trustworthy information from Twitter. For our work, we collected about 100 million public tweets, including users’ past tweets, from which we identified 72 rumours (41 true, 31 false). We considered over 80 trustworthiness measures, including the authors’ profiles and past behaviour, the social network connections (graphs), and the content of the tweets themselves. We ran modern machine-learning classifiers over those measures to produce trustworthiness scores at various time windows from the outbreak of the rumour. These time windows were key, as they allowed useful insight into the progression of the rumours. Our findings indicate that our model was significantly more accurate than those of similar studies in the literature. We also identified critical attributes of the data that give rise to the trustworthiness scores assigned. Finally, we developed a software demonstration that provides a visual user interface to allow the user to examine the analysis.
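
The idea of scoring a rumour at successive time windows from its outbreak can be illustrated with a minimal sketch. It is not the authors' implementation: the `Tweet` fields, the two aggregate measures, and the weights are all hypothetical, and a hand-weighted logistic function stands in for a trained classifier. Each window is treated as cumulative, covering all tweets from the outbreak up to the cut-off, which is also an assumption.

```python
import math
from dataclasses import dataclass


@dataclass
class Tweet:
    """Toy record; the field names are illustrative, not taken from the paper."""
    minutes_since_outbreak: float
    author_followers: int
    has_url: bool


def window_features(tweets, window_minutes):
    """Aggregate two simple measures over tweets posted up to the cut-off."""
    in_window = [t for t in tweets if t.minutes_since_outbreak <= window_minutes]
    if not in_window:
        return None  # nothing observed yet, so no score can be produced
    n = len(in_window)
    mean_followers = sum(t.author_followers for t in in_window) / n
    return {
        "log_mean_followers": math.log1p(mean_followers),
        "url_fraction": sum(t.has_url for t in in_window) / n,
    }


def trust_score(features, weights, bias):
    """Logistic score in (0, 1); the weights are placeholders, not fitted values."""
    z = bias + sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def scores_by_window(tweets, windows, weights, bias):
    """Return {window: score} for every window containing at least one tweet."""
    result = {}
    for w in windows:
        feats = window_features(tweets, w)
        if feats is not None:
            result[w] = trust_score(feats, weights, bias)
    return result


if __name__ == "__main__":
    tweets = [Tweet(5, 120, False), Tweet(40, 15000, True), Tweet(130, 900, True)]
    weights = {"log_mean_followers": 0.3, "url_fraction": 1.2}  # hypothetical
    for window, score in scores_by_window(tweets, [60, 120, 180], weights, -2.0).items():
        print(f"first {window} min: score {score:.3f}")
```

In a real pipeline, the placeholder weights would be replaced by a classifier fitted separately for each window on labelled rumours, so that the score reflects only the information available up to that point in time.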

Keywords

Logistic Regression · Propagation Graph · Random Forest Model · Decision Tree Algorithm · Trustworthiness Score
These keywords were added by machine and not by the authors.

Notes

Acknowledgements

This work was partly supported by UK Defence Science and Technology Labs under Centre for Defence Enterprise grant CDE42008. We thank Andrew Middleton for his helpful comments during the project. We would also like to thank Nathaniel Charlton and Matthew Edgington for their assistance in collecting and preprocessing part of the data.

Copyright information

© Springer International Publishing AG 2016

Authors and Affiliations

  • Georgios Giasemidis (1)
  • Colin Singleton (1)
  • Ioannis Agrafiotis (2)
  • Jason R. C. Nurse (2)
  • Alan Pilgrim (3)
  • Chris Willis (3)
  • D. V. Greetham (4)

  1. CountingLab Ltd., Reading, UK
  2. Department of Computer Science, University of Oxford, Oxford, UK
  3. BAE Systems Applied Intelligence, Chelmsford, UK
  4. Department of Mathematics and Statistics, University of Reading, Reading, UK
