Abstract
While social networks can provide an ideal platform for up-to-date information from individuals across the world, they have also proved to be a place where rumours fester and accidental or deliberate misinformation often emerges. In this article, we aim to support the task of making sense of social media data, and specifically seek to build an autonomous message-classifier that filters relevant and trustworthy information from Twitter. For our work, we collected about 100 million public tweets, including users’ past tweets, from which we identified 72 rumours (41 true, 31 false). We considered over 80 trustworthiness measures, including the authors’ profiles and past behaviour, the social network connections (graphs), and the content of the tweets themselves. We ran modern machine-learning classifiers over these measures to produce trustworthiness scores at various time windows from the outbreak of a rumour. Such time windows were key, as they gave useful insight into the progression of the rumours. Our model proved significantly more accurate than those of similar studies in the literature. We also identified the critical attributes of the data that give rise to the trustworthiness scores assigned. Finally, we developed a software demonstration with a visual user interface that allows the user to examine the analysis.
Notes
5. We abandoned neural networks at an early stage because the library implementation we used was very slow and its results underperformed.
6. A phenomenon known as the “wisdom of the crowd”.
7. We compute the average of the first eight models because this is the range in which the classifiers’ performance peaks. As we argue in Sect. 3, all plots indicate that classifier performance decreases once more than eight features are added.
References
Cambridge Advanced Learner’s Dictionary and Thesaurus. Cambridge University Press. http://dictionary.cambridge.org/dictionary/english/rumour
Bishop, C.M.: Pattern Recognition and Machine Learning. Information Science and Statistics. Springer, New York (2006)
Castillo, C., Mendoza, M., Poblete, B.: Information credibility on Twitter. In: Proceedings of the 20th International Conference on World Wide Web, pp. 675–684. ACM (2011)
Castillo, C., Mendoza, M., Poblete, B.: Predicting information credibility in time-sensitive social media. Internet Res. 23(5), 560–588 (2013)
Pennebaker, J.W., Booth, R.J., Boyd, R.L., Francis, M.E.: Linguistic Inquiry and Word Count: LIWC 2015. Pennebaker Conglomerates, Austin (2015). www.LIWC.net
Finn, S., Metaxas, P.T., Mustafaraj, E.: Investigating rumor propagation with TwitterTrails. arXiv preprint arXiv:1411.3550 (2014)
Fox, J.: Applied Regression Analysis, Linear Models, and Related Methods. Sage Publications, London (1997)
Gil, Y., Artz, D.: Towards content trust of web resources. Web Semant. Sci. Serv. Agents World Wide Web 5(4), 227–239 (2007)
Guyon, I., Elisseeff, A.: An introduction to variable and feature selection. J. Mach. Learn. Res. 3, 1157–1182 (2003)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning. Springer, New York (2009)
Kelton, K., Fleischmann, K., Wallace, W.: Trust in digital information. J. Am. Soc. Inf. Sci. Technol. 59(3), 363–374 (2008)
Koller, D., Friedman, N.: Probabilistic Graphical Models: Principles and Techniques. The MIT Press, Cambridge (2009)
Kwon, S., Cha, M., Jung, K., Chen, W., Wang, Y.: Prominent features of rumor propagation in online social media. In: 2013 IEEE 13th International Conference on Data Mining, pp. 1103–1108. IEEE (2013)
Lomax, G.R., Hahs-Vaughn, D.L.: An Introduction to Statistical Concepts. Routledge, New York (2012)
Lukyanenko, R., Parsons, J.: Information quality research challenge: adapting information quality principles to user-generated content. J. Data Inf. Qual. (JDIQ) 6(1), 3 (2015)
Mai, J.: The quality and qualities of information. J. Am. Soc. Inf. Sci. Technol. 64(4), 675–688 (2013)
Mendoza, M., Poblete, B., Castillo, C.: Twitter under crisis: can we trust what we RT? In: Proceedings of the First Workshop on Social Media Analytics, pp. 71–79. ACM, New York (2010)
Nurse, J.R.C., Agrafiotis, I., Goldsmith, M., Creese, S., Lamberts, K.: Two sides of the coin: measuring and communicating the trustworthiness of online information. J. Trust Manag. 1(5), 1–20 (2014). doi:10.1186/2196-064X-1-5
Nurse, J.R.C., Creese, S., Goldsmith, M., Rahman, S.S.: Supporting human decision-making online using information-trustworthiness metrics. In: Marinos, L., Askoxylakis, I. (eds.) HAS 2013. LNCS, vol. 8030, pp. 316–325. Springer, Heidelberg (2013). doi:10.1007/978-3-642-39345-7_33
Nurse, J.R.C., Rahman, S.S., Creese, S., Goldsmith, M., Lamberts, K.: Information quality and trustworthiness: a topical state-of-the-art review. In: Proceedings of the International Conference on Computer Applications and Network Security (ICCANS) (2011)
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., Duchesnay, E.: Scikit-learn: machine learning in Python. J. Mach. Learn. Res. 12, 2825–2830 (2011)
Pew Research Center: The evolving role of news on Twitter and Facebook (2015). http://www.journalism.org/2015/07/14/the-evolving-role-of-news-on-twitter-and-facebook
Powers, D.M.W.: Evaluation: from precision, recall and F-measure to ROC, informedness, markedness and correlation. J. Mach. Learn. Technol. 2(1), 37–63 (2011)
Reuters Institute for the Study of Journalism: Digital news report 2015: tracking the future of news (2015). http://www.digitalnewsreport.org/survey/2015/social-networks-and-their-role-in-news-2015/
Seo, E., Mohapatra, P., Abdelzaher, T.: Identifying rumors and their sources in social networks. In: SPIE Defense, Security, and Sensing, p. 83891I. International Society for Optics and Photonics (2012)
Smola, A.J., Schölkopf, B.: A tutorial on support vector regression. Stat. Comput. 14, 199–222 (2004)
The Guardian: How riot rumours spread on Twitter (2011). http://www.theguardian.com/uk/interactive/2011/dec/07/london-riots-twitter
Verleysen, M., François, D.: The curse of dimensionality in data mining and time series prediction. In: Cabestany, J., Prieto, A., Sandoval, F. (eds.) IWANN 2005. LNCS, vol. 3512, pp. 758–770. Springer, Heidelberg (2005). doi:10.1007/11494669_93
Vosoughi, S.: Automatic detection and verification of rumors on Twitter. Ph.D. thesis, MIT (2015)
Wang, R.Y., Strong, D.M.: Beyond accuracy: what data quality means to data consumers. J. Manag. Inf. Syst. 12(4), 5–33 (1996)
Zubiaga, A., Liakata, M., Procter, R., Bontcheva, K., Tolmie, P.: Towards detecting rumours in social media. arXiv preprint arXiv:1504.04712 (2015)
Acknowledgements
This work was partly supported by UK Defence Science and Technology Labs under Centre for Defence Enterprise grant CDE42008. We thank Andrew Middleton for his helpful comments during the project. We would also like to thank Nathaniel Charlton and Matthew Edgington for their assistance in collecting and preprocessing part of the data.
Appendices
A Data Collection Process
Our data collection process consists of four main steps:
1. Collection of tweets containing a specific keyword, e.g. “#ParisAttacks” or “Brussels”. The Twitter API only allows the collection of such tweets within a ten-day window, so this step must start as soon as an event happens or a rumour begins.

   (a) Manual analysis of the tweets in search of rumours. In this step we filter out all irrelevant tweets. For example, if we collected tweets containing the keyword “Brussels” (due to the unfortunate Brussels attacks), we ignore tweets about holidays in Brussels.

   (b) Collection of further tweets relevant to the story, using keywords that we missed at the beginning of Step 1 (this step is optional). For example, while searching for rumours we might come across tweets discussing another rumour; we then add the keyword describing this new rumour to our tweet collection.

   (c) Categorisation of tweets into rumours, grouping all tweets that refer to the same rumour.

   (d) Identification of all the unique users involved in a rumour. This set of users is used in Steps 2 to 4.

2. Collection of each user’s most recent 400 tweets posted before the start of the rumour. This step is required because we aim to examine users’ past behaviour and sentiment, e.g. whether a user’s writing style or sentiment changes during the rumour, and whether these features are significant for the model. To the best of our knowledge, this set of features is considered for the first time in the academic literature in building a rumour classifier.

3. Collection of each user’s followees (friends). This data is essential for building the propagation graph, see Sect. 2 and Appendix B.

4. Collection of each user’s profile information, including the registration date and time, the description, whether the account is verified, etc.
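A minimal sketch of Steps 1–4 using the tweepy library (v3-style endpoints) follows; the credentials, keyword and collection limits are illustrative assumptions, not the project’s original code.

```python
# A minimal sketch of Steps 1-4 with tweepy (v3-style API); credentials,
# keyword and limits are placeholders, not the original pipeline.
import tweepy

auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
api = tweepy.API(auth, wait_on_rate_limit=True)

# Step 1: collect recent tweets matching a keyword (the search endpoint only
# reaches back a limited number of days, hence the ten-day constraint above).
rumour_tweets = list(tweepy.Cursor(api.search, q="#ParisAttacks").items(1000))

users = {t.user.id for t in rumour_tweets}
for uid in users:
    # Step 2: up to 400 of the user's past tweets (200 per page); filtering to
    # tweets posted before the rumour's start is omitted here for brevity.
    past = [t for page in tweepy.Cursor(api.user_timeline, user_id=uid,
                                        count=200).pages(2) for t in page]
    # Step 3: the user's followees, needed for the propagation graph (Appendix B).
    followees = api.friends_ids(user_id=uid)
    # Step 4: profile information (registration date, description, verified flag).
    profile = api.get_user(user_id=uid)
```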
A.1 Rumours Summary Statistics
We provide a table of summary statistics for the 72 collected rumours, see Table 2. The table shows the total, mean, median, etc. of the distributions of the number of tweets and of the percentage of supporting tweets across the 72 rumours, as well as statistics for four example rumours. We collected about 100 million tweets, including users’ past tweets. Of these, about 327.5 thousand tweets are part of rumours; these contributed to the message-based features of the classification methods. The users’ past tweets contributed only to the features capturing a user’s past behaviour.
B Making the Propagation Graph
Nodes in the propagation tree correspond to unique users. Edges are drawn between users who retweet messages. However, the retweet relationship cannot be directly inferred from the Twitter data. Consider a scenario with three users, A, B and C. User A posts an original tweet. User B sees the tweet from user A and retweets it; the Twitter API returns an edge between user A and user B. If user C then sees the tweet from user B and retweets it, the Twitter API still returns an edge between the original user A and user C, even though user A is not a friend of user C and there is no way user C could have seen the tweet from user A directly. To overcome this, we collected the users’ followees. In our scenario, user B is connected to user C only if the retweet timestamp of user C is later than that of user B and user B is in the followees list of user C. A minimal code sketch of this correction is given below.
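The sketch assumes retweets arrive as (user, timestamp) pairs and that followees maps each user to the set of accounts they follow; both are hypothetical data structures for illustration, not the original implementation.

```python
# A minimal sketch of the retweet-edge correction described above; the data
# structures (retweets as (user, timestamp) pairs, followees as a dict of
# sets) are assumptions for illustration.
import networkx as nx

def build_propagation_tree(original_author, retweets, followees):
    tree = nx.DiGraph()
    tree.add_node(original_author)
    # Earlier retweeters, in chronological order, as (user, timestamp) pairs.
    seen = [(original_author, float("-inf"))]
    for user, ts in sorted(retweets, key=lambda r: r[1]):
        # Default to the API-reported edge from the original author.
        parent = original_author
        # Attach instead to the most recent earlier retweeter the user follows.
        for candidate, cand_ts in reversed(seen):
            if cand_ts < ts and candidate in followees.get(user, set()):
                parent = candidate
                break
        tree.add_edge(parent, user)
        seen.append((user, ts))
    return tree
```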
C A Practical Example for Using Formula (1)
Here, we elaborate on formula (1) and present a practical example. For simplicity and to avoid confusion, we define the support, \(S^{(i)}\), neutral, \(N^{(i)}\), and against, \(A^{(i)}\), terms in formula (1) following the example attributes given in Sect. 2.2; the generalisations are straightforward. If the attribute of the tweet is a binary indicator, for example whether a tweet contains a URL link or not, we define
If the attribute of the tweet is continuous, for example, the number of words in a tweet, we then define
These expressions are then combined through formula (1) to give the relevant feature of the rumour.
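As an illustrative sketch only (a reconstruction under the assumption that each term averages attribute \(i\) over the tweets in the corresponding stance class; this is not necessarily the exact definition used), one natural choice is:

\[
S^{(i)} = \frac{1}{|T_S|}\sum_{j \in T_S} a_j^{(i)}, \qquad
N^{(i)} = \frac{1}{|T_N|}\sum_{j \in T_N} a_j^{(i)}, \qquad
A^{(i)} = \frac{1}{|T_A|}\sum_{j \in T_A} a_j^{(i)},
\]

where \(T_S\), \(T_N\) and \(T_A\) are the sets of supporting, neutral and against tweets, and \(a_j^{(i)}\) is attribute \(i\) of tweet \(j\). For a binary attribute, \(a_j^{(i)} \in \{0,1\}\) (e.g. 1 if the tweet contains a URL) and each term is the fraction of tweets in that class possessing the attribute; for a continuous attribute (e.g. the number of words), each term is the class mean.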
D Feature Reduction Methods
Since our dataset consists of 72 rumours, theoretical and experimental arguments suggest that the number of relevant features should be about ten, and we expect models with as many as 20 features to begin to show a decrease in performance. For this reason we set the upper bound on the number of features to 30 and examine models with between 1 and 30 features. If this bound proved too low we would reconsider the choice; however, as becomes evident in Sect. 3, it is satisfactory.
In this study we use four methods, which are combinations of those described so far. For filtering we use the ANOVA F-test [14], as sketched below.
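A minimal sketch of this filtering step, assuming a numeric feature matrix X (one row per rumour) and binary veracity labels y, both our assumptions for illustration, uses scikit-learn’s ANOVA F-test selector:

```python
# A minimal sketch of the ANOVA F-test filter, assuming a numeric feature
# matrix X (one row per rumour) and binary veracity labels y.
from sklearn.feature_selection import SelectKBest, f_classif

selector = SelectKBest(score_func=f_classif, k=30)  # keep the 30 best features
X_filtered = selector.fit_transform(X, y)
kept_indices = selector.get_support(indices=True)   # which features survived
```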
Method 1. A combination of a filter method, a random wrapper and a deterministic wrapper:

   (a) Use the ANOVA F-statistic for filtering; keep the 30 best-scoring features.

   (b) From those 30, apply the classifier to 100,000 different combinations of 3 features to find the combination of three that maximises the \(F_1\)-score.

   (c) Add the remaining 27 features one by one, applying the classifier and keeping, in each round, the feature that gives the best \(F_1\)-score.

Method 2. A forward-selection deterministic wrapper (a code sketch of this method is given at the end of this appendix):

   (a) Apply the classifier to each feature individually and select the feature that maximises the \(F_1\)-score (from all available features; no pre-filtering is required).

   (b) Scan all remaining features (by applying the classifier) to find the pair, including the feature from step (a), that maximises the \(F_1\)-score.

   (c) Continue adding, one by one, the feature that maximises the \(F_1\)-score until the number of features reaches 30.

Method 3. A combination of a filter method and a forward-selection method:

   (a) Use the ANOVA F-statistic for filtering and keep the 30 best-scoring features.

   (b) Apply the classifier and find the best-scoring feature, i.e. the one with the maximum \(F_1\)-score, among the 30 selected in step (a).

   (c) Continue adding, one by one, the feature that maximises the classification \(F_1\)-score.

Method 4. A feature transformation method (a sketch follows this list):

   (a) Apply a feature transformation, namely principal component analysis, and keep the first 30 components.

   (b) Start with the first principal component from the 30 selected in step (a).

   (c) Add the remaining components one after the other.
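The transformation step of Method 4 can be sketched with scikit-learn’s PCA; standardising X beforehand is our assumption for illustration, not necessarily the original pipeline.

```python
# A minimal sketch of Method 4's transformation step. Standardising X first
# is an assumption for illustration; PCA components replace the raw features.
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X_std = StandardScaler().fit_transform(X)  # zero mean, unit variance
pca = PCA(n_components=30)                 # keep the first 30 components
X_pca = pca.fit_transform(X_std)
# A model with k components then uses X_pca[:, :k], for k = 1, ..., 30.
```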
We apply each method to each classifier separately, using scikit-learn’s default parameters, and assess it using k-fold cross-validation. We abandoned the neural network method for two reasons: first, its performance was poor compared to the other methods; secondly, it required long computation times, which considerably slowed down the analysis of the results. For the remaining classifiers we plot the \(F_1\)-score as a function of the number of features for each feature reduction method, see Fig. 4.
We observe that the second method (red line in Fig. 4) outperforms all the other techniques in almost all cases. Similar plots were produced, and the same conclusion reached, for the other classifiers. We can therefore safely conclude that the forward-selection deterministic wrapper is consistently the best-performing feature reduction method for all classifiers.
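A minimal sketch of this forward-selection wrapper (Method 2) follows; the helper name, the numpy feature matrix X, the labels y and the use of 5-fold cross-validation are our assumptions for illustration.

```python
# A minimal sketch of the forward-selection wrapper (Method 2), assuming a
# numpy feature matrix X, binary labels y and any scikit-learn classifier clf.
import numpy as np
from sklearn.model_selection import cross_val_score

def forward_selection(clf, X, y, max_features=30, cv=5):
    selected, remaining, history = [], list(range(X.shape[1])), []
    while remaining and len(selected) < max_features:
        # Try each remaining feature alongside those already selected and
        # keep the one that maximises the cross-validated F1-score.
        best_f, best_score = None, -np.inf
        for f in remaining:
            score = cross_val_score(clf, X[:, selected + [f]], y,
                                    cv=cv, scoring="f1").mean()
            if score > best_score:
                best_f, best_score = f, score
        selected.append(best_f)
        remaining.remove(best_f)
        history.append(best_score)
    return selected, history
```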
E Further Results on Classifier Selection
In Sect. 3 we present the results from running several classifiers over thirty models, each with an increasing number of features, from one to thirty. Here we present further results that support our choice of feature-selection method.
In Fig. 5 we plot the average \(F_1\)-score for each method. The plot has two columns per method: the first (blue) is the average \(F_1\)-score over all 30 models; the second (red) is the average \(F_1\)-score over the first eight models (those with one to eight features; see Note 7).
F Visualisation Tool
As a by-product of our modelling, we also developed a software tool that helps the user visualise the results and gain a deeper understanding of the rumours, see Fig. 6. The tool consists of three layers. On the first layer, the user selects a topic of interest (e.g. “Paris Attacks”). This leads to the second layer, which displays all the relevant rumours with a basic summary (e.g. the rumour claim, the timestamp of the first tweet, a word cloud, the distribution of tweets that are in favour of, neutral towards or against the rumour, and the modelled veracity). After selecting a rumour of interest, the user is taken to the third layer, shown in Fig. 6. There, the tool shows several figures: the propagation forest (supporting, neutral and denying trees coloured green, grey and red, respectively), a histogram of the numbers of tweets in favour of, against and neutral towards the rumour, a plot of the classifier’s features, and the rumour veracity. A time slider allows the user to navigate through the history of the rumour by selecting one of the available time steps. By moving the slider, the user can investigate how the rumour, its veracity and the key features evolve over time, and thus explore the key factors that affect the veracity of the rumour.
Copyright information
© 2016 Springer International Publishing AG
Cite this paper
Giasemidis, G., et al. (2016). Determining the Veracity of Rumours on Twitter. In: Spiro, E., Ahn, Y.Y. (eds.) Social Informatics. SocInfo 2016. Lecture Notes in Computer Science, vol. 10046. Springer, Cham. https://doi.org/10.1007/978-3-319-47880-7_12
DOI: https://doi.org/10.1007/978-3-319-47880-7_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47879-1
Online ISBN: 978-3-319-47880-7