The Case for Latent Variable Vs Deep Learning Methods in Misinformation Detection: An Application to COVID-19

Moroney, Caitlin; Crothers, Evan; Mittal, Sudip; Joshi, Anupam; Adalı, Tülay; Mallinson, Christine; Japkowicz, Nathalie; Boukouvalas, Zois

doi:10.1007/978-3-030-88942-5_33

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12986)

Included in the following conference series:

International Conference on Discovery Science

Abstract

The detection and removal of misinformation from social media during high-impact events, e.g., the COVID-19 pandemic, is a sensitive application, since the agency in charge of this process must ensure that no unwarranted actions are taken. This suggests that any automated system used for this process must display both high prediction accuracy and high explainability. Although Deep Learning methods have shown remarkable prediction accuracy, accessing the contextual information that Deep Learning-based representations carry is a significant challenge. In this paper, we propose a data-driven solution based on a popular latent variable model, Independent Component Analysis (ICA), in which a slight loss in accuracy with respect to a BERT model is compensated by interpretable contextual representations. Our proposed solution provides direct interpretability without affecting the computational complexity of the model and without designing a separate system. We carry out this study on a novel labeled COVID-19 Twitter dataset that is based on socio-linguistic criteria and show that our model's explanations correlate highly with human reasoning.
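To make the idea of an interpretable latent representation concrete, here is a minimal sketch in Python. It is not the authors' pipeline: it assumes scikit-learn's FastICA as a stand-in for the ICA algorithm used in the paper, TF-IDF features as the input representation, a logistic regression classifier, and a handful of made-up example texts in place of the COVID-19 Twitter dataset. The helper `top_terms` is a hypothetical name introduced only for this illustration.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy, made-up texts for illustration only (1 = unreliable, 0 = reliable).
texts = [
    "miracle cure kills the virus overnight share now",
    "secret cure they hide from you share now",
    "shocking miracle remedy doctors hate",
    "share now before they delete this shocking truth",
    "health agency reports new vaccine trial results",
    "study finds masks reduce transmission in hospitals",
    "officials publish updated vaccine guidance",
    "researchers report trial results in peer reviewed study",
]
labels = np.array([1, 1, 1, 1, 0, 0, 0, 0])

# Documents-by-terms matrix (assumed input representation).
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(texts).toarray()
# Vocabulary ordered by column index of X.
vocab = np.array(sorted(vectorizer.vocabulary_, key=vectorizer.vocabulary_.get))

# FastICA is a swapped-in ICA algorithm, chosen because it ships with scikit-learn.
ica = FastICA(n_components=3, random_state=0, max_iter=1000)
S = ica.fit_transform(X)  # latent representation, shape (n_docs, n_components)
A = ica.mixing_           # term loadings, shape (n_terms, n_components)


def top_terms(component, k=5):
    """Return the k terms with the largest absolute loading on one component."""
    order = np.argsort(-np.abs(A[:, component]))[:k]
    return [str(vocab[i]) for i in order]


# A linear classifier on the latent features: each weight refers to a single
# component, and each component can be read through its top-loading terms.
clf = LogisticRegression().fit(S, labels)
for c in range(S.shape[1]):
    print(c, round(float(clf.coef_[0, c]), 3), top_terms(c))
```

Because the classifier is linear in the ICA components, the same decomposition that produces the features also supplies the explanation, so no separate explanation system is required in this sketch. With a corpus this small, the components themselves carry no real meaning.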

Notes

1. Dataset is available at https://zoisboukouvalas.github.io/Code.html.
2. We thank Dr. Kenton White, Chief Scientist at Advanced Symbolics Inc, for providing the initial Twitter dataset.
3. It is worth mentioning that, for all methods, similar results were obtained with the sigmoid and the RBF kernels.

Author information

Authors and Affiliations

American University, Washington, D.C., 20016, USA
Caitlin Moroney, Nathalie Japkowicz & Zois Boukouvalas
University of Ottawa, Ottawa, ON, Canada
Evan Crothers
Mississippi State University, Mississippi State, Starkville, MS, 39762, USA
Sudip Mittal
University of Maryland, Baltimore County, Baltimore, MD, 21250, USA
Anupam Joshi, Tülay Adalı & Christine Mallinson

Corresponding author

Correspondence to Zois Boukouvalas.

Editor information

Editors and Affiliations

Universidade do Porto and Fraunhofer Portugal AICOS, Porto, Portugal
Carlos Soares
Dalhousie University, Halifax, NS, Canada
Luis Torgo

About this paper

Cite this paper

Moroney, C. et al. (2021). The Case for Latent Variable Vs Deep Learning Methods in Misinformation Detection: An Application to COVID-19. In: Soares, C., Torgo, L. (eds) Discovery Science. DS 2021. Lecture Notes in Computer Science, vol 12986. Springer, Cham. https://doi.org/10.1007/978-3-030-88942-5_33

DOI: https://doi.org/10.1007/978-3-030-88942-5_33
Published: 09 October 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-88941-8
Online ISBN: 978-3-030-88942-5
eBook Packages: Computer Science, Computer Science (R0)
