Automated Fake News Detection Using Computational Forensic Linguistics

Moura, Ricardo; Sousa-Silva, Rui; Lopes Cardoso, Henrique

doi:10.1007/978-3-030-86230-5_62

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12981)

Included in the following conference series:

EPIA Conference on Artificial Intelligence


Abstract

Fake news is news-like content that has been produced without following journalism principles. It tries to mimic the look and feel of real news to intentionally disinform the reader. This phenomenon can strongly influence society, which makes it a potentially severe problem. To address it, systems to detect fake news have been developed, but most of them build upon fact-checking approaches, which are unfit to detect misinformation when a news piece, rather than being completely false, is distorted, exaggerated, or even decontextualized. We aim to detect Portuguese fake news by following a forensic linguistics approach. Contrary to previous approaches, we build upon methods of linguistic and stylistic analysis that have been tried and tested in forensic linguistics. After collecting corpora from multiple fake news outlets and from a genuine news source, we formulate the task as a text classification problem and demonstrate the effectiveness of the proposed features when training different classifiers to tell fake from genuine news. Furthermore, we perform an ablation study with subsets of features and find that the proposed feature sets are complementary. The highest results reported are very promising, with an accuracy of 97% and a macro F1-score of 91%.
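The abstract reports both accuracy and macro F1, and the two can diverge. The following is a minimal sketch of how each metric is computed for a two-class fake/genuine task. The labels and predictions are invented toy data, not the paper's corpus or results, and the function names are illustrative. One possible source of a gap between the two scores is class imbalance; the abstract does not state the class distribution of the corpora.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores)


# Toy, invented data (assumption): 90 genuine and 10 fake articles.
y_true = ["genuine"] * 90 + ["fake"] * 10
# Hypothetical classifier output: every genuine article is labelled
# correctly, but only 6 of the 10 fake articles are caught.
y_pred = ["genuine"] * 90 + ["fake"] * 6 + ["genuine"] * 4

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
print(f"accuracy = {acc:.3f}, macro F1 = {f1:.3f}")
```

Because macro F1 weights each class equally, errors on the smaller class lower it more than they lower accuracy, which is why reporting both gives a fuller picture of classifier quality.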



Acknowledgments

This research is supported by project DARGMINTS (POCI/01/0145/FEDER/031460), CLUP (UIDB/00022/2020), and LIACC (FCT/UID/CEC/0027/2020), funded by Fundação para a Ciência e a Tecnologia (FCT).

Author information

Authors and Affiliations

Faculdade de Engenharia, Universidade do Porto, Porto, Portugal
Ricardo Moura & Henrique Lopes Cardoso
Faculdade de Letras, Universidade do Porto, Porto, Portugal
Rui Sousa-Silva
Centro de Linguística da Universidade do Porto (CLUP), Porto, Portugal
Rui Sousa-Silva
Laboratório de Inteligência Artificial e Ciência de Computadores (LIACC), Porto, Portugal
Henrique Lopes Cardoso


Corresponding author

Correspondence to Henrique Lopes Cardoso.

Editor information

Editors and Affiliations

ISEP/GECAD, Polytechnic Institute of Porto, Porto, Portugal
Goreti Marreiros
IST/INESC-ID, University of Lisbon, Porto Salvo, Portugal
Francisco S. Melo
DETI/IEETA, University of Aveiro, Aveiro, Portugal
Nuno Lau
FEUP/LIACC, University of Porto, Porto, Portugal
Henrique Lopes Cardoso
FEUP/LIACC, University of Porto, Porto, Portugal
Luís Paulo Reis


About this paper

Cite this paper

Moura, R., Sousa-Silva, R., Lopes Cardoso, H. (2021). Automated Fake News Detection Using Computational Forensic Linguistics. In: Marreiros, G., Melo, F.S., Lau, N., Lopes Cardoso, H., Reis, L.P. (eds) Progress in Artificial Intelligence. EPIA 2021. Lecture Notes in Computer Science, vol 12981. Springer, Cham. https://doi.org/10.1007/978-3-030-86230-5_62


DOI: https://doi.org/10.1007/978-3-030-86230-5_62
Published: 03 September 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86229-9
Online ISBN: 978-3-030-86230-5
eBook Packages: Computer Science, Computer Science (R0)
